* [PATCH 6.6 001/267] wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 002/267] wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup() Greg Kroah-Hartman
` (277 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Nicolas Escande, Johannes Berg,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Escande <nico.escande@gmail.com>
[ Upstream commit b7d7f11a291830fdf69d3301075dd0fb347ced84 ]
The hwmp code use objects of type mesh_preq_queue, added to a list in
ieee80211_if_mesh, to keep track of mpath we need to resolve. If the mpath
gets deleted, ex mesh interface is removed, the entries in that list will
never get cleaned. Fix this by flushing all corresponding items of the
preq_queue in mesh_path_flush_pending().
This should take care of KASAN reports like this:
unreferenced object 0xffff00000668d800 (size 128):
comm "kworker/u8:4", pid 67, jiffies 4295419552 (age 1836.444s)
hex dump (first 32 bytes):
00 1f 05 09 00 00 ff ff 00 d5 68 06 00 00 ff ff ..........h.....
8e 97 ea eb 3e b8 01 00 00 00 00 00 00 00 00 00 ....>...........
backtrace:
[<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
[<00000000049bd418>] kmalloc_trace+0x34/0x80
[<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
[<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
[<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
[<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
[<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
[<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
[<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
[<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
[<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
[<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
[<00000000b36425d1>] worker_thread+0x9c/0x634
[<0000000005852dd5>] kthread+0x1bc/0x1c4
[<000000005fccd770>] ret_from_fork+0x10/0x20
unreferenced object 0xffff000009051f00 (size 128):
comm "kworker/u8:4", pid 67, jiffies 4295419553 (age 1836.440s)
hex dump (first 32 bytes):
90 d6 92 0d 00 00 ff ff 00 d8 68 06 00 00 ff ff ..........h.....
36 27 92 e4 02 e0 01 00 00 58 79 06 00 00 ff ff 6'.......Xy.....
backtrace:
[<000000007302a0b6>] __kmem_cache_alloc_node+0x1e0/0x35c
[<00000000049bd418>] kmalloc_trace+0x34/0x80
[<0000000000d792bb>] mesh_queue_preq+0x44/0x2a8
[<00000000c99c3696>] mesh_nexthop_resolve+0x198/0x19c
[<00000000926bf598>] ieee80211_xmit+0x1d0/0x1f4
[<00000000fc8c2284>] __ieee80211_subif_start_xmit+0x30c/0x764
[<000000005926ee38>] ieee80211_subif_start_xmit+0x9c/0x7a4
[<000000004c86e916>] dev_hard_start_xmit+0x174/0x440
[<0000000023495647>] __dev_queue_xmit+0xe24/0x111c
[<00000000cfe9ca78>] batadv_send_skb_packet+0x180/0x1e4
[<000000007bacc5d5>] batadv_v_elp_periodic_work+0x2f4/0x508
[<00000000adc3cd94>] process_one_work+0x4b8/0xa1c
[<00000000b36425d1>] worker_thread+0x9c/0x634
[<0000000005852dd5>] kthread+0x1bc/0x1c4
[<000000005fccd770>] ret_from_fork+0x10/0x20
Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Link: https://msgid.link/20240528142605.1060566-1-nico.escande@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/mac80211/mesh_pathtbl.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/net/mac80211/mesh_pathtbl.c b/net/mac80211/mesh_pathtbl.c
index 59f7264194ce3..530581ba812b4 100644
--- a/net/mac80211/mesh_pathtbl.c
+++ b/net/mac80211/mesh_pathtbl.c
@@ -1011,10 +1011,23 @@ void mesh_path_discard_frame(struct ieee80211_sub_if_data *sdata,
*/
void mesh_path_flush_pending(struct mesh_path *mpath)
{
+ struct ieee80211_sub_if_data *sdata = mpath->sdata;
+ struct ieee80211_if_mesh *ifmsh = &sdata->u.mesh;
+ struct mesh_preq_queue *preq, *tmp;
struct sk_buff *skb;
while ((skb = skb_dequeue(&mpath->frame_queue)) != NULL)
mesh_path_discard_frame(mpath->sdata, skb);
+
+ spin_lock_bh(&ifmsh->mesh_preq_queue_lock);
+ list_for_each_entry_safe(preq, tmp, &ifmsh->preq_queue.list, list) {
+ if (ether_addr_equal(mpath->dst, preq->dst)) {
+ list_del(&preq->list);
+ kfree(preq);
+ --ifmsh->preq_queue_len;
+ }
+ }
+ spin_unlock_bh(&ifmsh->mesh_preq_queue_lock);
}
/**
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 002/267] wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 001/267] wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 003/267] wifi: cfg80211: fully move wiphy work to unbound workqueue Greg Kroah-Hartman
` (276 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Remi Pommarel, Johannes Berg,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Remi Pommarel <repk@triplefau.lt>
[ Upstream commit 44c06bbde6443de206b30f513100b5670b23fc5e ]
The ieee80211_sta_ps_deliver_wakeup() function takes sta->ps_lock to
synchronizes with ieee80211_tx_h_unicast_ps_buf() which is called from
softirq context. However using only spin_lock() to get sta->ps_lock in
ieee80211_sta_ps_deliver_wakeup() does not prevent softirq to execute
on this same CPU, to run ieee80211_tx_h_unicast_ps_buf() and try to
take this same lock ending in deadlock. Below is an example of rcu stall
that arises in such situation.
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 2-....: (42413413 ticks this GP) idle=b154/1/0x4000000000000000 softirq=1763/1765 fqs=21206996
rcu: (t=42586894 jiffies g=2057 q=362405 ncpus=4)
CPU: 2 PID: 719 Comm: wpa_supplicant Tainted: G W 6.4.0-02158-g1b062f552873 #742
Hardware name: RPT (r1) (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : queued_spin_lock_slowpath+0x58/0x2d0
lr : invoke_tx_handlers_early+0x5b4/0x5c0
sp : ffff00001ef64660
x29: ffff00001ef64660 x28: ffff000009bc1070 x27: ffff000009bc0ad8
x26: ffff000009bc0900 x25: ffff00001ef647a8 x24: 0000000000000000
x23: ffff000009bc0900 x22: ffff000009bc0900 x21: ffff00000ac0e000
x20: ffff00000a279e00 x19: ffff00001ef646e8 x18: 0000000000000000
x17: ffff800016468000 x16: ffff00001ef608c0 x15: 0010533c93f64f80
x14: 0010395c9faa3946 x13: 0000000000000000 x12: 00000000fa83b2da
x11: 000000012edeceea x10: ffff0000010fbe00 x9 : 0000000000895440
x8 : 000000000010533c x7 : ffff00000ad8b740 x6 : ffff00000c350880
x5 : 0000000000000007 x4 : 0000000000000001 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffff00000ac0e0e8
Call trace:
queued_spin_lock_slowpath+0x58/0x2d0
ieee80211_tx+0x80/0x12c
ieee80211_tx_pending+0x110/0x278
tasklet_action_common.constprop.0+0x10c/0x144
tasklet_action+0x20/0x28
_stext+0x11c/0x284
____do_softirq+0xc/0x14
call_on_irq_stack+0x24/0x34
do_softirq_own_stack+0x18/0x20
do_softirq+0x74/0x7c
__local_bh_enable_ip+0xa0/0xa4
_ieee80211_wake_txqs+0x3b0/0x4b8
__ieee80211_wake_queue+0x12c/0x168
ieee80211_add_pending_skbs+0xec/0x138
ieee80211_sta_ps_deliver_wakeup+0x2a4/0x480
ieee80211_mps_sta_status_update.part.0+0xd8/0x11c
ieee80211_mps_sta_status_update+0x18/0x24
sta_apply_parameters+0x3bc/0x4c0
ieee80211_change_station+0x1b8/0x2dc
nl80211_set_station+0x444/0x49c
genl_family_rcv_msg_doit.isra.0+0xa4/0xfc
genl_rcv_msg+0x1b0/0x244
netlink_rcv_skb+0x38/0x10c
genl_rcv+0x34/0x48
netlink_unicast+0x254/0x2bc
netlink_sendmsg+0x190/0x3b4
____sys_sendmsg+0x1e8/0x218
___sys_sendmsg+0x68/0x8c
__sys_sendmsg+0x44/0x84
__arm64_sys_sendmsg+0x20/0x28
do_el0_svc+0x6c/0xe8
el0_svc+0x14/0x48
el0t_64_sync_handler+0xb0/0xb4
el0t_64_sync+0x14c/0x150
Using spin_lock_bh()/spin_unlock_bh() instead prevents softirq to raise
on the same CPU that is holding the lock.
Fixes: 1d147bfa6429 ("mac80211: fix AP powersave TX vs. wakeup race")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Link: https://msgid.link/8e36fe07d0fbc146f89196cd47a53c8a0afe84aa.1716910344.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/mac80211/sta_info.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index c61eb867bb4a7..984f8f67492fd 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1709,7 +1709,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
skb_queue_head_init(&pending);
/* sync with ieee80211_tx_h_unicast_ps_buf */
- spin_lock(&sta->ps_lock);
+ spin_lock_bh(&sta->ps_lock);
/* Send all buffered frames to the station */
for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
int count = skb_queue_len(&pending), tmp;
@@ -1738,7 +1738,7 @@ void ieee80211_sta_ps_deliver_wakeup(struct sta_info *sta)
*/
clear_sta_flag(sta, WLAN_STA_PSPOLL);
clear_sta_flag(sta, WLAN_STA_UAPSD);
- spin_unlock(&sta->ps_lock);
+ spin_unlock_bh(&sta->ps_lock);
atomic_dec(&ps->num_sta_ps);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 003/267] wifi: cfg80211: fully move wiphy work to unbound workqueue
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 001/267] wifi: mac80211: mesh: Fix leak of mesh_preq_queue objects Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 002/267] wifi: mac80211: Fix deadlock in ieee80211_sta_ps_deliver_wakeup() Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 004/267] wifi: cfg80211: Lock wiphy in cfg80211_get_station Greg Kroah-Hartman
` (275 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Miriam Rachel Korenblit,
Johannes Berg, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg <johannes.berg@intel.com>
[ Upstream commit e296c95eac655008d5a709b8cf54d0018da1c916 ]
Previously I had moved the wiphy work to the unbound
system workqueue, but missed that when it restarts and
during resume it was still using the normal system
workqueue. Fix that.
Fixes: 91d20ab9d9ca ("wifi: cfg80211: use system_unbound_wq for wiphy work")
Reviewed-by: Miriam Rachel Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240522124126.7ca959f2cbd3.I3e2a71ef445d167b84000ccf934ea245aef8d395@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/wireless/core.c | 2 +-
net/wireless/sysfs.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/wireless/core.c b/net/wireless/core.c
index ff743e1f2e2cb..68aa8f0d70140 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -431,7 +431,7 @@ static void cfg80211_wiphy_work(struct work_struct *work)
if (wk) {
list_del_init(&wk->entry);
if (!list_empty(&rdev->wiphy_work_list))
- schedule_work(work);
+ queue_work(system_unbound_wq, work);
spin_unlock_irq(&rdev->wiphy_work_lock);
wk->func(&rdev->wiphy, wk);
diff --git a/net/wireless/sysfs.c b/net/wireless/sysfs.c
index 565511a3f461e..62f26618f6747 100644
--- a/net/wireless/sysfs.c
+++ b/net/wireless/sysfs.c
@@ -5,7 +5,7 @@
*
* Copyright 2005-2006 Jiri Benc <jbenc@suse.cz>
* Copyright 2006 Johannes Berg <johannes@sipsolutions.net>
- * Copyright (C) 2020-2021, 2023 Intel Corporation
+ * Copyright (C) 2020-2021, 2023-2024 Intel Corporation
*/
#include <linux/device.h>
@@ -137,7 +137,7 @@ static int wiphy_resume(struct device *dev)
if (rdev->wiphy.registered && rdev->ops->resume)
ret = rdev_resume(rdev);
rdev->suspended = false;
- schedule_work(&rdev->wiphy_work);
+ queue_work(system_unbound_wq, &rdev->wiphy_work);
wiphy_unlock(&rdev->wiphy);
if (ret)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 004/267] wifi: cfg80211: Lock wiphy in cfg80211_get_station
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (2 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 003/267] wifi: cfg80211: fully move wiphy work to unbound workqueue Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 005/267] wifi: cfg80211: pmsr: use correct nla_get_uX functions Greg Kroah-Hartman
` (274 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Remi Pommarel, Nicolas Escande,
Antonio Quartulli, Johannes Berg, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Remi Pommarel <repk@triplefau.lt>
[ Upstream commit 642f89daa34567d02f312d03e41523a894906dae ]
Wiphy should be locked before calling rdev_get_station() (see lockdep
assert in ieee80211_get_station()).
This fixes the following kernel NULL dereference:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
Mem abort info:
ESR = 0x0000000096000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000
[0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath
CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705
Hardware name: RPT (r1) (DT)
Workqueue: bat_events batadv_v_elp_throughput_metric_update
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
lr : sta_set_sinfo+0xcc/0xbd4
sp : ffff000007b43ad0
x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98
x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000
x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc
x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000
x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d
x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e
x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000
x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000
x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90
x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000
Call trace:
ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
sta_set_sinfo+0xcc/0xbd4
ieee80211_get_station+0x2c/0x44
cfg80211_get_station+0x80/0x154
batadv_v_elp_get_throughput+0x138/0x1fc
batadv_v_elp_throughput_metric_update+0x1c/0xa4
process_one_work+0x1ec/0x414
worker_thread+0x70/0x46c
kthread+0xdc/0xe0
ret_from_fork+0x10/0x20
Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)
This happens because STA has time to disconnect and reconnect before
batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In
this situation, ath10k_sta_state() can be in the middle of resetting
arsta data when the work queue get chance to be scheduled and ends up
accessing it. Locking wiphy prevents that.
Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API")
Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Reviewed-by: Nicolas Escande <nico.escande@gmail.com>
Acked-by: Antonio Quartulli <a@unstable.cc>
Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/wireless/util.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/wireless/util.c b/net/wireless/util.c
index 9aa7bdce20b26..57ea6d5b092d4 100644
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -2399,6 +2399,7 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
{
struct cfg80211_registered_device *rdev;
struct wireless_dev *wdev;
+ int ret;
wdev = dev->ieee80211_ptr;
if (!wdev)
@@ -2410,7 +2411,11 @@ int cfg80211_get_station(struct net_device *dev, const u8 *mac_addr,
memset(sinfo, 0, sizeof(*sinfo));
- return rdev_get_station(rdev, dev, mac_addr, sinfo);
+ wiphy_lock(&rdev->wiphy);
+ ret = rdev_get_station(rdev, dev, mac_addr, sinfo);
+ wiphy_unlock(&rdev->wiphy);
+
+ return ret;
}
EXPORT_SYMBOL(cfg80211_get_station);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 005/267] wifi: cfg80211: pmsr: use correct nla_get_uX functions
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (3 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 004/267] wifi: cfg80211: Lock wiphy in cfg80211_get_station Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 006/267] wifi: iwlwifi: mvm: dont initialize csa_work twice Greg Kroah-Hartman
` (273 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Lin Ma, Johannes Berg, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lin Ma <linma@zju.edu.cn>
[ Upstream commit ab904521f4de52fef4f179d2dfc1877645ef5f5c ]
The commit 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM
initiator API") defines four attributes NL80211_PMSR_FTM_REQ_ATTR_
{NUM_BURSTS_EXP}/{BURST_PERIOD}/{BURST_DURATION}/{FTMS_PER_BURST} in
following ways.
static const struct nla_policy
nl80211_pmsr_ftm_req_attr_policy[NL80211_PMSR_FTM_REQ_ATTR_MAX + 1] = {
...
[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP] =
NLA_POLICY_MAX(NLA_U8, 15),
[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD] = { .type = NLA_U16 },
[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION] =
NLA_POLICY_MAX(NLA_U8, 15),
[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST] =
NLA_POLICY_MAX(NLA_U8, 31),
...
};
That is, those attributes are expected to be NLA_U8 and NLA_U16 types.
However, the consumers of these attributes in `pmsr_parse_ftm` blindly
all use `nla_get_u32`, which is incorrect and causes functionality issues
on little-endian platforms. Hence, fix them with the correct `nla_get_u8`
and `nla_get_u16` functions.
Fixes: 9bb7e0f24e7e ("cfg80211: add peer measurement with FTM initiator API")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://msgid.link/20240521075059.47999-1-linma@zju.edu.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/wireless/pmsr.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/wireless/pmsr.c b/net/wireless/pmsr.c
index 9611aa0bd0513..841a4516793b1 100644
--- a/net/wireless/pmsr.c
+++ b/net/wireless/pmsr.c
@@ -56,7 +56,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
out->ftm.burst_period = 0;
if (tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD])
out->ftm.burst_period =
- nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD]);
+ nla_get_u16(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_PERIOD]);
out->ftm.asap = !!tb[NL80211_PMSR_FTM_REQ_ATTR_ASAP];
if (out->ftm.asap && !capa->ftm.asap) {
@@ -75,7 +75,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
out->ftm.num_bursts_exp = 0;
if (tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP])
out->ftm.num_bursts_exp =
- nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP]);
+ nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_BURSTS_EXP]);
if (capa->ftm.max_bursts_exponent >= 0 &&
out->ftm.num_bursts_exp > capa->ftm.max_bursts_exponent) {
@@ -88,7 +88,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
out->ftm.burst_duration = 15;
if (tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION])
out->ftm.burst_duration =
- nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION]);
+ nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_BURST_DURATION]);
out->ftm.ftms_per_burst = 0;
if (tb[NL80211_PMSR_FTM_REQ_ATTR_FTMS_PER_BURST])
@@ -107,7 +107,7 @@ static int pmsr_parse_ftm(struct cfg80211_registered_device *rdev,
out->ftm.ftmr_retries = 3;
if (tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES])
out->ftm.ftmr_retries =
- nla_get_u32(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES]);
+ nla_get_u8(tb[NL80211_PMSR_FTM_REQ_ATTR_NUM_FTMR_RETRIES]);
out->ftm.request_lci = !!tb[NL80211_PMSR_FTM_REQ_ATTR_REQUEST_LCI];
if (out->ftm.request_lci && !capa->ftm.request_lci) {
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 006/267] wifi: iwlwifi: mvm: dont initialize csa_work twice
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (4 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 005/267] wifi: cfg80211: pmsr: use correct nla_get_uX functions Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 007/267] wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 Greg Kroah-Hartman
` (272 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Miri Korenblit, Johannes Berg,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miri Korenblit <miriam.rachel.korenblit@intel.com>
[ Upstream commit 92158790ce4391ce4c35d8dfbce759195e4724cb ]
The initialization of this worker moved to iwl_mvm_mac_init_mvmvif
but we removed only from the pre-MLD version of the add_interface
callback. Remove it also from the MLD version.
Fixes: 0bcc2155983e ("wifi: iwlwifi: mvm: init vif works only once")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240512152312.4f15b41604f0.Iec912158e5a706175531d3736d77d25adf02fba4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c b/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
index aef8824469e1e..4d9a872818a52 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
@@ -73,8 +73,6 @@ static int iwl_mvm_mld_mac_add_interface(struct ieee80211_hw *hw,
goto out_free_bf;
iwl_mvm_tcm_add_vif(mvm, vif);
- INIT_DELAYED_WORK(&mvmvif->csa_work,
- iwl_mvm_channel_switch_disconnect_wk);
if (vif->type == NL80211_IFTYPE_MONITOR) {
mvm->monitor_on = true;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 007/267] wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (5 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 006/267] wifi: iwlwifi: mvm: dont initialize csa_work twice Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 008/267] wifi: iwlwifi: mvm: set properly mac header Greg Kroah-Hartman
` (271 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Berg, Liad Kaufman,
Luciano Coelho, Miri Korenblit, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg <johannes.berg@intel.com>
[ Upstream commit 4a7aace2899711592327463c1a29ffee44fcc66e ]
We don't actually support >64 even for HE devices, so revert
back to 64. This fixes an issue where the session is refused
because the queue is configured differently from the actual
session later.
Fixes: 514c30696fbc ("iwlwifi: add support for IEEE802.11ax")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Liad Kaufman <liad.kaufman@intel.com>
Reviewed-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240510170500.52f7b4cf83aa.If47e43adddf7fe250ed7f5571fbb35d8221c7c47@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/mvm/rs.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rs.h b/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
index 1ca375a5cf6b5..639cecc7a6e60 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/rs.h
@@ -122,13 +122,8 @@ enum {
#define LINK_QUAL_AGG_FRAME_LIMIT_DEF (63)
#define LINK_QUAL_AGG_FRAME_LIMIT_MAX (63)
-/*
- * FIXME - various places in firmware API still use u8,
- * e.g. LQ command and SCD config command.
- * This should be 256 instead.
- */
-#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_DEF (255)
-#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_MAX (255)
+#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_DEF (64)
+#define LINK_QUAL_AGG_FRAME_LIMIT_GEN2_MAX (64)
#define LINK_QUAL_AGG_FRAME_LIMIT_MIN (0)
#define LQ_SIZE 2 /* 2 mode tables: "Active" and "Search" */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 008/267] wifi: iwlwifi: mvm: set properly mac header
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (6 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 007/267] wifi: iwlwifi: mvm: revert gen2 TX A-MPDU size to 64 Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 009/267] wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef Greg Kroah-Hartman
` (270 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Mordechay Goodstein, Johannes Berg,
Miri Korenblit, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mordechay Goodstein <mordechay.goodstein@intel.com>
[ Upstream commit 0f2e9f6f21d1ff292363cdfb5bc4d492eeaff76e ]
In the driver we only use skb_put* for adding data to the skb, hence data
never moves and skb_reset_mac_haeder would set mac_header to the first
time data was added and not to mac80211 header, fix this my using the
actual len of bytes added for setting the mac header.
Fixes: 3f7a9d577d47 ("wifi: iwlwifi: mvm: simplify by using SKB MAC header pointer")
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240510170500.12f2de2909c3.I72a819b96f2fe55bde192a8fd31a4b96c301aa73@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
index e9360b555ac93..8cff24d5f5f40 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/rxmq.c
@@ -2730,8 +2730,11 @@ void iwl_mvm_rx_monitor_no_data(struct iwl_mvm *mvm, struct napi_struct *napi,
*
* We mark it as mac header, for upper layers to know where
* all radio tap header ends.
+ *
+ * Since data doesn't move data while putting data on skb and that is
+ * the only way we use, data + len is the next place that hdr would be put
*/
- skb_reset_mac_header(skb);
+ skb_set_mac_header(skb, skb->len);
/*
* Override the nss from the rx_vec since the rate_n_flags has
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 009/267] wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (7 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 008/267] wifi: iwlwifi: mvm: set properly mac header Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 010/267] wifi: iwlwifi: mvm: check n_ssids before accessing the ssids Greg Kroah-Hartman
` (269 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Shahar S Matityahu, Luciano Coelho,
Miri Korenblit, Johannes Berg, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shahar S Matityahu <shahar.s.matityahu@intel.com>
[ Upstream commit 87821b67dea87addbc4ab093ba752753b002176a ]
The driver should call iwl_dbg_tlv_free even if debugfs is not defined
since ini mode does not depend on debugfs ifdef.
Fixes: 68f6f492c4fa ("iwlwifi: trans: support loading ini TLVs from external file")
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Reviewed-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240510170500.c8e3723f55b0.I5e805732b0be31ee6b83c642ec652a34e974ff10@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/iwl-drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
index 8faf4e7872bb9..a56593b6135f6 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-drv.c
@@ -1824,8 +1824,8 @@ struct iwl_drv *iwl_drv_start(struct iwl_trans *trans)
err_fw:
#ifdef CONFIG_IWLWIFI_DEBUGFS
debugfs_remove_recursive(drv->dbgfs_drv);
- iwl_dbg_tlv_free(drv->trans);
#endif
+ iwl_dbg_tlv_free(drv->trans);
kfree(drv);
err:
return ERR_PTR(ret);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 010/267] wifi: iwlwifi: mvm: check n_ssids before accessing the ssids
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (8 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 009/267] wifi: iwlwifi: dbg_ini: move iwl_dbg_tlv_free outside of debugfs ifdef Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 011/267] wifi: iwlwifi: mvm: dont read past the mfuart notifcation Greg Kroah-Hartman
` (268 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Miri Korenblit, Ilan Peer,
Johannes Berg, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miri Korenblit <miriam.rachel.korenblit@intel.com>
[ Upstream commit 60d62757df30b74bf397a2847a6db7385c6ee281 ]
In some versions of cfg80211, the ssids poinet might be a valid one even
though n_ssids is 0. Accessing the pointer in this case will cuase an
out-of-bound access. Fix this by checking n_ssids first.
Fixes: c1a7515393e4 ("iwlwifi: mvm: add adaptive dwell support")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240513132416.6e4d1762bf0d.I5a0e6cc8f02050a766db704d15594c61fe583d45@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/mvm/scan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
index 03ec900a33433..0841f1d6dc475 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/scan.c
@@ -1304,7 +1304,7 @@ static void iwl_mvm_scan_umac_dwell(struct iwl_mvm *mvm,
if (IWL_MVM_ADWELL_MAX_BUDGET)
cmd->v7.adwell_max_budget =
cpu_to_le16(IWL_MVM_ADWELL_MAX_BUDGET);
- else if (params->ssids && params->ssids[0].ssid_len)
+ else if (params->n_ssids && params->ssids[0].ssid_len)
cmd->v7.adwell_max_budget =
cpu_to_le16(IWL_SCAN_ADWELL_MAX_BUDGET_DIRECTED_SCAN);
else
@@ -1406,7 +1406,7 @@ iwl_mvm_scan_umac_dwell_v11(struct iwl_mvm *mvm,
if (IWL_MVM_ADWELL_MAX_BUDGET)
general_params->adwell_max_budget =
cpu_to_le16(IWL_MVM_ADWELL_MAX_BUDGET);
- else if (params->ssids && params->ssids[0].ssid_len)
+ else if (params->n_ssids && params->ssids[0].ssid_len)
general_params->adwell_max_budget =
cpu_to_le16(IWL_SCAN_ADWELL_MAX_BUDGET_DIRECTED_SCAN);
else
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 011/267] wifi: iwlwifi: mvm: dont read past the mfuart notifcation
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (9 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 010/267] wifi: iwlwifi: mvm: check n_ssids before accessing the ssids Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 012/267] wifi: mac80211: correctly parse Spatial Reuse Parameter Set element Greg Kroah-Hartman
` (267 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Emmanuel Grumbach, Johannes Berg,
Miri Korenblit, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
[ Upstream commit 4bb95f4535489ed830cf9b34b0a891e384d1aee4 ]
In case the firmware sends a notification that claims it has more data
than it has, we will read past that was allocated for the notification.
Remove the print of the buffer, we won't see it by default. If needed,
we can see the content with tracing.
This was reported by KFENCE.
Fixes: bdccdb854f2f ("iwlwifi: mvm: support MFUART dump in case of MFUART assert")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240513132416.ba82a01a559e.Ia91dd20f5e1ca1ad380b95e68aebf2794f553d9b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wireless/intel/iwlwifi/mvm/fw.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
index 1d5ee4330f29f..51f396287dc69 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/fw.c
@@ -92,20 +92,10 @@ void iwl_mvm_mfu_assert_dump_notif(struct iwl_mvm *mvm,
{
struct iwl_rx_packet *pkt = rxb_addr(rxb);
struct iwl_mfu_assert_dump_notif *mfu_dump_notif = (void *)pkt->data;
- __le32 *dump_data = mfu_dump_notif->data;
- int n_words = le32_to_cpu(mfu_dump_notif->data_size) / sizeof(__le32);
- int i;
if (mfu_dump_notif->index_num == 0)
IWL_INFO(mvm, "MFUART assert id 0x%x occurred\n",
le32_to_cpu(mfu_dump_notif->assert_id));
-
- for (i = 0; i < n_words; i++)
- IWL_DEBUG_INFO(mvm,
- "MFUART assert dump, dword %u: 0x%08x\n",
- le16_to_cpu(mfu_dump_notif->index_num) *
- n_words + i,
- le32_to_cpu(dump_data[i]));
}
static bool iwl_alive_fn(struct iwl_notif_wait_data *notif_wait,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 012/267] wifi: mac80211: correctly parse Spatial Reuse Parameter Set element
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (10 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 011/267] wifi: iwlwifi: mvm: dont read past the mfuart notifcation Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 013/267] scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort() Greg Kroah-Hartman
` (266 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Lingbo Kong, Johannes Berg,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lingbo Kong <quic_lingbok@quicinc.com>
[ Upstream commit a26d8dc5227f449a54518a8b40733a54c6600a8b ]
Currently, the way of parsing Spatial Reuse Parameter Set element is
incorrect and some members of struct ieee80211_he_obss_pd are not assigned.
To address this issue, it must be parsed in the order of the elements of
Spatial Reuse Parameter Set defined in the IEEE Std 802.11ax specification.
The diagram of the Spatial Reuse Parameter Set element (IEEE Std 802.11ax
-2021-9.4.2.252).
-------------------------------------------------------------------------
| | | | |Non-SRG| SRG | SRG | SRG | SRG |
|Element|Length| Element | SR |OBSS PD|OBSS PD|OBSS PD| BSS |Partial|
| ID | | ID |Control| Max | Min | Max |Color | BSSID |
| | |Extension| | Offset| Offset|Offset |Bitmap|Bitmap |
-------------------------------------------------------------------------
Fixes: 1ced169cc1c2 ("mac80211: allow setting spatial reuse parameters from bss_conf")
Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com>
Link: https://msgid.link/20240516021854.5682-3-quic_lingbok@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/mac80211/he.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/he.c b/net/mac80211/he.c
index 9f5ffdc9db284..ecbb042dd0433 100644
--- a/net/mac80211/he.c
+++ b/net/mac80211/he.c
@@ -230,15 +230,21 @@ ieee80211_he_spr_ie_to_bss_conf(struct ieee80211_vif *vif,
if (!he_spr_ie_elem)
return;
+
+ he_obss_pd->sr_ctrl = he_spr_ie_elem->he_sr_control;
data = he_spr_ie_elem->optional;
if (he_spr_ie_elem->he_sr_control &
IEEE80211_HE_SPR_NON_SRG_OFFSET_PRESENT)
- data++;
+ he_obss_pd->non_srg_max_offset = *data++;
+
if (he_spr_ie_elem->he_sr_control &
IEEE80211_HE_SPR_SRG_INFORMATION_PRESENT) {
- he_obss_pd->max_offset = *data++;
he_obss_pd->min_offset = *data++;
+ he_obss_pd->max_offset = *data++;
+ memcpy(he_obss_pd->bss_color_bitmap, data, 8);
+ data += 8;
+ memcpy(he_obss_pd->partial_bssid_bitmap, data, 8);
he_obss_pd->enable = true;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 013/267] scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (11 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 012/267] wifi: mac80211: correctly parse Spatial Reuse Parameter Set element Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 014/267] RISC-V: KVM: No need to use mask when hart-index-bit is 0 Greg Kroah-Hartman
` (265 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Chanwoo Lee, Bart Van Assche,
Martin K. Petersen, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chanwoo Lee <cw9316.lee@samsung.com>
[ Upstream commit d53b681ce9ca7db5ef4ecb8d2cf465ae4a031264 ]
An error unrelated to ufshcd_try_to_abort_task is being logged and can
cause confusion. Modify ufshcd_mcq_abort() to print the result of the abort
failure. For readability, return immediately instead of 'goto'.
Fixes: f1304d442077 ("scsi: ufs: mcq: Added ufshcd_mcq_abort()")
Signed-off-by: Chanwoo Lee <cw9316.lee@samsung.com>
Link: https://lore.kernel.org/r/20240524015904.1116005-1-cw9316.lee@samsung.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/ufs/core/ufs-mcq.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/ufs/core/ufs-mcq.c b/drivers/ufs/core/ufs-mcq.c
index 7ae3096814282..4e84ee6564d4b 100644
--- a/drivers/ufs/core/ufs-mcq.c
+++ b/drivers/ufs/core/ufs-mcq.c
@@ -630,20 +630,20 @@ int ufshcd_mcq_abort(struct scsi_cmnd *cmd)
struct ufshcd_lrb *lrbp = &hba->lrb[tag];
struct ufs_hw_queue *hwq;
unsigned long flags;
- int err = FAILED;
+ int err;
if (!ufshcd_cmd_inflight(lrbp->cmd)) {
dev_err(hba->dev,
"%s: skip abort. cmd at tag %d already completed.\n",
__func__, tag);
- goto out;
+ return FAILED;
}
/* Skip task abort in case previous aborts failed and report failure */
if (lrbp->req_abort_skip) {
dev_err(hba->dev, "%s: skip abort. tag %d failed earlier\n",
__func__, tag);
- goto out;
+ return FAILED;
}
hwq = ufshcd_mcq_req_to_hwq(hba, scsi_cmd_to_rq(cmd));
@@ -655,7 +655,7 @@ int ufshcd_mcq_abort(struct scsi_cmnd *cmd)
*/
dev_err(hba->dev, "%s: cmd found in sq. hwq=%d, tag=%d\n",
__func__, hwq->id, tag);
- goto out;
+ return FAILED;
}
/*
@@ -663,18 +663,17 @@ int ufshcd_mcq_abort(struct scsi_cmnd *cmd)
* in the completion queue either. Query the device to see if
* the command is being processed in the device.
*/
- if (ufshcd_try_to_abort_task(hba, tag)) {
+ err = ufshcd_try_to_abort_task(hba, tag);
+ if (err) {
dev_err(hba->dev, "%s: device abort failed %d\n", __func__, err);
lrbp->req_abort_skip = true;
- goto out;
+ return FAILED;
}
- err = SUCCESS;
spin_lock_irqsave(&hwq->cq_lock, flags);
if (ufshcd_cmd_inflight(lrbp->cmd))
ufshcd_release_scsi_cmd(hba, lrbp);
spin_unlock_irqrestore(&hwq->cq_lock, flags);
-out:
- return err;
+ return SUCCESS;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 014/267] RISC-V: KVM: No need to use mask when hart-index-bit is 0
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (12 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 013/267] scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort() Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 015/267] RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function Greg Kroah-Hartman
` (264 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Yong-Xuan Wang, Andrew Jones,
Anup Patel, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yong-Xuan Wang <yongxuan.wang@sifive.com>
[ Upstream commit 2d707b4e37f9b0c37b8b2392f91b04c5b63ea538 ]
When the maximum hart number within groups is 1, hart-index-bit is set to
0. Consequently, there is no need to restore the hart ID from IMSIC
addresses and hart-index-bit settings. Currently, QEMU and kvmtool do not
pass correct hart-index-bit values when the maximum hart number is a
power of 2, thereby avoiding this issue. Corresponding patches for QEMU
and kvmtool will also be dispatched.
Fixes: 89d01306e34d ("RISC-V: KVM: Implement device interface for AIA irqchip")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240415064905.25184-1-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/riscv/kvm/aia_device.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/kvm/aia_device.c b/arch/riscv/kvm/aia_device.c
index 0eb689351b7d0..5cd407c6a8e4f 100644
--- a/arch/riscv/kvm/aia_device.c
+++ b/arch/riscv/kvm/aia_device.c
@@ -237,10 +237,11 @@ static gpa_t aia_imsic_ppn(struct kvm_aia *aia, gpa_t addr)
static u32 aia_imsic_hart_index(struct kvm_aia *aia, gpa_t addr)
{
- u32 hart, group = 0;
+ u32 hart = 0, group = 0;
- hart = (addr >> (aia->nr_guest_bits + IMSIC_MMIO_PAGE_SHIFT)) &
- GENMASK_ULL(aia->nr_hart_bits - 1, 0);
+ if (aia->nr_hart_bits)
+ hart = (addr >> (aia->nr_guest_bits + IMSIC_MMIO_PAGE_SHIFT)) &
+ GENMASK_ULL(aia->nr_hart_bits - 1, 0);
if (aia->nr_group_bits)
group = (addr >> aia->nr_group_shift) &
GENMASK_ULL(aia->nr_group_bits - 1, 0);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 015/267] RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (13 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 014/267] RISC-V: KVM: No need to use mask when hart-index-bit is 0 Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 016/267] ax25: Fix refcount imbalance on inbound connections Greg Kroah-Hartman
` (263 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Quan Zhou, Andrew Jones, Anup Patel,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quan Zhou <zhouquan@iscas.ac.cn>
[ Upstream commit c66f3b40b17d3dfc4b6abb5efde8e71c46971821 ]
In the function kvm_riscv_vcpu_set_reg_isa_ext, the original code
used incorrect reg_subtype labels KVM_REG_RISCV_SBI_MULTI_EN/DIS.
These have been corrected to KVM_REG_RISCV_ISA_MULTI_EN/DIS respectively.
Although they are numerically equivalent, the actual processing
will not result in errors, but it may lead to ambiguous code semantics.
Fixes: 613029442a4b ("RISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions")
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/ff1c6771a67d660db94372ac9aaa40f51e5e0090.1716429371.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/riscv/kvm/vcpu_onereg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/kvm/vcpu_onereg.c b/arch/riscv/kvm/vcpu_onereg.c
index b7e0e03c69b1e..d520b25d85616 100644
--- a/arch/riscv/kvm/vcpu_onereg.c
+++ b/arch/riscv/kvm/vcpu_onereg.c
@@ -614,9 +614,9 @@ static int kvm_riscv_vcpu_set_reg_isa_ext(struct kvm_vcpu *vcpu,
switch (reg_subtype) {
case KVM_REG_RISCV_ISA_SINGLE:
return riscv_vcpu_set_isa_ext_single(vcpu, reg_num, reg_val);
- case KVM_REG_RISCV_SBI_MULTI_EN:
+ case KVM_REG_RISCV_ISA_MULTI_EN:
return riscv_vcpu_set_isa_ext_multi(vcpu, reg_num, reg_val, true);
- case KVM_REG_RISCV_SBI_MULTI_DIS:
+ case KVM_REG_RISCV_ISA_MULTI_DIS:
return riscv_vcpu_set_isa_ext_multi(vcpu, reg_num, reg_val, false);
default:
return -ENOENT;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 016/267] ax25: Fix refcount imbalance on inbound connections
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (14 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 015/267] RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 017/267] ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() Greg Kroah-Hartman
` (262 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Lars Kellogg-Stedman, Duoming Zhou,
Dan Cross, Chris Maness, Dan Carpenter, Jakub Kicinski,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lars Kellogg-Stedman <lars@oddbit.com>
[ Upstream commit 3c34fb0bd4a4237592c5ecb5b2e2531900c55774 ]
When releasing a socket in ax25_release(), we call netdev_put() to
decrease the refcount on the associated ax.25 device. However, the
execution path for accepting an incoming connection never calls
netdev_hold(). This imbalance leads to refcount errors, and ultimately
to kernel crashes.
A typical call trace for the above situation will start with one of the
following errors:
refcount_t: decrement hit 0; leaking memory.
refcount_t: underflow; use-after-free.
And will then have a trace like:
Call Trace:
<TASK>
? show_regs+0x64/0x70
? __warn+0x83/0x120
? refcount_warn_saturate+0xb2/0x100
? report_bug+0x158/0x190
? prb_read_valid+0x20/0x30
? handle_bug+0x3e/0x70
? exc_invalid_op+0x1c/0x70
? asm_exc_invalid_op+0x1f/0x30
? refcount_warn_saturate+0xb2/0x100
? refcount_warn_saturate+0xb2/0x100
ax25_release+0x2ad/0x360
__sock_release+0x35/0xa0
sock_close+0x19/0x20
[...]
On reboot (or any attempt to remove the interface), the kernel gets
stuck in an infinite loop:
unregister_netdevice: waiting for ax0 to become free. Usage count = 0
This patch corrects these issues by ensuring that we call netdev_hold()
and ax25_dev_hold() for new connections in ax25_accept(). This makes the
logic leading to ax25_accept() match the logic for ax25_bind(): in both
cases we increment the refcount, which is ultimately decremented in
ax25_release().
Fixes: 9fd75b66b8f6 ("ax25: Fix refcount leaks caused by ax25_cb_del()")
Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
Tested-by: Duoming Zhou <duoming@zju.edu.cn>
Tested-by: Dan Cross <crossd@gmail.com>
Tested-by: Chris Maness <christopher.maness@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240529210242.3346844-2-lars@oddbit.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ax25/af_ax25.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index 9d11d26e46c0e..26a3095bec462 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1378,8 +1378,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, int flags,
{
struct sk_buff *skb;
struct sock *newsk;
+ ax25_dev *ax25_dev;
DEFINE_WAIT(wait);
struct sock *sk;
+ ax25_cb *ax25;
int err = 0;
if (sock->state != SS_UNCONNECTED)
@@ -1434,6 +1436,10 @@ static int ax25_accept(struct socket *sock, struct socket *newsock, int flags,
kfree_skb(skb);
sk_acceptq_removed(sk);
newsock->state = SS_CONNECTED;
+ ax25 = sk_to_ax25(newsk);
+ ax25_dev = ax25->ax25_dev;
+ netdev_hold(ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);
+ ax25_dev_hold(ax25_dev);
out:
release_sock(sk);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 017/267] ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (15 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 016/267] ax25: Fix refcount imbalance on inbound connections Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 018/267] net/ncsi: Simplify Kconfig/dts control flow Greg Kroah-Hartman
` (261 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Dan Carpenter, Duoming Zhou,
Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou <duoming@zju.edu.cn>
[ Upstream commit 166fcf86cd34e15c7f383eda4642d7a212393008 ]
The object "ax25_dev" is managed by reference counting. Thus it should
not be directly released by kfree(), replace with ax25_dev_put().
Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs")
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240530051733.11416-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ax25/ax25_dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ax25/ax25_dev.c b/net/ax25/ax25_dev.c
index c9d55b99a7a57..67ae6b8c52989 100644
--- a/net/ax25/ax25_dev.c
+++ b/net/ax25/ax25_dev.c
@@ -193,7 +193,7 @@ void __exit ax25_dev_free(void)
list_for_each_entry_safe(s, n, &ax25_dev_list, list) {
netdev_put(s->dev, &s->dev_tracker);
list_del(&s->list);
- kfree(s);
+ ax25_dev_put(s);
}
spin_unlock_bh(&ax25_dev_lock);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 018/267] net/ncsi: Simplify Kconfig/dts control flow
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (16 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 017/267] ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 019/267] net/ncsi: Fix the multi thread manner of NCSI driver Greg Kroah-Hartman
` (260 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Peter Delevoryas, David S. Miller,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Delevoryas <peter@pjd.dev>
[ Upstream commit c797ce168930ce3d62a9b7fc4d7040963ee6a01e ]
Background:
1. CONFIG_NCSI_OEM_CMD_KEEP_PHY
If this is enabled, we send an extra OEM Intel command in the probe
sequence immediately after discovering a channel (e.g. after "Clear
Initial State").
2. CONFIG_NCSI_OEM_CMD_GET_MAC
If this is enabled, we send one of 3 OEM "Get MAC Address" commands from
Broadcom, Mellanox (Nvidida), and Intel in the *configuration* sequence
for a channel.
3. mellanox,multi-host (or mlx,multi-host)
Introduced by this patch:
https://lore.kernel.org/all/20200108234341.2590674-1-vijaykhemka@fb.com/
Which was actually originally from cosmo.chou@quantatw.com:
https://github.com/facebook/openbmc-linux/commit/9f132a10ec48db84613519258cd8a317fb9c8f1b
Cosmo claimed that the Nvidia ConnectX-4 and ConnectX-6 NIC's don't
respond to Get Version ID, et. al in the probe sequence unless you send
the Set MC Affinity command first.
Problem Statement:
We've been using a combination of #ifdef code blocks and IS_ENABLED()
conditions to conditionally send these OEM commands.
It makes adding any new code around these commands hard to understand.
Solution:
In this patch, I just want to remove the conditionally compiled blocks
of code, and always use IS_ENABLED(...) to do dynamic control flow.
I don't think the small amount of code this adds to non-users of the OEM
Kconfigs is a big deal.
Signed-off-by: Peter Delevoryas <peter@pjd.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: e85e271dec02 ("net/ncsi: Fix the multi thread manner of NCSI driver")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ncsi/ncsi-manage.c | 20 +++-----------------
1 file changed, 3 insertions(+), 17 deletions(-)
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index d9da942ad53dd..f3d7fe86fea13 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -689,8 +689,6 @@ static int set_one_vid(struct ncsi_dev_priv *ndp, struct ncsi_channel *nc,
return 0;
}
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY)
-
static int ncsi_oem_keep_phy_intel(struct ncsi_cmd_arg *nca)
{
unsigned char data[NCSI_OEM_INTEL_CMD_KEEP_PHY_LEN];
@@ -716,10 +714,6 @@ static int ncsi_oem_keep_phy_intel(struct ncsi_cmd_arg *nca)
return ret;
}
-#endif
-
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
-
/* NCSI OEM Command APIs */
static int ncsi_oem_gma_handler_bcm(struct ncsi_cmd_arg *nca)
{
@@ -856,8 +850,6 @@ static int ncsi_gma_handler(struct ncsi_cmd_arg *nca, unsigned int mf_id)
return nch->handler(nca);
}
-#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */
-
/* Determine if a given channel from the channel_queue should be used for Tx */
static bool ncsi_channel_is_tx(struct ncsi_dev_priv *ndp,
struct ncsi_channel *nc)
@@ -1039,20 +1031,18 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
goto error;
}
- nd->state = ncsi_dev_state_config_oem_gma;
+ nd->state = IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
+ ? ncsi_dev_state_config_oem_gma
+ : ncsi_dev_state_config_clear_vids;
break;
case ncsi_dev_state_config_oem_gma:
nd->state = ncsi_dev_state_config_clear_vids;
- ret = -1;
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
nca.type = NCSI_PKT_CMD_OEM;
nca.package = np->id;
nca.channel = nc->id;
ndp->pending_req_num = 1;
ret = ncsi_gma_handler(&nca, nc->version.mf_id);
-#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */
-
if (ret < 0)
schedule_work(&ndp->work);
@@ -1404,7 +1394,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
schedule_work(&ndp->work);
break;
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_GET_MAC)
case ncsi_dev_state_probe_mlx_gma:
ndp->pending_req_num = 1;
@@ -1429,7 +1418,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_cis;
break;
-#endif /* CONFIG_NCSI_OEM_CMD_GET_MAC */
case ncsi_dev_state_probe_cis:
ndp->pending_req_num = NCSI_RESERVED_CHANNEL;
@@ -1447,7 +1435,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY))
nd->state = ncsi_dev_state_probe_keep_phy;
break;
-#if IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY)
case ncsi_dev_state_probe_keep_phy:
ndp->pending_req_num = 1;
@@ -1460,7 +1447,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_gvi;
break;
-#endif /* CONFIG_NCSI_OEM_CMD_KEEP_PHY */
case ncsi_dev_state_probe_gvi:
case ncsi_dev_state_probe_gc:
case ncsi_dev_state_probe_gls:
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 019/267] net/ncsi: Fix the multi thread manner of NCSI driver
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (17 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 018/267] net/ncsi: Simplify Kconfig/dts control flow Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 020/267] net: phy: micrel: fix KSZ9477 PHY issues after suspend/resume Greg Kroah-Hartman
` (259 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, DelphineCCChiu, Jakub Kicinski,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: DelphineCCChiu <delphine_cc_chiu@wiwynn.com>
[ Upstream commit e85e271dec0270982afed84f70dc37703fcc1d52 ]
Currently NCSI driver will send several NCSI commands back to back without
waiting the response of previous NCSI command or timeout in some state
when NIC have multi channel. This operation against the single thread
manner defined by NCSI SPEC(section 6.3.2.3 in DSP0222_1.1.1)
According to NCSI SPEC(section 6.2.13.1 in DSP0222_1.1.1), we should probe
one channel at a time by sending NCSI commands (Clear initial state, Get
version ID, Get capabilities...), than repeat this steps until the max
number of channels which we got from NCSI command (Get capabilities) has
been probed.
Fixes: e6f44ed6d04d ("net/ncsi: Package and channel management")
Signed-off-by: DelphineCCChiu <delphine_cc_chiu@wiwynn.com>
Link: https://lore.kernel.org/r/20240529065856.825241-1-delphine_cc_chiu@wiwynn.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ncsi/internal.h | 2 ++
net/ncsi/ncsi-manage.c | 73 +++++++++++++++++++++---------------------
net/ncsi/ncsi-rsp.c | 4 ++-
3 files changed, 41 insertions(+), 38 deletions(-)
diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h
index 374412ed780b6..ef0f8f73826f5 100644
--- a/net/ncsi/internal.h
+++ b/net/ncsi/internal.h
@@ -325,6 +325,7 @@ struct ncsi_dev_priv {
spinlock_t lock; /* Protect the NCSI device */
unsigned int package_probe_id;/* Current ID during probe */
unsigned int package_num; /* Number of packages */
+ unsigned int channel_probe_id;/* Current cahnnel ID during probe */
struct list_head packages; /* List of packages */
struct ncsi_channel *hot_channel; /* Channel was ever active */
struct ncsi_request requests[256]; /* Request table */
@@ -343,6 +344,7 @@ struct ncsi_dev_priv {
bool multi_package; /* Enable multiple packages */
bool mlx_multi_host; /* Enable multi host Mellanox */
u32 package_whitelist; /* Packages to configure */
+ unsigned char channel_count; /* Num of channels to probe */
};
struct ncsi_cmd_arg {
diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index f3d7fe86fea13..90c6cf676221a 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -510,17 +510,19 @@ static void ncsi_suspend_channel(struct ncsi_dev_priv *ndp)
break;
case ncsi_dev_state_suspend_gls:
- ndp->pending_req_num = np->channel_num;
+ ndp->pending_req_num = 1;
nca.type = NCSI_PKT_CMD_GLS;
nca.package = np->id;
+ nca.channel = ndp->channel_probe_id;
+ ret = ncsi_xmit_cmd(&nca);
+ if (ret)
+ goto error;
+ ndp->channel_probe_id++;
- nd->state = ncsi_dev_state_suspend_dcnt;
- NCSI_FOR_EACH_CHANNEL(np, nc) {
- nca.channel = nc->id;
- ret = ncsi_xmit_cmd(&nca);
- if (ret)
- goto error;
+ if (ndp->channel_probe_id == ndp->channel_count) {
+ ndp->channel_probe_id = 0;
+ nd->state = ncsi_dev_state_suspend_dcnt;
}
break;
@@ -1340,7 +1342,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
{
struct ncsi_dev *nd = &ndp->ndev;
struct ncsi_package *np;
- struct ncsi_channel *nc;
struct ncsi_cmd_arg nca;
unsigned char index;
int ret;
@@ -1418,23 +1419,6 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_cis;
break;
- case ncsi_dev_state_probe_cis:
- ndp->pending_req_num = NCSI_RESERVED_CHANNEL;
-
- /* Clear initial state */
- nca.type = NCSI_PKT_CMD_CIS;
- nca.package = ndp->active_package->id;
- for (index = 0; index < NCSI_RESERVED_CHANNEL; index++) {
- nca.channel = index;
- ret = ncsi_xmit_cmd(&nca);
- if (ret)
- goto error;
- }
-
- nd->state = ncsi_dev_state_probe_gvi;
- if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY))
- nd->state = ncsi_dev_state_probe_keep_phy;
- break;
case ncsi_dev_state_probe_keep_phy:
ndp->pending_req_num = 1;
@@ -1447,14 +1431,17 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nd->state = ncsi_dev_state_probe_gvi;
break;
+ case ncsi_dev_state_probe_cis:
case ncsi_dev_state_probe_gvi:
case ncsi_dev_state_probe_gc:
case ncsi_dev_state_probe_gls:
np = ndp->active_package;
- ndp->pending_req_num = np->channel_num;
+ ndp->pending_req_num = 1;
- /* Retrieve version, capability or link status */
- if (nd->state == ncsi_dev_state_probe_gvi)
+ /* Clear initial state Retrieve version, capability or link status */
+ if (nd->state == ncsi_dev_state_probe_cis)
+ nca.type = NCSI_PKT_CMD_CIS;
+ else if (nd->state == ncsi_dev_state_probe_gvi)
nca.type = NCSI_PKT_CMD_GVI;
else if (nd->state == ncsi_dev_state_probe_gc)
nca.type = NCSI_PKT_CMD_GC;
@@ -1462,19 +1449,29 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
nca.type = NCSI_PKT_CMD_GLS;
nca.package = np->id;
- NCSI_FOR_EACH_CHANNEL(np, nc) {
- nca.channel = nc->id;
- ret = ncsi_xmit_cmd(&nca);
- if (ret)
- goto error;
- }
+ nca.channel = ndp->channel_probe_id;
- if (nd->state == ncsi_dev_state_probe_gvi)
+ ret = ncsi_xmit_cmd(&nca);
+ if (ret)
+ goto error;
+
+ if (nd->state == ncsi_dev_state_probe_cis) {
+ nd->state = ncsi_dev_state_probe_gvi;
+ if (IS_ENABLED(CONFIG_NCSI_OEM_CMD_KEEP_PHY) && ndp->channel_probe_id == 0)
+ nd->state = ncsi_dev_state_probe_keep_phy;
+ } else if (nd->state == ncsi_dev_state_probe_gvi) {
nd->state = ncsi_dev_state_probe_gc;
- else if (nd->state == ncsi_dev_state_probe_gc)
+ } else if (nd->state == ncsi_dev_state_probe_gc) {
nd->state = ncsi_dev_state_probe_gls;
- else
+ } else {
+ nd->state = ncsi_dev_state_probe_cis;
+ ndp->channel_probe_id++;
+ }
+
+ if (ndp->channel_probe_id == ndp->channel_count) {
+ ndp->channel_probe_id = 0;
nd->state = ncsi_dev_state_probe_dp;
+ }
break;
case ncsi_dev_state_probe_dp:
ndp->pending_req_num = 1;
@@ -1775,6 +1772,7 @@ struct ncsi_dev *ncsi_register_dev(struct net_device *dev,
ndp->requests[i].ndp = ndp;
timer_setup(&ndp->requests[i].timer, ncsi_request_timeout, 0);
}
+ ndp->channel_count = NCSI_RESERVED_CHANNEL;
spin_lock_irqsave(&ncsi_dev_lock, flags);
list_add_tail_rcu(&ndp->node, &ncsi_dev_list);
@@ -1808,6 +1806,7 @@ int ncsi_start_dev(struct ncsi_dev *nd)
if (!(ndp->flags & NCSI_DEV_PROBED)) {
ndp->package_probe_id = 0;
+ ndp->channel_probe_id = 0;
nd->state = ncsi_dev_state_probe;
schedule_work(&ndp->work);
return 0;
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c
index 480e80e3c2836..f22d67cb04d37 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -795,12 +795,13 @@ static int ncsi_rsp_handler_gc(struct ncsi_request *nr)
struct ncsi_rsp_gc_pkt *rsp;
struct ncsi_dev_priv *ndp = nr->ndp;
struct ncsi_channel *nc;
+ struct ncsi_package *np;
size_t size;
/* Find the channel */
rsp = (struct ncsi_rsp_gc_pkt *)skb_network_header(nr->rsp);
ncsi_find_package_and_channel(ndp, rsp->rsp.common.channel,
- NULL, &nc);
+ &np, &nc);
if (!nc)
return -ENODEV;
@@ -835,6 +836,7 @@ static int ncsi_rsp_handler_gc(struct ncsi_request *nr)
*/
nc->vlan_filter.bitmap = U64_MAX;
nc->vlan_filter.n_vids = rsp->vlan_cnt;
+ np->ndp->channel_count = rsp->channel_cnt;
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 020/267] net: phy: micrel: fix KSZ9477 PHY issues after suspend/resume
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (18 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 019/267] net/ncsi: Fix the multi thread manner of NCSI driver Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 021/267] bpf: Store ref_ctr_offsets values in bpf_uprobe array Greg Kroah-Hartman
` (258 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Tristram Ha, David S. Miller,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tristram Ha <tristram.ha@microchip.com>
[ Upstream commit 6149db4997f582e958da675092f21c666e3b67b7 ]
When the PHY is powered up after powered down most of the registers are
reset, so the PHY setup code needs to be done again. In addition the
interrupt register will need to be setup again so that link status
indication works again.
Fixes: 26dd2974c5b5 ("net: phy: micrel: Move KSZ9477 errata fixes to PHY driver")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/phy/micrel.c | 62 ++++++++++++++++++++++++++++++++++++----
1 file changed, 56 insertions(+), 6 deletions(-)
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index fc31fcfb0cdb4..048704758b150 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1821,7 +1821,7 @@ static const struct ksz9477_errata_write ksz9477_errata_writes[] = {
{0x1c, 0x20, 0xeeee},
};
-static int ksz9477_config_init(struct phy_device *phydev)
+static int ksz9477_phy_errata(struct phy_device *phydev)
{
int err;
int i;
@@ -1849,16 +1849,30 @@ static int ksz9477_config_init(struct phy_device *phydev)
return err;
}
+ err = genphy_restart_aneg(phydev);
+ if (err)
+ return err;
+
+ return err;
+}
+
+static int ksz9477_config_init(struct phy_device *phydev)
+{
+ int err;
+
+ /* Only KSZ9897 family of switches needs this fix. */
+ if ((phydev->phy_id & 0xf) == 1) {
+ err = ksz9477_phy_errata(phydev);
+ if (err)
+ return err;
+ }
+
/* According to KSZ9477 Errata DS80000754C (Module 4) all EEE modes
* in this switch shall be regarded as broken.
*/
if (phydev->dev_flags & MICREL_NO_EEE)
phydev->eee_broken_modes = -1;
- err = genphy_restart_aneg(phydev);
- if (err)
- return err;
-
return kszphy_config_init(phydev);
}
@@ -1967,6 +1981,42 @@ static int kszphy_resume(struct phy_device *phydev)
return 0;
}
+static int ksz9477_resume(struct phy_device *phydev)
+{
+ int ret;
+
+ /* No need to initialize registers if not powered down. */
+ ret = phy_read(phydev, MII_BMCR);
+ if (ret < 0)
+ return ret;
+ if (!(ret & BMCR_PDOWN))
+ return 0;
+
+ genphy_resume(phydev);
+
+ /* After switching from power-down to normal mode, an internal global
+ * reset is automatically generated. Wait a minimum of 1 ms before
+ * read/write access to the PHY registers.
+ */
+ usleep_range(1000, 2000);
+
+ /* Only KSZ9897 family of switches needs this fix. */
+ if ((phydev->phy_id & 0xf) == 1) {
+ ret = ksz9477_phy_errata(phydev);
+ if (ret)
+ return ret;
+ }
+
+ /* Enable PHY Interrupts */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->interrupts = PHY_INTERRUPT_ENABLED;
+ if (phydev->drv->config_intr)
+ phydev->drv->config_intr(phydev);
+ }
+
+ return 0;
+}
+
static int kszphy_probe(struct phy_device *phydev)
{
const struct kszphy_type *type = phydev->drv->driver_data;
@@ -4916,7 +4966,7 @@ static struct phy_driver ksphy_driver[] = {
.config_intr = kszphy_config_intr,
.handle_interrupt = kszphy_handle_interrupt,
.suspend = genphy_suspend,
- .resume = genphy_resume,
+ .resume = ksz9477_resume,
.get_features = ksz9477_get_features,
} };
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 021/267] bpf: Store ref_ctr_offsets values in bpf_uprobe array
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (19 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 020/267] net: phy: micrel: fix KSZ9477 PHY issues after suspend/resume Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 022/267] bpf: Optimize the free of inner map Greg Kroah-Hartman
` (257 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jiri Olsa, Andrii Nakryiko, Song Liu,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa <jolsa@kernel.org>
[ Upstream commit 4930b7f53a298533bc31d7540b6ea8b79a000331 ]
We will need to return ref_ctr_offsets values through link_info
interface in following change, so we need to keep them around.
Storing ref_ctr_offsets values directly into bpf_uprobe array.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20231125193130.834322-3-jolsa@kernel.org
Stable-dep-of: 2884dc7d08d9 ("bpf: Fix a potential use-after-free in bpf_link_free()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/trace/bpf_trace.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 1e79084a9d9d2..8edbafe0d4cdf 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -3030,6 +3030,7 @@ struct bpf_uprobe_multi_link;
struct bpf_uprobe {
struct bpf_uprobe_multi_link *link;
loff_t offset;
+ unsigned long ref_ctr_offset;
u64 cookie;
struct uprobe_consumer consumer;
};
@@ -3169,7 +3170,6 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
{
struct bpf_uprobe_multi_link *link = NULL;
unsigned long __user *uref_ctr_offsets;
- unsigned long *ref_ctr_offsets = NULL;
struct bpf_link_primer link_primer;
struct bpf_uprobe *uprobes = NULL;
struct task_struct *task = NULL;
@@ -3244,18 +3244,12 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
if (!uprobes || !link)
goto error_free;
- if (uref_ctr_offsets) {
- ref_ctr_offsets = kvcalloc(cnt, sizeof(*ref_ctr_offsets), GFP_KERNEL);
- if (!ref_ctr_offsets)
- goto error_free;
- }
-
for (i = 0; i < cnt; i++) {
if (ucookies && __get_user(uprobes[i].cookie, ucookies + i)) {
err = -EFAULT;
goto error_free;
}
- if (uref_ctr_offsets && __get_user(ref_ctr_offsets[i], uref_ctr_offsets + i)) {
+ if (uref_ctr_offsets && __get_user(uprobes[i].ref_ctr_offset, uref_ctr_offsets + i)) {
err = -EFAULT;
goto error_free;
}
@@ -3286,7 +3280,7 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
for (i = 0; i < cnt; i++) {
err = uprobe_register_refctr(d_real_inode(link->path.dentry),
uprobes[i].offset,
- ref_ctr_offsets ? ref_ctr_offsets[i] : 0,
+ uprobes[i].ref_ctr_offset,
&uprobes[i].consumer);
if (err) {
bpf_uprobe_unregister(&path, uprobes, i);
@@ -3298,11 +3292,9 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
if (err)
goto error_free;
- kvfree(ref_ctr_offsets);
return bpf_link_settle(&link_primer);
error_free:
- kvfree(ref_ctr_offsets);
kvfree(uprobes);
kfree(link);
if (task)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 022/267] bpf: Optimize the free of inner map
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (20 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 021/267] bpf: Store ref_ctr_offsets values in bpf_uprobe array Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 023/267] bpf: Fix a potential use-after-free in bpf_link_free() Greg Kroah-Hartman
` (256 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hou Tao, Alexei Starovoitov,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hou Tao <houtao1@huawei.com>
[ Upstream commit af66bfd3c8538ed21cf72af18426fc4a408665cf ]
When removing the inner map from the outer map, the inner map will be
freed after one RCU grace period and one RCU tasks trace grace
period, so it is certain that the bpf program, which may access the
inner map, has exited before the inner map is freed.
However there is no need to wait for one RCU tasks trace grace period if
the outer map is only accessed by non-sleepable program. So adding
sleepable_refcnt in bpf_map and increasing sleepable_refcnt when adding
the outer map into env->used_maps for sleepable program. Although the
max number of bpf program is INT_MAX - 1, the number of bpf programs
which are being loaded may be greater than INT_MAX, so using atomic64_t
instead of atomic_t for sleepable_refcnt. When removing the inner map
from the outer map, using sleepable_refcnt to decide whether or not a
RCU tasks trace grace period is needed before freeing the inner map.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20231204140425.1480317-6-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 2884dc7d08d9 ("bpf: Fix a potential use-after-free in bpf_link_free()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/bpf.h | 2 ++
kernel/bpf/core.c | 4 ++++
kernel/bpf/map_in_map.c | 14 +++++++++-----
kernel/bpf/syscall.c | 8 ++++++++
kernel/bpf/verifier.c | 4 +++-
5 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 2ebb5d4d43dc6..e4cd28c38b825 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -296,6 +296,8 @@ struct bpf_map {
bool bypass_spec_v1;
bool frozen; /* write-once; write-protected by freeze_mutex */
bool free_after_mult_rcu_gp;
+ bool free_after_rcu_gp;
+ atomic64_t sleepable_refcnt;
s64 __percpu *elem_count;
};
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 1333273a71ded..05445a4d55181 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2673,12 +2673,16 @@ void __bpf_free_used_maps(struct bpf_prog_aux *aux,
struct bpf_map **used_maps, u32 len)
{
struct bpf_map *map;
+ bool sleepable;
u32 i;
+ sleepable = aux->sleepable;
for (i = 0; i < len; i++) {
map = used_maps[i];
if (map->ops->map_poke_untrack)
map->ops->map_poke_untrack(map, aux);
+ if (sleepable)
+ atomic64_dec(&map->sleepable_refcnt);
bpf_map_put(map);
}
}
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
index 3248ff5d81617..8ef269e66ba50 100644
--- a/kernel/bpf/map_in_map.c
+++ b/kernel/bpf/map_in_map.c
@@ -131,12 +131,16 @@ void bpf_map_fd_put_ptr(struct bpf_map *map, void *ptr, bool need_defer)
{
struct bpf_map *inner_map = ptr;
- /* The inner map may still be used by both non-sleepable and sleepable
- * bpf program, so free it after one RCU grace period and one tasks
- * trace RCU grace period.
+ /* Defer the freeing of inner map according to the sleepable attribute
+ * of bpf program which owns the outer map, so unnecessary waiting for
+ * RCU tasks trace grace period can be avoided.
*/
- if (need_defer)
- WRITE_ONCE(inner_map->free_after_mult_rcu_gp, true);
+ if (need_defer) {
+ if (atomic64_read(&map->sleepable_refcnt))
+ WRITE_ONCE(inner_map->free_after_mult_rcu_gp, true);
+ else
+ WRITE_ONCE(inner_map->free_after_rcu_gp, true);
+ }
bpf_map_put(inner_map);
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e886157a9efbb..e9a68c6043ce5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -753,8 +753,11 @@ void bpf_map_put(struct bpf_map *map)
/* bpf_map_free_id() must be called first */
bpf_map_free_id(map);
+ WARN_ON_ONCE(atomic64_read(&map->sleepable_refcnt));
if (READ_ONCE(map->free_after_mult_rcu_gp))
call_rcu_tasks_trace(&map->rcu, bpf_map_free_mult_rcu_gp);
+ else if (READ_ONCE(map->free_after_rcu_gp))
+ call_rcu(&map->rcu, bpf_map_free_rcu_gp);
else
bpf_map_free_in_work(map);
}
@@ -5358,6 +5361,11 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
goto out_unlock;
}
+ /* The bpf program will not access the bpf map, but for the sake of
+ * simplicity, increase sleepable_refcnt for sleepable program as well.
+ */
+ if (prog->aux->sleepable)
+ atomic64_inc(&map->sleepable_refcnt);
memcpy(used_maps_new, used_maps_old,
sizeof(used_maps_old[0]) * prog->aux->used_map_cnt);
used_maps_new[prog->aux->used_map_cnt] = map;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 24d7a32f1710e..ec0464c075bb4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -17732,10 +17732,12 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
return -E2BIG;
}
+ if (env->prog->aux->sleepable)
+ atomic64_inc(&map->sleepable_refcnt);
/* hold the map. If the program is rejected by verifier,
* the map will be released by release_maps() or it
* will be used by the valid program until it's unloaded
- * and all maps are released in free_used_maps()
+ * and all maps are released in bpf_free_used_maps()
*/
bpf_map_inc(map);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 023/267] bpf: Fix a potential use-after-free in bpf_link_free()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (21 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 022/267] bpf: Optimize the free of inner map Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 024/267] KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent Greg Kroah-Hartman
` (255 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+1989ee16d94720836244,
Cong Wang, Daniel Borkmann, Jiri Olsa, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang <cong.wang@bytedance.com>
[ Upstream commit 2884dc7d08d98a89d8d65121524bb7533183a63a ]
After commit 1a80dbcb2dba, bpf_link can be freed by
link->ops->dealloc_deferred, but the code still tests and uses
link->ops->dealloc afterward, which leads to a use-after-free as
reported by syzbot. Actually, one of them should be sufficient, so
just call one of them instead of both. Also add a WARN_ON() in case
of any problematic implementation.
Fixes: 1a80dbcb2dba ("bpf: support deferring bpf_link dealloc to after RCU grace period")
Reported-by: syzbot+1989ee16d94720836244@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240602182703.207276-1-xiyou.wangcong@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/bpf/syscall.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e9a68c6043ce5..65df92f5b1922 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2830,6 +2830,7 @@ static int bpf_obj_get(const union bpf_attr *attr)
void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
const struct bpf_link_ops *ops, struct bpf_prog *prog)
{
+ WARN_ON(ops->dealloc && ops->dealloc_deferred);
atomic64_set(&link->refcnt, 1);
link->type = type;
link->id = 0;
@@ -2888,16 +2889,17 @@ static void bpf_link_defer_dealloc_mult_rcu_gp(struct rcu_head *rcu)
/* bpf_link_free is guaranteed to be called from process context */
static void bpf_link_free(struct bpf_link *link)
{
+ const struct bpf_link_ops *ops = link->ops;
bool sleepable = false;
bpf_link_free_id(link->id);
if (link->prog) {
sleepable = link->prog->aux->sleepable;
/* detach BPF program, clean up used resources */
- link->ops->release(link);
+ ops->release(link);
bpf_prog_put(link->prog);
}
- if (link->ops->dealloc_deferred) {
+ if (ops->dealloc_deferred) {
/* schedule BPF link deallocation; if underlying BPF program
* is sleepable, we need to first wait for RCU tasks trace
* sync, then go through "classic" RCU grace period
@@ -2906,9 +2908,8 @@ static void bpf_link_free(struct bpf_link *link)
call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp);
else
call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp);
- }
- if (link->ops->dealloc)
- link->ops->dealloc(link);
+ } else if (ops->dealloc)
+ ops->dealloc(link);
}
static void bpf_link_put_deferred(struct work_struct *work)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 024/267] KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (22 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 023/267] bpf: Fix a potential use-after-free in bpf_link_free() Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 025/267] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests Greg Kroah-Hartman
` (254 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Ravi Bangoria, Paolo Bonzini,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ravi Bangoria <ravi.bangoria@amd.com>
[ Upstream commit d922056215617eedfbdbc29fe49953423686fe5e ]
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES
guests. So, prevent SEV-ES guests when LBRV support is missing.
[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
2023, Vol 2, 15.35.2 Enabling SEV-ES.
https://bugzilla.kernel.org/attachment.cgi?id=304653
Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading")
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240531044644.768-3-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kvm/svm/sev.c | 6 ++++++
arch/x86/kvm/svm/svm.c | 16 +++++++---------
arch/x86/kvm/svm/svm.h | 1 +
3 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index c5845f31c34dc..0e643d7a06d9e 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2264,6 +2264,12 @@ void __init sev_hardware_setup(void)
if (!boot_cpu_has(X86_FEATURE_SEV_ES))
goto out;
+ if (!lbrv) {
+ WARN_ONCE(!boot_cpu_has(X86_FEATURE_LBRV),
+ "LBRV must be present for SEV-ES support");
+ goto out;
+ }
+
/* Has the system been allocated ASIDs for SEV-ES? */
if (min_sev_asid == 1)
goto out;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1efbe8b33f6a1..9e084e22a12f7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -214,7 +214,7 @@ int vgif = true;
module_param(vgif, int, 0444);
/* enable/disable LBR virtualization */
-static int lbrv = true;
+int lbrv = true;
module_param(lbrv, int, 0444);
static int tsc_scaling = true;
@@ -5248,6 +5248,12 @@ static __init int svm_hardware_setup(void)
nrips = nrips && boot_cpu_has(X86_FEATURE_NRIPS);
+ if (lbrv) {
+ if (!boot_cpu_has(X86_FEATURE_LBRV))
+ lbrv = false;
+ else
+ pr_info("LBR virtualization supported\n");
+ }
/*
* Note, SEV setup consumes npt_enabled and enable_mmio_caching (which
* may be modified by svm_adjust_mmio_mask()), as well as nrips.
@@ -5301,14 +5307,6 @@ static __init int svm_hardware_setup(void)
svm_x86_ops.set_vnmi_pending = NULL;
}
-
- if (lbrv) {
- if (!boot_cpu_has(X86_FEATURE_LBRV))
- lbrv = false;
- else
- pr_info("LBR virtualization supported\n");
- }
-
if (!enable_pmu)
pr_info("PMU virtualization is disabled\n");
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index be67ab7fdd104..53bc4b0e388be 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -39,6 +39,7 @@ extern int vgif;
extern bool intercept_smi;
extern bool x2avic_enabled;
extern bool vnmi;
+extern int lbrv;
/*
* Clean bits in VMCB.
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 025/267] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (23 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 024/267] KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 026/267] KVM: SEV-ES: Delegate LBR virtualization to the processor Greg Kroah-Hartman
` (253 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexey Kardashevskiy, Tom Lendacky,
Michael Roth, Paolo Bonzini, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Roth <michael.roth@amd.com>
[ Upstream commit a26b7cd2254695f8258cc370f33280db0a9a3813 ]
When intercepts are enabled for MSR_IA32_XSS, the host will swap in/out
the guest-defined values while context-switching to/from guest mode.
However, in the case of SEV-ES, vcpu->arch.guest_state_protected is set,
so the guest-defined value is effectively ignored when switching to
guest mode with the understanding that the VMSA will handle swapping
in/out this register state.
However, SVM is still configured to intercept these accesses for SEV-ES
guests, so the values in the initial MSR_IA32_XSS are effectively
read-only, and a guest will experience undefined behavior if it actually
tries to write to this MSR. Fortunately, only CET/shadowstack makes use
of this register on SEV-ES-capable systems currently, which isn't yet
widely used, but this may become more of an issue in the future.
Additionally, enabling intercepts of MSR_IA32_XSS results in #VC
exceptions in the guest in certain paths that can lead to unexpected #VC
nesting levels. One example is SEV-SNP guests when handling #VC
exceptions for CPUID instructions involving leaf 0xD, subleaf 0x1, since
they will access MSR_IA32_XSS as part of servicing the CPUID #VC, then
generate another #VC when accessing MSR_IA32_XSS, which can lead to
guest crashes if an NMI occurs at that point in time. Running perf on a
guest while it is issuing such a sequence is one example where these can
be problematic.
Address this by disabling intercepts of MSR_IA32_XSS for SEV-ES guests
if the host/guest configuration allows it. If the host/guest
configuration doesn't allow for MSR_IA32_XSS, leave it intercepted so
that it can be caught by the existing checks in
kvm_{set,get}_msr_common() if the guest still attempts to access it.
Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading")
Cc: Alexey Kardashevskiy <aik@amd.com>
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20231016132819.1002933-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stable-dep-of: b7e4be0a224f ("KVM: SEV-ES: Delegate LBR virtualization to the processor")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kvm/svm/sev.c | 19 +++++++++++++++++++
arch/x86/kvm/svm/svm.c | 1 +
arch/x86/kvm/svm/svm.h | 2 +-
3 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0e643d7a06d9e..f809dcfacc8a3 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2994,6 +2994,25 @@ static void sev_es_vcpu_after_set_cpuid(struct vcpu_svm *svm)
set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, v_tsc_aux, v_tsc_aux);
}
+
+ /*
+ * For SEV-ES, accesses to MSR_IA32_XSS should not be intercepted if
+ * the host/guest supports its use.
+ *
+ * guest_can_use() checks a number of requirements on the host/guest to
+ * ensure that MSR_IA32_XSS is available, but it might report true even
+ * if X86_FEATURE_XSAVES isn't configured in the guest to ensure host
+ * MSR_IA32_XSS is always properly restored. For SEV-ES, it is better
+ * to further check that the guest CPUID actually supports
+ * X86_FEATURE_XSAVES so that accesses to MSR_IA32_XSS by misbehaved
+ * guests will still get intercepted and caught in the normal
+ * kvm_emulate_rdmsr()/kvm_emulated_wrmsr() paths.
+ */
+ if (guest_can_use(vcpu, X86_FEATURE_XSAVES) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 1, 1);
+ else
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 0, 0);
}
void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9e084e22a12f7..08f1397138c80 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -103,6 +103,7 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
{ .index = MSR_IA32_LASTINTTOIP, .always = false },
+ { .index = MSR_IA32_XSS, .always = false },
{ .index = MSR_EFER, .always = false },
{ .index = MSR_IA32_CR_PAT, .always = false },
{ .index = MSR_AMD64_SEV_ES_GHCB, .always = true },
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 53bc4b0e388be..fb0ac8497fb20 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -30,7 +30,7 @@
#define IOPM_SIZE PAGE_SIZE * 3
#define MSRPM_SIZE PAGE_SIZE * 2
-#define MAX_DIRECT_ACCESS_MSRS 46
+#define MAX_DIRECT_ACCESS_MSRS 47
#define MSRPM_OFFSETS 32
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 026/267] KVM: SEV-ES: Delegate LBR virtualization to the processor
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (24 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 025/267] KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 027/267] vmxnet3: disable rx data ring on dma allocation failure Greg Kroah-Hartman
` (252 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Nikunj A Dadhania, Ravi Bangoria,
Paolo Bonzini, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ravi Bangoria <ravi.bangoria@amd.com>
[ Upstream commit b7e4be0a224fe5c6be30c1c8bdda8d2317ad6ba4 ]
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES
guests. Although KVM currently enforces LBRV for SEV-ES guests, there
are multiple issues with it:
o MSR_IA32_DEBUGCTLMSR is still intercepted. Since MSR_IA32_DEBUGCTLMSR
interception is used to dynamically toggle LBRV for performance reasons,
this can be fatal for SEV-ES guests. For ex SEV-ES guest on Zen3:
[guest ~]# wrmsr 0x1d9 0x4
KVM: entry failed, hardware error 0xffffffff
EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000
Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests.
No additional save/restore logic is required since MSR_IA32_DEBUGCTLMSR
is of swap type A.
o KVM will disable LBRV if userspace sets MSR_IA32_DEBUGCTLMSR before the
VMSA is encrypted. Fix this by moving LBRV enablement code post VMSA
encryption.
[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June
2023, Vol 2, 15.35.2 Enabling SEV-ES.
https://bugzilla.kernel.org/attachment.cgi?id=304653
Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading")
Co-developed-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Message-ID: <20240531044644.768-4-ravi.bangoria@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kvm/svm/sev.c | 13 ++++++++-----
arch/x86/kvm/svm/svm.c | 8 +++++++-
arch/x86/kvm/svm/svm.h | 3 ++-
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index f809dcfacc8a3..99e72b8a96ac0 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -664,6 +664,14 @@ static int __sev_launch_update_vmsa(struct kvm *kvm, struct kvm_vcpu *vcpu,
return ret;
vcpu->arch.guest_state_protected = true;
+
+ /*
+ * SEV-ES guest mandates LBR Virtualization to be _always_ ON. Enable it
+ * only after setting guest_state_protected because KVM_SET_MSRS allows
+ * dynamic toggling of LBRV (for performance reason) on write access to
+ * MSR_IA32_DEBUGCTLMSR when guest_state_protected is not set.
+ */
+ svm_enable_lbrv(vcpu);
return 0;
}
@@ -3035,7 +3043,6 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
struct kvm_vcpu *vcpu = &svm->vcpu;
svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE;
- svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK;
/*
* An SEV-ES guest requires a VMSA area that is a separate from the
@@ -3087,10 +3094,6 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
/* Clear intercepts on selected MSRs */
set_msr_interception(vcpu, svm->msrpm, MSR_EFER, 1, 1);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_CR_PAT, 1, 1);
- set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, 1, 1);
- set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 1, 1);
- set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
- set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
}
void sev_init_vmcb(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 08f1397138c80..e3c2acc1adc73 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -99,6 +99,7 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_IA32_SPEC_CTRL, .always = false },
{ .index = MSR_IA32_PRED_CMD, .always = false },
{ .index = MSR_IA32_FLUSH_CMD, .always = false },
+ { .index = MSR_IA32_DEBUGCTLMSR, .always = false },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
@@ -1008,7 +1009,7 @@ void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb)
vmcb_mark_dirty(to_vmcb, VMCB_LBR);
}
-static void svm_enable_lbrv(struct kvm_vcpu *vcpu)
+void svm_enable_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -1018,6 +1019,9 @@ static void svm_enable_lbrv(struct kvm_vcpu *vcpu)
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTFROMIP, 1, 1);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
+ if (sev_es_guest(vcpu->kvm))
+ set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
+
/* Move the LBR msrs to the vmcb02 so that the guest can see them. */
if (is_guest_mode(vcpu))
svm_copy_lbrs(svm->vmcb, svm->vmcb01.ptr);
@@ -1027,6 +1031,8 @@ static void svm_disable_lbrv(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm);
+
svm->vmcb->control.virt_ext &= ~LBR_CTL_ENABLE_MASK;
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTBRANCHTOIP, 0, 0);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index fb0ac8497fb20..37ada9808d9b5 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -30,7 +30,7 @@
#define IOPM_SIZE PAGE_SIZE * 3
#define MSRPM_SIZE PAGE_SIZE * 2
-#define MAX_DIRECT_ACCESS_MSRS 47
+#define MAX_DIRECT_ACCESS_MSRS 48
#define MSRPM_OFFSETS 32
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
@@ -542,6 +542,7 @@ u32 *svm_vcpu_alloc_msrpm(void);
void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm);
void svm_vcpu_free_msrpm(u32 *msrpm);
void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb);
+void svm_enable_lbrv(struct kvm_vcpu *vcpu);
void svm_update_lbrv(struct kvm_vcpu *vcpu);
int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 027/267] vmxnet3: disable rx data ring on dma allocation failure
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (25 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 026/267] KVM: SEV-ES: Delegate LBR virtualization to the processor Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:52 ` [PATCH 6.6 028/267] ipv6: ioam: block BH from ioam6_output() Greg Kroah-Hartman
` (251 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthias Stocker, Subbaraya Sundeep,
Ronak Doshi, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Stocker <mstocker@barracuda.com>
[ Upstream commit ffbe335b8d471f79b259e950cb20999700670456 ]
When vmxnet3_rq_create() fails to allocate memory for rq->data_ring.base,
the subsequent call to vmxnet3_rq_destroy_all_rxdataring does not reset
rq->data_ring.desc_size for the data ring that failed, which presumably
causes the hypervisor to reference it on packet reception.
To fix this bug, rq->data_ring.desc_size needs to be set to 0 to tell
the hypervisor to disable this feature.
[ 95.436876] kernel BUG at net/core/skbuff.c:207!
[ 95.439074] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 95.440411] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 6.9.3-dirty #1
[ 95.441558] Hardware name: VMware, Inc. VMware Virtual
Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[ 95.443481] RIP: 0010:skb_panic+0x4d/0x4f
[ 95.444404] Code: 4f 70 50 8b 87 c0 00 00 00 50 8b 87 bc 00 00 00 50
ff b7 d0 00 00 00 4c 8b 8f c8 00 00 00 48 c7 c7 68 e8 be 9f e8 63 58 f9
ff <0f> 0b 48 8b 14 24 48 c7 c1 d0 73 65 9f e8 a1 ff ff ff 48 8b 14 24
[ 95.447684] RSP: 0018:ffffa13340274dd0 EFLAGS: 00010246
[ 95.448762] RAX: 0000000000000089 RBX: ffff8fbbc72b02d0 RCX: 000000000000083f
[ 95.450148] RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000083f
[ 95.451520] RBP: 000000000000002d R08: 0000000000000000 R09: ffffa13340274c60
[ 95.452886] R10: ffffffffa04ed468 R11: 0000000000000002 R12: 0000000000000000
[ 95.454293] R13: ffff8fbbdab3c2d0 R14: ffff8fbbdbd829e0 R15: ffff8fbbdbd809e0
[ 95.455682] FS: 0000000000000000(0000) GS:ffff8fbeefd80000(0000) knlGS:0000000000000000
[ 95.457178] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 95.458340] CR2: 00007fd0d1f650c8 CR3: 0000000115f28000 CR4: 00000000000406f0
[ 95.459791] Call Trace:
[ 95.460515] <IRQ>
[ 95.461180] ? __die_body.cold+0x19/0x27
[ 95.462150] ? die+0x2e/0x50
[ 95.462976] ? do_trap+0xca/0x110
[ 95.463973] ? do_error_trap+0x6a/0x90
[ 95.464966] ? skb_panic+0x4d/0x4f
[ 95.465901] ? exc_invalid_op+0x50/0x70
[ 95.466849] ? skb_panic+0x4d/0x4f
[ 95.467718] ? asm_exc_invalid_op+0x1a/0x20
[ 95.468758] ? skb_panic+0x4d/0x4f
[ 95.469655] skb_put.cold+0x10/0x10
[ 95.470573] vmxnet3_rq_rx_complete+0x862/0x11e0 [vmxnet3]
[ 95.471853] vmxnet3_poll_rx_only+0x36/0xb0 [vmxnet3]
[ 95.473185] __napi_poll+0x2b/0x160
[ 95.474145] net_rx_action+0x2c6/0x3b0
[ 95.475115] handle_softirqs+0xe7/0x2a0
[ 95.476122] __irq_exit_rcu+0x97/0xb0
[ 95.477109] common_interrupt+0x85/0xa0
[ 95.478102] </IRQ>
[ 95.478846] <TASK>
[ 95.479603] asm_common_interrupt+0x26/0x40
[ 95.480657] RIP: 0010:pv_native_safe_halt+0xf/0x20
[ 95.481801] Code: 22 d7 e9 54 87 01 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d 93 ba 3b 00 fb f4 <e9> 2c 87 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90
[ 95.485563] RSP: 0018:ffffa133400ffe58 EFLAGS: 00000246
[ 95.486882] RAX: 0000000000004000 RBX: ffff8fbbc1d14064 RCX: 0000000000000000
[ 95.488477] RDX: ffff8fbeefd80000 RSI: ffff8fbbc1d14000 RDI: 0000000000000001
[ 95.490067] RBP: ffff8fbbc1d14064 R08: ffffffffa0652260 R09: 00000000000010d3
[ 95.491683] R10: 0000000000000018 R11: ffff8fbeefdb4764 R12: ffffffffa0652260
[ 95.493389] R13: ffffffffa06522e0 R14: 0000000000000001 R15: 0000000000000000
[ 95.495035] acpi_safe_halt+0x14/0x20
[ 95.496127] acpi_idle_do_entry+0x2f/0x50
[ 95.497221] acpi_idle_enter+0x7f/0xd0
[ 95.498272] cpuidle_enter_state+0x81/0x420
[ 95.499375] cpuidle_enter+0x2d/0x40
[ 95.500400] do_idle+0x1e5/0x240
[ 95.501385] cpu_startup_entry+0x29/0x30
[ 95.502422] start_secondary+0x11c/0x140
[ 95.503454] common_startup_64+0x13e/0x141
[ 95.504466] </TASK>
[ 95.505197] Modules linked in: nft_fib_inet nft_fib_ipv4
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6
nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 rfkill ip_set nf_tables vsock_loopback
vmw_vsock_virtio_transport_common qrtr vmw_vsock_vmci_transport vsock
sunrpc binfmt_misc pktcdvd vmw_balloon pcspkr vmw_vmci i2c_piix4 joydev
loop dm_multipath nfnetlink zram crct10dif_pclmul crc32_pclmul vmwgfx
crc32c_intel polyval_clmulni polyval_generic ghash_clmulni_intel
sha512_ssse3 sha256_ssse3 vmxnet3 sha1_ssse3 drm_ttm_helper vmw_pvscsi
ttm ata_generic pata_acpi serio_raw scsi_dh_rdac scsi_dh_emc
scsi_dh_alua ip6_tables ip_tables fuse
[ 95.516536] ---[ end trace 0000000000000000 ]---
Fixes: 6f4833383e85 ("net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete()")
Signed-off-by: Matthias Stocker <mstocker@barracuda.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Ronak Doshi <ronak.doshi@broadcom.com>
Link: https://lore.kernel.org/r/20240531103711.101961-1-mstocker@barracuda.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/vmxnet3/vmxnet3_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 0578864792b60..beebe09eb88ff 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -2034,8 +2034,8 @@ vmxnet3_rq_destroy_all_rxdataring(struct vmxnet3_adapter *adapter)
rq->data_ring.base,
rq->data_ring.basePA);
rq->data_ring.base = NULL;
- rq->data_ring.desc_size = 0;
}
+ rq->data_ring.desc_size = 0;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 028/267] ipv6: ioam: block BH from ioam6_output()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (26 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 027/267] vmxnet3: disable rx data ring on dma allocation failure Greg Kroah-Hartman
@ 2024-06-19 12:52 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 029/267] ipv6: sr: block BH in seg6_output_core() and seg6_input_core() Greg Kroah-Hartman
` (250 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:52 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Justin Iurman,
Paolo Abeni, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit 2fe40483ec257de2a0d819ef88e3e76c7e261319 ]
As explained in commit 1378817486d6 ("tipc: block BH
before using dst_cache"), net/core/dst_cache.c
helpers need to be called with BH disabled.
Disabling preemption in ioam6_output() is not good enough,
because ioam6_output() is called from process context,
lwtunnel_output() only uses rcu_read_lock().
We might be interrupted by a softirq, re-enter ioam6_output()
and corrupt dst_cache data structures.
Fix the race by using local_bh_disable() instead of
preempt_disable().
Fixes: 8cb3bf8bff3c ("ipv6: ioam: Add support for the ip6ip6 encapsulation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Justin Iurman <justin.iurman@uliege.be>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240531132636.2637995-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/ioam6_iptunnel.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/ioam6_iptunnel.c b/net/ipv6/ioam6_iptunnel.c
index f6f5b83dd954d..a5cfc5b0b206b 100644
--- a/net/ipv6/ioam6_iptunnel.c
+++ b/net/ipv6/ioam6_iptunnel.c
@@ -351,9 +351,9 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
goto drop;
if (!ipv6_addr_equal(&orig_daddr, &ipv6_hdr(skb)->daddr)) {
- preempt_disable();
+ local_bh_disable();
dst = dst_cache_get(&ilwt->cache);
- preempt_enable();
+ local_bh_enable();
if (unlikely(!dst)) {
struct ipv6hdr *hdr = ipv6_hdr(skb);
@@ -373,9 +373,9 @@ static int ioam6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
goto drop;
}
- preempt_disable();
+ local_bh_disable();
dst_cache_set_ip6(&ilwt->cache, dst, &fl6.saddr);
- preempt_enable();
+ local_bh_enable();
}
skb_dst_drop(skb);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 029/267] ipv6: sr: block BH in seg6_output_core() and seg6_input_core()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (27 preceding siblings ...)
2024-06-19 12:52 ` [PATCH 6.6 028/267] ipv6: ioam: block BH from ioam6_output() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 030/267] net: tls: fix marking packets as decrypted Greg Kroah-Hartman
` (249 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Eric Dumazet, David Lebrun,
Paolo Abeni, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit c0b98ac1cc104f48763cdb27b1e9ac25fd81fc90 ]
As explained in commit 1378817486d6 ("tipc: block BH
before using dst_cache"), net/core/dst_cache.c
helpers need to be called with BH disabled.
Disabling preemption in seg6_output_core() is not good enough,
because seg6_output_core() is called from process context,
lwtunnel_output() only uses rcu_read_lock().
We might be interrupted by a softirq, re-enter seg6_output_core()
and corrupt dst_cache data structures.
Fix the race by using local_bh_disable() instead of
preempt_disable().
Apply a similar change in seg6_input_core().
Fixes: fa79581ea66c ("ipv6: sr: fix several BUGs when preemption is enabled")
Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Lebrun <dlebrun@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240531132636.2637995-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/seg6_iptunnel.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index a75df2ec8db0d..098632adc9b5a 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -464,23 +464,21 @@ static int seg6_input_core(struct net *net, struct sock *sk,
slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate);
- preempt_disable();
+ local_bh_disable();
dst = dst_cache_get(&slwt->cache);
- preempt_enable();
if (!dst) {
ip6_route_input(skb);
dst = skb_dst(skb);
if (!dst->error) {
- preempt_disable();
dst_cache_set_ip6(&slwt->cache, dst,
&ipv6_hdr(skb)->saddr);
- preempt_enable();
}
} else {
skb_dst_drop(skb);
skb_dst_set(skb, dst);
}
+ local_bh_enable();
err = skb_cow_head(skb, LL_RESERVED_SPACE(dst->dev));
if (unlikely(err))
@@ -536,9 +534,9 @@ static int seg6_output_core(struct net *net, struct sock *sk,
slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate);
- preempt_disable();
+ local_bh_disable();
dst = dst_cache_get(&slwt->cache);
- preempt_enable();
+ local_bh_enable();
if (unlikely(!dst)) {
struct ipv6hdr *hdr = ipv6_hdr(skb);
@@ -558,9 +556,9 @@ static int seg6_output_core(struct net *net, struct sock *sk,
goto drop;
}
- preempt_disable();
+ local_bh_disable();
dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
- preempt_enable();
+ local_bh_enable();
}
skb_dst_drop(skb);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 030/267] net: tls: fix marking packets as decrypted
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (28 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 029/267] ipv6: sr: block BH in seg6_output_core() and seg6_input_core() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 031/267] bpf: Set run context for rawtp test_run callback Greg Kroah-Hartman
` (248 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jakub Kicinski, Eric Dumazet,
Paolo Abeni, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski <kuba@kernel.org>
[ Upstream commit a535d59432370343058755100ee75ab03c0e3f91 ]
For TLS offload we mark packets with skb->decrypted to make sure
they don't escape the host without getting encrypted first.
The crypto state lives in the socket, so it may get detached
by a call to skb_orphan(). As a safety check - the egress path
drops all packets with skb->decrypted and no "crypto-safe" socket.
The skb marking was added to sendpage only (and not sendmsg),
because tls_device injected data into the TCP stack using sendpage.
This special case was missed when sendpage got folded into sendmsg.
Fixes: c5c37af6ecad ("tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240530232607.82686-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv4/tcp.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7bf774bdb9386..a9b33135513d8 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1158,6 +1158,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
process_backlog++;
+#ifdef CONFIG_SKB_DECRYPTED
+ skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
+#endif
tcp_skb_entail(sk, skb);
copy = size_goal;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 031/267] bpf: Set run context for rawtp test_run callback
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (29 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 030/267] net: tls: fix marking packets as decrypted Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 032/267] octeontx2-af: Always allocate PF entries from low prioriy zone Greg Kroah-Hartman
` (247 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+3ab78ff125b7979e45f9,
Jiri Olsa, Andrii Nakryiko, Daniel Borkmann, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Olsa <jolsa@kernel.org>
[ Upstream commit d0d1df8ba18abc57f28fb3bc053b2bf319367f2c ]
syzbot reported crash when rawtp program executed through the
test_run interface calls bpf_get_attach_cookie helper or any
other helper that touches task->bpf_ctx pointer.
Setting the run context (task->bpf_ctx pointer) for test_run
callback.
Fixes: 7adfc6c9b315 ("bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value")
Reported-by: syzbot+3ab78ff125b7979e45f9@syzkaller.appspotmail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Closes: https://syzkaller.appspot.com/bug?extid=3ab78ff125b7979e45f9
Link: https://lore.kernel.org/bpf/20240604150024.359247-1-jolsa@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/bpf/test_run.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 478ee7aba85f3..12a2934b28ffb 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -707,10 +707,16 @@ static void
__bpf_prog_test_run_raw_tp(void *data)
{
struct bpf_raw_tp_test_run_info *info = data;
+ struct bpf_trace_run_ctx run_ctx = {};
+ struct bpf_run_ctx *old_run_ctx;
+
+ old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
rcu_read_lock();
info->retval = bpf_prog_run(info->prog, info->ctx);
rcu_read_unlock();
+
+ bpf_reset_run_ctx(old_run_ctx);
}
int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 032/267] octeontx2-af: Always allocate PF entries from low prioriy zone
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (30 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 031/267] bpf: Set run context for rawtp test_run callback Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 033/267] net/smc: avoid overwriting when adjusting sock bufsizes Greg Kroah-Hartman
` (246 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Subbaraya Sundeep, Jacob Keller,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Subbaraya Sundeep <sbhatta@marvell.com>
[ Upstream commit 8b0f7410942cdc420c4557eda02bfcdf60ccec17 ]
PF mcam entries has to be at low priority always so that VF
can install longest prefix match rules at higher priority.
This was taken care currently but when priority allocation
wrt reference entry is requested then entries are allocated
from mid-zone instead of low priority zone. Fix this and
always allocate entries from low priority zone for PFs.
Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../ethernet/marvell/octeontx2/af/rvu_npc.c | 33 ++++++++++++-------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
index 91a4ea529d077..00ef6d201b973 100644
--- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
+++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c
@@ -2506,7 +2506,17 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc,
* - when available free entries are less.
* Lower priority ones out of avaialble free entries are always
* chosen when 'high vs low' question arises.
+ *
+ * For a VF base MCAM match rule is set by its PF. And all the
+ * further MCAM rules installed by VF on its own are
+ * concatenated with the base rule set by its PF. Hence PF entries
+ * should be at lower priority compared to VF entries. Otherwise
+ * base rule is hit always and rules installed by VF will be of
+ * no use. Hence if the request is from PF then allocate low
+ * priority entries.
*/
+ if (!(pcifunc & RVU_PFVF_FUNC_MASK))
+ goto lprio_alloc;
/* Get the search range for priority allocation request */
if (req->priority) {
@@ -2515,17 +2525,6 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc,
goto alloc;
}
- /* For a VF base MCAM match rule is set by its PF. And all the
- * further MCAM rules installed by VF on its own are
- * concatenated with the base rule set by its PF. Hence PF entries
- * should be at lower priority compared to VF entries. Otherwise
- * base rule is hit always and rules installed by VF will be of
- * no use. Hence if the request is from PF and NOT a priority
- * allocation request then allocate low priority entries.
- */
- if (!(pcifunc & RVU_PFVF_FUNC_MASK))
- goto lprio_alloc;
-
/* Find out the search range for non-priority allocation request
*
* Get MCAM free entry count in middle zone.
@@ -2555,6 +2554,18 @@ static int npc_mcam_alloc_entries(struct npc_mcam *mcam, u16 pcifunc,
reverse = true;
start = 0;
end = mcam->bmap_entries;
+ /* Ensure PF requests are always at bottom and if PF requests
+ * for higher/lower priority entry wrt reference entry then
+ * honour that criteria and start search for entries from bottom
+ * and not in mid zone.
+ */
+ if (!(pcifunc & RVU_PFVF_FUNC_MASK) &&
+ req->priority == NPC_MCAM_HIGHER_PRIO)
+ end = req->ref_entry;
+
+ if (!(pcifunc & RVU_PFVF_FUNC_MASK) &&
+ req->priority == NPC_MCAM_LOWER_PRIO)
+ start = req->ref_entry;
}
alloc:
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 033/267] net/smc: avoid overwriting when adjusting sock bufsizes
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (31 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 032/267] octeontx2-af: Always allocate PF entries from low prioriy zone Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 034/267] net: phy: Micrel KSZ8061: fix errata solution not taking effect problem Greg Kroah-Hartman
` (245 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Wen Gu, Wenjia Zhang,
David S. Miller, Sasha Levin, Gerd Bayer
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wen Gu <guwen@linux.alibaba.com>
[ Upstream commit fb0aa0781a5f457e3864da68af52c3b1f4f7fd8f ]
When copying smc settings to clcsock, avoid setting clcsock's sk_sndbuf
to sysctl_tcp_wmem[1], since this may overwrite the value set by
tcp_sndbuf_expand() in TCP connection establishment.
And the other setting sk_{snd|rcv}buf to sysctl value in
smc_adjust_sock_bufsizes() can also be omitted since the initialization
of smc sock and clcsock has set sk_{snd|rcv}buf to smc.sysctl_{w|r}mem
or ipv4_sysctl_tcp_{w|r}mem[1].
Fixes: 30c3c4a4497c ("net/smc: Use correct buffer sizes when switching between TCP and SMC")
Link: https://lore.kernel.org/r/5eaf3858-e7fd-4db8-83e8-3d7a3e0e9ae2@linux.alibaba.com
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>, too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/smc/af_smc.c | 22 ++--------------------
1 file changed, 2 insertions(+), 20 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index ef5b5d498ef3e..3158b94fd347a 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -460,29 +460,11 @@ static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
static void smc_adjust_sock_bufsizes(struct sock *nsk, struct sock *osk,
unsigned long mask)
{
- struct net *nnet = sock_net(nsk);
-
nsk->sk_userlocks = osk->sk_userlocks;
- if (osk->sk_userlocks & SOCK_SNDBUF_LOCK) {
+ if (osk->sk_userlocks & SOCK_SNDBUF_LOCK)
nsk->sk_sndbuf = osk->sk_sndbuf;
- } else {
- if (mask == SK_FLAGS_SMC_TO_CLC)
- WRITE_ONCE(nsk->sk_sndbuf,
- READ_ONCE(nnet->ipv4.sysctl_tcp_wmem[1]));
- else
- WRITE_ONCE(nsk->sk_sndbuf,
- 2 * READ_ONCE(nnet->smc.sysctl_wmem));
- }
- if (osk->sk_userlocks & SOCK_RCVBUF_LOCK) {
+ if (osk->sk_userlocks & SOCK_RCVBUF_LOCK)
nsk->sk_rcvbuf = osk->sk_rcvbuf;
- } else {
- if (mask == SK_FLAGS_SMC_TO_CLC)
- WRITE_ONCE(nsk->sk_rcvbuf,
- READ_ONCE(nnet->ipv4.sysctl_tcp_rmem[1]));
- else
- WRITE_ONCE(nsk->sk_rcvbuf,
- 2 * READ_ONCE(nnet->smc.sysctl_rmem));
- }
}
static void smc_copy_sock_settings(struct sock *nsk, struct sock *osk,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 034/267] net: phy: Micrel KSZ8061: fix errata solution not taking effect problem
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (32 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 033/267] net/smc: avoid overwriting when adjusting sock bufsizes Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 035/267] net: sched: sch_multiq: fix possible OOB write in multiq_tune() Greg Kroah-Hartman
` (244 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Tristram Ha, David S. Miller,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tristram Ha <tristram.ha@microchip.com>
[ Upstream commit 0a8d3f2e3e8d8aea8af017e14227b91d5989b696 ]
KSZ8061 needs to write to a MMD register at driver initialization to fix
an errata. This worked in 5.0 kernel but not in newer kernels. The
issue is the main phylib code no longer resets PHY at the very beginning.
Calling phy resuming code later will reset the chip if it is already
powered down at the beginning. This wipes out the MMD register write.
Solution is to implement a phy resume function for KSZ8061 to take care
of this problem.
Fixes: 232ba3a51cc2 ("net: phy: Micrel KSZ8061: link failure after cable connect")
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/phy/micrel.c | 42 +++++++++++++++++++++++++++++++++++++++-
1 file changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 048704758b150..366ae22534373 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -770,6 +770,17 @@ static int ksz8061_config_init(struct phy_device *phydev)
{
int ret;
+ /* Chip can be powered down by the bootstrap code. */
+ ret = phy_read(phydev, MII_BMCR);
+ if (ret < 0)
+ return ret;
+ if (ret & BMCR_PDOWN) {
+ ret = phy_write(phydev, MII_BMCR, ret & ~BMCR_PDOWN);
+ if (ret < 0)
+ return ret;
+ usleep_range(1000, 2000);
+ }
+
ret = phy_write_mmd(phydev, MDIO_MMD_PMAPMD, MDIO_DEVID1, 0xB61A);
if (ret)
return ret;
@@ -2017,6 +2028,35 @@ static int ksz9477_resume(struct phy_device *phydev)
return 0;
}
+static int ksz8061_resume(struct phy_device *phydev)
+{
+ int ret;
+
+ /* This function can be called twice when the Ethernet device is on. */
+ ret = phy_read(phydev, MII_BMCR);
+ if (ret < 0)
+ return ret;
+ if (!(ret & BMCR_PDOWN))
+ return 0;
+
+ genphy_resume(phydev);
+ usleep_range(1000, 2000);
+
+ /* Re-program the value after chip is reset. */
+ ret = phy_write_mmd(phydev, MDIO_MMD_PMAPMD, MDIO_DEVID1, 0xB61A);
+ if (ret)
+ return ret;
+
+ /* Enable PHY Interrupts */
+ if (phy_interrupt_is_valid(phydev)) {
+ phydev->interrupts = PHY_INTERRUPT_ENABLED;
+ if (phydev->drv->config_intr)
+ phydev->drv->config_intr(phydev);
+ }
+
+ return 0;
+}
+
static int kszphy_probe(struct phy_device *phydev)
{
const struct kszphy_type *type = phydev->drv->driver_data;
@@ -4812,7 +4852,7 @@ static struct phy_driver ksphy_driver[] = {
.config_intr = kszphy_config_intr,
.handle_interrupt = kszphy_handle_interrupt,
.suspend = kszphy_suspend,
- .resume = kszphy_resume,
+ .resume = ksz8061_resume,
}, {
.phy_id = PHY_ID_KSZ9021,
.phy_id_mask = 0x000ffffe,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 035/267] net: sched: sch_multiq: fix possible OOB write in multiq_tune()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (33 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 034/267] net: phy: Micrel KSZ8061: fix errata solution not taking effect problem Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 036/267] vxlan: Fix regression when dropping packets due to invalid src addresses Greg Kroah-Hartman
` (243 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hangyu Hua, Cong Wang,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangyu Hua <hbh25y@gmail.com>
[ Upstream commit affc18fdc694190ca7575b9a86632a73b9fe043d ]
q->bands will be assigned to qopt->bands to execute subsequent code logic
after kmalloc. So the old q->bands should not be used in kmalloc.
Otherwise, an out-of-bounds write will occur.
Fixes: c2999f7fb05b ("net: sched: multiq: don't call qdisc_put() while holding tree lock")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Acked-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/sched/sch_multiq.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index 75c9c860182b4..0d6649d937c9f 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -185,7 +185,7 @@ static int multiq_tune(struct Qdisc *sch, struct nlattr *opt,
qopt->bands = qdisc_dev(sch)->real_num_tx_queues;
- removed = kmalloc(sizeof(*removed) * (q->max_bands - q->bands),
+ removed = kmalloc(sizeof(*removed) * (q->max_bands - qopt->bands),
GFP_KERNEL);
if (!removed)
return -ENOMEM;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 036/267] vxlan: Fix regression when dropping packets due to invalid src addresses
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (34 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 035/267] net: sched: sch_multiq: fix possible OOB write in multiq_tune() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 037/267] tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB Greg Kroah-Hartman
` (242 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Daniel Borkmann, David Bauer,
Ido Schimmel, Nikolay Aleksandrov, Martin KaFai Lau,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <daniel@iogearbox.net>
[ Upstream commit 1cd4bc987abb2823836cbb8f887026011ccddc8a ]
Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address")
has recently been added to vxlan mainly in the context of source
address snooping/learning so that when it is enabled, an entry in the
FDB is not being created for an invalid address for the corresponding
tunnel endpoint.
Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in
that it passed through whichever macs were set in the L2 header. It
turns out that this change in behavior breaks setups, for example,
Cilium with netkit in L3 mode for Pods as well as tunnel mode has been
passing before the change in f58f45c1e5b9 for both vxlan and geneve.
After mentioned change it is only passing for geneve as in case of
vxlan packets are dropped due to vxlan_set_mac() returning false as
source and destination macs are zero which for E/W traffic via tunnel
is totally fine.
Fix it by only opting into the is_valid_ether_addr() check in
vxlan_set_mac() when in fact source address snooping/learning is
actually enabled in vxlan. This is done by moving the check into
vxlan_snoop(). With this change, the Cilium connectivity test suite
passes again for both tunnel flavors.
Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Bauer <mail@david-bauer.net>
Cc: Ido Schimmel <idosch@nvidia.com>
Cc: Nikolay Aleksandrov <razor@blackwall.org>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Bauer <mail@david-bauer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/vxlan/vxlan_core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index c24ff08abe0da..8268fa331826e 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1446,6 +1446,10 @@ static bool vxlan_snoop(struct net_device *dev,
struct vxlan_fdb *f;
u32 ifindex = 0;
+ /* Ignore packets from invalid src-address */
+ if (!is_valid_ether_addr(src_mac))
+ return true;
+
#if IS_ENABLED(CONFIG_IPV6)
if (src_ip->sa.sa_family == AF_INET6 &&
(ipv6_addr_type(&src_ip->sin6.sin6_addr) & IPV6_ADDR_LINKLOCAL))
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 037/267] tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (35 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 036/267] vxlan: Fix regression when dropping packets due to invalid src addresses Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 038/267] mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTAB Greg Kroah-Hartman
` (241 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jason Xing, Eric Dumazet,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Xing <kernelxing@tencent.com>
[ Upstream commit a46d0ea5c94205f40ecf912d1bb7806a8a64704f ]
According to RFC 1213, we should also take CLOSE-WAIT sockets into
consideration:
"tcpCurrEstab OBJECT-TYPE
...
The number of TCP connections for which the current state
is either ESTABLISHED or CLOSE- WAIT."
After this, CurrEstab counter will display the total number of
ESTABLISHED and CLOSE-WAIT sockets.
The logic of counting
When we increment the counter?
a) if we change the state to ESTABLISHED.
b) if we change the state from SYN-RECEIVED to CLOSE-WAIT.
When we decrement the counter?
a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT,
say, on the client side, changing from ESTABLISHED to FIN-WAIT-1.
b) if the socket leaves CLOSE-WAIT, say, on the server side, changing
from CLOSE-WAIT to LAST-ACK.
Please note: there are two chances that old state of socket can be changed
to CLOSE-WAIT in tcp_fin(). One is SYN-RECV, the other is ESTABLISHED.
So we have to take care of the former case.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv4/tcp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index a9b33135513d8..2df05ea2e00fe 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2640,6 +2640,10 @@ void tcp_set_state(struct sock *sk, int state)
if (oldstate != TCP_ESTABLISHED)
TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
break;
+ case TCP_CLOSE_WAIT:
+ if (oldstate == TCP_SYN_RECV)
+ TCP_INC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
+ break;
case TCP_CLOSE:
if (oldstate == TCP_CLOSE_WAIT || oldstate == TCP_ESTABLISHED)
@@ -2651,7 +2655,7 @@ void tcp_set_state(struct sock *sk, int state)
inet_put_port(sk);
fallthrough;
default:
- if (oldstate == TCP_ESTABLISHED)
+ if (oldstate == TCP_ESTABLISHED || oldstate == TCP_CLOSE_WAIT)
TCP_DEC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 038/267] mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTAB
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (36 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 037/267] tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 039/267] net/mlx5: Stop waiting for PCI if pci channel is offline Greg Kroah-Hartman
` (240 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jason Xing, Matthieu Baerts (NGI0),
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Xing <kernelxing@tencent.com>
[ Upstream commit 9633e9377e6af0244f7381e86b9aac5276f5be97 ]
Like previous patch does in TCP, we need to adhere to RFC 1213:
"tcpCurrEstab OBJECT-TYPE
...
The number of TCP connections for which the current state
is either ESTABLISHED or CLOSE- WAIT."
So let's consider CLOSE-WAIT sockets.
The logic of counting
When we increment the counter?
a) Only if we change the state to ESTABLISHED.
When we decrement the counter?
a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT,
say, on the client side, changing from ESTABLISHED to FIN-WAIT-1.
b) if the socket leaves CLOSE-WAIT, say, on the server side, changing
from CLOSE-WAIT to LAST-ACK.
Fixes: d9cd27b8cd19 ("mptcp: add CurrEstab MIB counter support")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/mptcp/protocol.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 618d80112d1e2..4ace52e4211ad 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2873,9 +2873,14 @@ void mptcp_set_state(struct sock *sk, int state)
if (oldstate != TCP_ESTABLISHED)
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_CURRESTAB);
break;
-
+ case TCP_CLOSE_WAIT:
+ /* Unlike TCP, MPTCP sk would not have the TCP_SYN_RECV state:
+ * MPTCP "accepted" sockets will be created later on. So no
+ * transition from TCP_SYN_RECV to TCP_CLOSE_WAIT.
+ */
+ break;
default:
- if (oldstate == TCP_ESTABLISHED)
+ if (oldstate == TCP_ESTABLISHED || oldstate == TCP_CLOSE_WAIT)
MPTCP_DEC_STATS(sock_net(sk), MPTCP_MIB_CURRESTAB);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 039/267] net/mlx5: Stop waiting for PCI if pci channel is offline
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (37 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 038/267] mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTAB Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 040/267] net/mlx5: Always stop health timer during driver removal Greg Kroah-Hartman
` (239 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Moshe Shemesh, Shay Drori,
Tariq Toukan, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Moshe Shemesh <moshe@nvidia.com>
[ Upstream commit 33afbfcc105a572159750f2ebee834a8a70fdd96 ]
In case pci channel becomes offline the driver should not wait for PCI
reads during health dump and recovery flow. The driver has timeout for
each of these loops trying to read PCI, so it would fail anyway.
However, in case of recovery waiting till timeout may cause the pci
error_detected() callback fail to meet pci_dpc_recovered() wait timeout.
Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drori <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 4 ++++
drivers/net/ethernet/mellanox/mlx5/core/health.c | 8 ++++++++
drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c | 4 ++++
3 files changed, 16 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index 58f4c0d0fafa2..70898f0a9866c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -373,6 +373,10 @@ int mlx5_cmd_fast_teardown_hca(struct mlx5_core_dev *dev)
do {
if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
break;
+ if (pci_channel_offline(dev->pdev)) {
+ mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n");
+ return -EACCES;
+ }
cond_resched();
} while (!time_after(jiffies, end));
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 2fb2598b775ef..d798834c4e755 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -248,6 +248,10 @@ void mlx5_error_sw_reset(struct mlx5_core_dev *dev)
do {
if (mlx5_get_nic_state(dev) == MLX5_NIC_IFC_DISABLED)
break;
+ if (pci_channel_offline(dev->pdev)) {
+ mlx5_core_err(dev, "PCI channel offline, stop waiting for NIC IFC\n");
+ goto unlock;
+ }
msleep(20);
} while (!time_after(jiffies, end));
@@ -317,6 +321,10 @@ int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev)
mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n");
return -ENODEV;
}
+ if (pci_channel_offline(dev->pdev)) {
+ mlx5_core_err(dev, "PCI channel offline, stop waiting for PCI\n");
+ return -EACCES;
+ }
msleep(100);
}
return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
index 6b774e0c27665..d0b595ba61101 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.c
@@ -74,6 +74,10 @@ int mlx5_vsc_gw_lock(struct mlx5_core_dev *dev)
ret = -EBUSY;
goto pci_unlock;
}
+ if (pci_channel_offline(dev->pdev)) {
+ ret = -EACCES;
+ goto pci_unlock;
+ }
/* Check if semaphore is already locked */
ret = vsc_read(dev, VSC_SEMAPHORE_OFFSET, &lock_val);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 040/267] net/mlx5: Always stop health timer during driver removal
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (38 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 039/267] net/mlx5: Stop waiting for PCI if pci channel is offline Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 041/267] net/mlx5: Fix tainted pointer delete is case of flow rules creation fail Greg Kroah-Hartman
` (238 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Shay Drory, Moshe Shemesh,
Tariq Toukan, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shay Drory <shayd@nvidia.com>
[ Upstream commit c8b3f38d2dae0397944814d691a419c451f9906f ]
Currently, if teardown_hca fails to execute during driver removal, mlx5
does not stop the health timer. Afterwards, mlx5 continue with driver
teardown. This may lead to a UAF bug, which results in page fault
Oops[1], since the health timer invokes after resources were freed.
Hence, stop the health monitor even if teardown_hca fails.
[1]
mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: cleanup
mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup
BUG: unable to handle page fault for address: ffffa26487064230
PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020
RIP: 0010:ioread32be+0x34/0x60
RSP: 0018:ffffa26480003e58 EFLAGS: 00010292
RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0
RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230
RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8
R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0
R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0
FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
? __die+0x23/0x70
? page_fault_oops+0x171/0x4e0
? exc_page_fault+0x175/0x180
? asm_exc_page_fault+0x26/0x30
? __pfx_poll_health+0x10/0x10 [mlx5_core]
? __pfx_poll_health+0x10/0x10 [mlx5_core]
? ioread32be+0x34/0x60
mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core]
? __pfx_poll_health+0x10/0x10 [mlx5_core]
poll_health+0x42/0x230 [mlx5_core]
? __next_timer_interrupt+0xbc/0x110
? __pfx_poll_health+0x10/0x10 [mlx5_core]
call_timer_fn+0x21/0x130
? __pfx_poll_health+0x10/0x10 [mlx5_core]
__run_timers+0x222/0x2c0
run_timer_softirq+0x1d/0x40
__do_softirq+0xc9/0x2c8
__irq_exit_rcu+0xa6/0xc0
sysvec_apic_timer_interrupt+0x72/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xcc/0x440
? cpuidle_enter_state+0xbd/0x440
cpuidle_enter+0x2d/0x40
do_idle+0x20d/0x270
cpu_startup_entry+0x2a/0x30
rest_init+0xd0/0xd0
arch_call_rest_init+0xe/0x30
start_kernel+0x709/0xa90
x86_64_start_reservations+0x18/0x30
x86_64_start_kernel+0x96/0xa0
secondary_startup_64_no_verify+0x18f/0x19b
---[ end trace 0000000000000000 ]---
Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 9710ddac1f1a8..2237b3d01e0e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1287,6 +1287,9 @@ static int mlx5_function_teardown(struct mlx5_core_dev *dev, bool boot)
if (!err)
mlx5_function_disable(dev, boot);
+ else
+ mlx5_stop_health_poll(dev, boot);
+
return err;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 041/267] net/mlx5: Fix tainted pointer delete is case of flow rules creation fail
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (39 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 040/267] net/mlx5: Always stop health timer during driver removal Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 042/267] net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP Greg Kroah-Hartman
` (237 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Aleksandr Mishin, Jacob Keller,
Tariq Toukan, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin <amishin@t-argos.ru>
[ Upstream commit 229bedbf62b13af5aba6525ad10b62ad38d9ccb5 ]
In case of flow rule creation fail in mlx5_lag_create_port_sel_table(),
instead of previously created rules, the tainted pointer is deleted
deveral times.
Fix this bug by using correct flow rules pointers.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 352899f384d4 ("net/mlx5: Lag, use buckets in hash mode")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240604100552.25201-1-amishin@t-argos.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
index 7d9bbb494d95b..005661248c7e9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/port_sel.c
@@ -88,9 +88,13 @@ static int mlx5_lag_create_port_sel_table(struct mlx5_lag *ldev,
&dest, 1);
if (IS_ERR(lag_definer->rules[idx])) {
err = PTR_ERR(lag_definer->rules[idx]);
- while (i--)
- while (j--)
+ do {
+ while (j--) {
+ idx = i * ldev->buckets + j;
mlx5_del_flow_rules(lag_definer->rules[idx]);
+ }
+ j = ldev->buckets;
+ } while (i--);
goto destroy_fg;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 042/267] net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (40 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 041/267] net/mlx5: Fix tainted pointer delete is case of flow rules creation fail Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 043/267] ptp: Fix error message on failed pin verification Greg Kroah-Hartman
` (236 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Noam Rathaus, Eric Dumazet,
Vinicius Costa Gomes, Vladimir Oltean, Jakub Kicinski,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit f921a58ae20852d188f70842431ce6519c4fdc36 ]
If one TCA_TAPRIO_ATTR_PRIOMAP attribute has been provided,
taprio_parse_mqprio_opt() must validate it, or userspace
can inject arbitrary data to the kernel, the second time
taprio_change() is called.
First call (with valid attributes) sets dev->num_tc
to a non zero value.
Second call (with arbitrary mqprio attributes)
returns early from taprio_parse_mqprio_opt()
and bad things can happen.
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20240604181511.769870-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/sched/sch_taprio.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index a315748a5e531..418d4a846d04a 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1186,16 +1186,13 @@ static int taprio_parse_mqprio_opt(struct net_device *dev,
{
bool allow_overlapping_txqs = TXTIME_ASSIST_IS_ENABLED(taprio_flags);
- if (!qopt && !dev->num_tc) {
- NL_SET_ERR_MSG(extack, "'mqprio' configuration is necessary");
- return -EINVAL;
- }
-
- /* If num_tc is already set, it means that the user already
- * configured the mqprio part
- */
- if (dev->num_tc)
+ if (!qopt) {
+ if (!dev->num_tc) {
+ NL_SET_ERR_MSG(extack, "'mqprio' configuration is necessary");
+ return -EINVAL;
+ }
return 0;
+ }
/* taprio imposes that traffic classes map 1:n to tx queues */
if (qopt->num_tc > dev->num_tx_queues) {
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 043/267] ptp: Fix error message on failed pin verification
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (41 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 042/267] net/sched: taprio: always validate TCA_TAPRIO_ATTR_PRIOMAP Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 044/267] ice: fix iteration of TLVs in Preserved Fields Area Greg Kroah-Hartman
` (235 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Karol Kolacinski, Richard Cochran,
Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Karol Kolacinski <karol.kolacinski@intel.com>
[ Upstream commit 323a359f9b077f382f4483023d096a4d316fd135 ]
On failed verification of PTP clock pin, error message prints channel
number instead of pin index after "pin", which is incorrect.
Fix error message by adding channel number to the message and printing
pin number instead of channel number.
Fixes: 6092315dfdec ("ptp: introduce programmable pins.")
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20240604120555.16643-1-karol.kolacinski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/ptp/ptp_chardev.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index 5a3a4cc0bec82..91cc6ffa0095e 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -84,7 +84,8 @@ int ptp_set_pinfunc(struct ptp_clock *ptp, unsigned int pin,
}
if (info->verify(info, pin, func, chan)) {
- pr_err("driver cannot use function %u on pin %u\n", func, chan);
+ pr_err("driver cannot use function %u and channel %u on pin %u\n",
+ func, chan, pin);
return -EOPNOTSUPP;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 044/267] ice: fix iteration of TLVs in Preserved Fields Area
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (42 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 043/267] ptp: Fix error message on failed pin verification Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 045/267] ice: remove af_xdp_zc_qps bitmap Greg Kroah-Hartman
` (234 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Paul Greenwalt, Przemek Kitszel,
Pucha Himasekhar Reddy, Jacob Keller, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacob Keller <jacob.e.keller@intel.com>
[ Upstream commit 03e4a092be8ce3de7c1baa7ae14e68b64e3ea644 ]
The ice_get_pfa_module_tlv() function iterates over the Type-Length-Value
structures in the Preserved Fields Area (PFA) of the NVM. This is used by
the driver to access data such as the Part Board Assembly identifier.
The function uses simple logic to iterate over the PFA. First, the pointer
to the PFA in the NVM is read. Then the total length of the PFA is read
from the first word.
A pointer to the first TLV is initialized, and a simple loop iterates over
each TLV. The pointer is moved forward through the NVM until it exceeds the
PFA area.
The logic seems sound, but it is missing a key detail. The Preserved
Fields Area length includes one additional final word. This is documented
in the device data sheet as a dummy word which contains 0xFFFF. All NVMs
have this extra word.
If the driver tries to scan for a TLV that is not in the PFA, it will read
past the size of the PFA. It reads and interprets the last dummy word of
the PFA as a TLV with type 0xFFFF. It then reads the word following the PFA
as a length.
The PFA resides within the Shadow RAM portion of the NVM, which is
relatively small. All of its offsets are within a 16-bit size. The PFA
pointer and TLV pointer are stored by the driver as 16-bit values.
In almost all cases, the word following the PFA will be such that
interpreting it as a length will result in 16-bit arithmetic overflow. Once
overflowed, the new next_tlv value is now below the maximum offset of the
PFA. Thus, the driver will continue to iterate the data as TLVs. In the
worst case, the driver hits on a sequence of reads which loop back to
reading the same offsets in an endless loop.
To fix this, we need to correct the loop iteration check to account for
this extra word at the end of the PFA. This alone is sufficient to resolve
the known cases of this issue in the field. However, it is plausible that
an NVM could be misconfigured or have corrupt data which results in the
same kind of overflow. Protect against this by using check_add_overflow
when calculating both the maximum offset of the TLVs, and when calculating
the next_tlv offset at the end of each loop iteration. This ensures that
the driver will not get stuck in an infinite loop when scanning the PFA.
Fixes: e961b679fb0b ("ice: add board identifier info to devlink .info_get")
Co-developed-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-1-e3563aa89b0c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/intel/ice/ice_nvm.c | 28 ++++++++++++++++++------
1 file changed, 21 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c b/drivers/net/ethernet/intel/ice/ice_nvm.c
index f6f52a2480662..2fb43cded572c 100644
--- a/drivers/net/ethernet/intel/ice/ice_nvm.c
+++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
@@ -441,8 +441,7 @@ int
ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
u16 module_type)
{
- u16 pfa_len, pfa_ptr;
- u16 next_tlv;
+ u16 pfa_len, pfa_ptr, next_tlv, max_tlv;
int status;
status = ice_read_sr_word(hw, ICE_SR_PFA_PTR, &pfa_ptr);
@@ -455,11 +454,23 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
ice_debug(hw, ICE_DBG_INIT, "Failed to read PFA length.\n");
return status;
}
+
+ /* The Preserved Fields Area contains a sequence of Type-Length-Value
+ * structures which define its contents. The PFA length includes all
+ * of the TLVs, plus the initial length word itself, *and* one final
+ * word at the end after all of the TLVs.
+ */
+ if (check_add_overflow(pfa_ptr, pfa_len - 1, &max_tlv)) {
+ dev_warn(ice_hw_to_dev(hw), "PFA starts at offset %u. PFA length of %u caused 16-bit arithmetic overflow.\n",
+ pfa_ptr, pfa_len);
+ return -EINVAL;
+ }
+
/* Starting with first TLV after PFA length, iterate through the list
* of TLVs to find the requested one.
*/
next_tlv = pfa_ptr + 1;
- while (next_tlv < pfa_ptr + pfa_len) {
+ while (next_tlv < max_tlv) {
u16 tlv_sub_module_type;
u16 tlv_len;
@@ -483,10 +494,13 @@ ice_get_pfa_module_tlv(struct ice_hw *hw, u16 *module_tlv, u16 *module_tlv_len,
}
return -EINVAL;
}
- /* Check next TLV, i.e. current TLV pointer + length + 2 words
- * (for current TLV's type and length)
- */
- next_tlv = next_tlv + tlv_len + 2;
+
+ if (check_add_overflow(next_tlv, 2, &next_tlv) ||
+ check_add_overflow(next_tlv, tlv_len, &next_tlv)) {
+ dev_warn(ice_hw_to_dev(hw), "TLV of type %u and length 0x%04x caused 16-bit arithmetic overflow. The PFA starts at 0x%04x and has length of 0x%04x\n",
+ tlv_sub_module_type, tlv_len, pfa_ptr, pfa_len);
+ return -EINVAL;
+ }
}
/* Module does not exist */
return -ENOENT;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 045/267] ice: remove af_xdp_zc_qps bitmap
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (43 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 044/267] ice: fix iteration of TLVs in Preserved Fields Area Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 046/267] ice: add flag to distinguish reset from .ndo_bpf in XDP rings config Greg Kroah-Hartman
` (233 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Przemek Kitszel, Larysa Zaremba,
Simon Horman, Chandan Kumar Rout, Jacob Keller, Jakub Kicinski,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larysa Zaremba <larysa.zaremba@intel.com>
[ Upstream commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd ]
Referenced commit has introduced a bitmap to distinguish between ZC and
copy-mode AF_XDP queues, because xsk_get_pool_from_qid() does not do this
for us.
The bitmap would be especially useful when restoring previous state after
rebuild, if only it was not reallocated in the process. This leads to e.g.
xdpsock dying after changing number of queues.
Instead of preserving the bitmap during the rebuild, remove it completely
and distinguish between ZC and copy-mode queues based on the presence of
a device associated with the pool.
Fixes: e102db780e1c ("ice: track AF_XDP ZC enabled queues in bitmap")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-3-e3563aa89b0c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/intel/ice/ice.h | 32 ++++++++++++++++--------
drivers/net/ethernet/intel/ice/ice_lib.c | 8 ------
drivers/net/ethernet/intel/ice/ice_xsk.c | 13 +++++-----
3 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 5022b036ca4f9..cf00eaa3e9955 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -407,7 +407,6 @@ struct ice_vsi {
struct ice_tc_cfg tc_cfg;
struct bpf_prog *xdp_prog;
struct ice_tx_ring **xdp_rings; /* XDP ring array */
- unsigned long *af_xdp_zc_qps; /* tracks AF_XDP ZC enabled qps */
u16 num_xdp_txq; /* Used XDP queues */
u8 xdp_mapping_mode; /* ICE_MAP_MODE_[CONTIG|SCATTER] */
@@ -714,6 +713,25 @@ static inline void ice_set_ring_xdp(struct ice_tx_ring *ring)
ring->flags |= ICE_TX_FLAGS_RING_XDP;
}
+/**
+ * ice_get_xp_from_qid - get ZC XSK buffer pool bound to a queue ID
+ * @vsi: pointer to VSI
+ * @qid: index of a queue to look at XSK buff pool presence
+ *
+ * Return: A pointer to xsk_buff_pool structure if there is a buffer pool
+ * attached and configured as zero-copy, NULL otherwise.
+ */
+static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
+ u16 qid)
+{
+ struct xsk_buff_pool *pool = xsk_get_pool_from_qid(vsi->netdev, qid);
+
+ if (!ice_is_xdp_ena_vsi(vsi))
+ return NULL;
+
+ return (pool && pool->dev) ? pool : NULL;
+}
+
/**
* ice_xsk_pool - get XSK buffer pool bound to a ring
* @ring: Rx ring to use
@@ -726,10 +744,7 @@ static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
struct ice_vsi *vsi = ring->vsi;
u16 qid = ring->q_index;
- if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps))
- return NULL;
-
- return xsk_get_pool_from_qid(vsi->netdev, qid);
+ return ice_get_xp_from_qid(vsi, qid);
}
/**
@@ -754,12 +769,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
if (!ring)
return;
- if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps)) {
- ring->xsk_pool = NULL;
- return;
- }
-
- ring->xsk_pool = xsk_get_pool_from_qid(vsi->netdev, qid);
+ ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
}
/**
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 2004120a58acd..5a7ba0355d338 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -117,14 +117,8 @@ static int ice_vsi_alloc_arrays(struct ice_vsi *vsi)
if (!vsi->q_vectors)
goto err_vectors;
- vsi->af_xdp_zc_qps = bitmap_zalloc(max_t(int, vsi->alloc_txq, vsi->alloc_rxq), GFP_KERNEL);
- if (!vsi->af_xdp_zc_qps)
- goto err_zc_qps;
-
return 0;
-err_zc_qps:
- devm_kfree(dev, vsi->q_vectors);
err_vectors:
devm_kfree(dev, vsi->rxq_map);
err_rxq_map:
@@ -321,8 +315,6 @@ static void ice_vsi_free_arrays(struct ice_vsi *vsi)
dev = ice_pf_to_dev(pf);
- bitmap_free(vsi->af_xdp_zc_qps);
- vsi->af_xdp_zc_qps = NULL;
/* free the ring and vector containers */
devm_kfree(dev, vsi->q_vectors);
vsi->q_vectors = NULL;
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 7bd71660011e4..f53566cb6bfbd 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -289,7 +289,6 @@ static int ice_xsk_pool_disable(struct ice_vsi *vsi, u16 qid)
if (!pool)
return -EINVAL;
- clear_bit(qid, vsi->af_xdp_zc_qps);
xsk_pool_dma_unmap(pool, ICE_RX_DMA_ATTR);
return 0;
@@ -320,8 +319,6 @@ ice_xsk_pool_enable(struct ice_vsi *vsi, struct xsk_buff_pool *pool, u16 qid)
if (err)
return err;
- set_bit(qid, vsi->af_xdp_zc_qps);
-
return 0;
}
@@ -369,11 +366,13 @@ ice_realloc_rx_xdp_bufs(struct ice_rx_ring *rx_ring, bool pool_present)
int ice_realloc_zc_buf(struct ice_vsi *vsi, bool zc)
{
struct ice_rx_ring *rx_ring;
- unsigned long q;
+ uint i;
+
+ ice_for_each_rxq(vsi, i) {
+ rx_ring = vsi->rx_rings[i];
+ if (!rx_ring->xsk_pool)
+ continue;
- for_each_set_bit(q, vsi->af_xdp_zc_qps,
- max_t(int, vsi->alloc_txq, vsi->alloc_rxq)) {
- rx_ring = vsi->rx_rings[q];
if (ice_realloc_rx_xdp_bufs(rx_ring, zc))
return -ENOMEM;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 046/267] ice: add flag to distinguish reset from .ndo_bpf in XDP rings config
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (44 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 045/267] ice: remove af_xdp_zc_qps bitmap Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 047/267] net: wwan: iosm: Fix tainted pointer delete is case of region creation fail Greg Kroah-Hartman
` (232 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Igor Bagnucki, Larysa Zaremba,
Simon Horman, Chandan Kumar Rout, Jacob Keller, Jakub Kicinski,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larysa Zaremba <larysa.zaremba@intel.com>
[ Upstream commit 744d197162c2070a6045a71e2666ed93a57cc65d ]
Commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
has placed ice_vsi_free_q_vectors() after ice_destroy_xdp_rings() in
the rebuild process. The behaviour of the XDP rings config functions is
context-dependent, so the change of order has led to
ice_destroy_xdp_rings() doing additional work and removing XDP prog, when
it was supposed to be preserved.
Also, dependency on the PF state reset flags creates an additional,
fortunately less common problem:
* PFR is requested e.g. by tx_timeout handler
* .ndo_bpf() is asked to delete the program, calls ice_destroy_xdp_rings(),
but reset flag is set, so rings are destroyed without deleting the
program
* ice_vsi_rebuild tries to delete non-existent XDP rings, because the
program is still on the VSI
* system crashes
With a similar race, when requested to attach a program,
ice_prepare_xdp_rings() can actually skip setting the program in the VSI
and nevertheless report success.
Instead of reverting to the old order of function calls, add an enum
argument to both ice_prepare_xdp_rings() and ice_destroy_xdp_rings() in
order to distinguish between calls from rebuild and .ndo_bpf().
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240603-net-2024-05-30-intel-net-fixes-v2-4-e3563aa89b0c@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/intel/ice/ice.h | 11 +++++++++--
drivers/net/ethernet/intel/ice/ice_lib.c | 5 +++--
drivers/net/ethernet/intel/ice/ice_main.c | 22 ++++++++++++----------
3 files changed, 24 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index cf00eaa3e9955..c7962f322db2d 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -892,9 +892,16 @@ int ice_down(struct ice_vsi *vsi);
int ice_down_up(struct ice_vsi *vsi);
int ice_vsi_cfg_lan(struct ice_vsi *vsi);
struct ice_vsi *ice_lb_vsi_setup(struct ice_pf *pf, struct ice_port_info *pi);
+
+enum ice_xdp_cfg {
+ ICE_XDP_CFG_FULL, /* Fully apply new config in .ndo_bpf() */
+ ICE_XDP_CFG_PART, /* Save/use part of config in VSI rebuild */
+};
+
int ice_vsi_determine_xdp_res(struct ice_vsi *vsi);
-int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog);
-int ice_destroy_xdp_rings(struct ice_vsi *vsi);
+int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog,
+ enum ice_xdp_cfg cfg_type);
+int ice_destroy_xdp_rings(struct ice_vsi *vsi, enum ice_xdp_cfg cfg_type);
int
ice_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
u32 flags);
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
index 5a7ba0355d338..13ca3342a0cea 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -2462,7 +2462,8 @@ ice_vsi_cfg_def(struct ice_vsi *vsi, struct ice_vsi_cfg_params *params)
ret = ice_vsi_determine_xdp_res(vsi);
if (ret)
goto unroll_vector_base;
- ret = ice_prepare_xdp_rings(vsi, vsi->xdp_prog);
+ ret = ice_prepare_xdp_rings(vsi, vsi->xdp_prog,
+ ICE_XDP_CFG_PART);
if (ret)
goto unroll_vector_base;
}
@@ -2613,7 +2614,7 @@ void ice_vsi_decfg(struct ice_vsi *vsi)
/* return value check can be skipped here, it always returns
* 0 if reset is in progress
*/
- ice_destroy_xdp_rings(vsi);
+ ice_destroy_xdp_rings(vsi, ICE_XDP_CFG_PART);
ice_vsi_clear_rings(vsi);
ice_vsi_free_q_vectors(vsi);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 8ebb6517f6b96..5d71febdcd4dd 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2657,10 +2657,12 @@ static void ice_vsi_assign_bpf_prog(struct ice_vsi *vsi, struct bpf_prog *prog)
* ice_prepare_xdp_rings - Allocate, configure and setup Tx rings for XDP
* @vsi: VSI to bring up Tx rings used by XDP
* @prog: bpf program that will be assigned to VSI
+ * @cfg_type: create from scratch or restore the existing configuration
*
* Return 0 on success and negative value on error
*/
-int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog)
+int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog,
+ enum ice_xdp_cfg cfg_type)
{
u16 max_txqs[ICE_MAX_TRAFFIC_CLASS] = { 0 };
int xdp_rings_rem = vsi->num_xdp_txq;
@@ -2736,7 +2738,7 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog)
* taken into account at the end of ice_vsi_rebuild, where
* ice_cfg_vsi_lan is being called
*/
- if (ice_is_reset_in_progress(pf->state))
+ if (cfg_type == ICE_XDP_CFG_PART)
return 0;
/* tell the Tx scheduler that right now we have
@@ -2788,22 +2790,21 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog)
/**
* ice_destroy_xdp_rings - undo the configuration made by ice_prepare_xdp_rings
* @vsi: VSI to remove XDP rings
+ * @cfg_type: disable XDP permanently or allow it to be restored later
*
* Detach XDP rings from irq vectors, clean up the PF bitmap and free
* resources
*/
-int ice_destroy_xdp_rings(struct ice_vsi *vsi)
+int ice_destroy_xdp_rings(struct ice_vsi *vsi, enum ice_xdp_cfg cfg_type)
{
u16 max_txqs[ICE_MAX_TRAFFIC_CLASS] = { 0 };
struct ice_pf *pf = vsi->back;
int i, v_idx;
/* q_vectors are freed in reset path so there's no point in detaching
- * rings; in case of rebuild being triggered not from reset bits
- * in pf->state won't be set, so additionally check first q_vector
- * against NULL
+ * rings
*/
- if (ice_is_reset_in_progress(pf->state) || !vsi->q_vectors[0])
+ if (cfg_type == ICE_XDP_CFG_PART)
goto free_qmap;
ice_for_each_q_vector(vsi, v_idx) {
@@ -2844,7 +2845,7 @@ int ice_destroy_xdp_rings(struct ice_vsi *vsi)
if (static_key_enabled(&ice_xdp_locking_key))
static_branch_dec(&ice_xdp_locking_key);
- if (ice_is_reset_in_progress(pf->state) || !vsi->q_vectors[0])
+ if (cfg_type == ICE_XDP_CFG_PART)
return 0;
ice_vsi_assign_bpf_prog(vsi, NULL);
@@ -2955,7 +2956,8 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
if (xdp_ring_err) {
NL_SET_ERR_MSG_MOD(extack, "Not enough Tx resources for XDP");
} else {
- xdp_ring_err = ice_prepare_xdp_rings(vsi, prog);
+ xdp_ring_err = ice_prepare_xdp_rings(vsi, prog,
+ ICE_XDP_CFG_FULL);
if (xdp_ring_err)
NL_SET_ERR_MSG_MOD(extack, "Setting up XDP Tx resources failed");
}
@@ -2966,7 +2968,7 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
NL_SET_ERR_MSG_MOD(extack, "Setting up XDP Rx resources failed");
} else if (ice_is_xdp_ena_vsi(vsi) && !prog) {
xdp_features_clear_redirect_target(vsi->netdev);
- xdp_ring_err = ice_destroy_xdp_rings(vsi);
+ xdp_ring_err = ice_destroy_xdp_rings(vsi, ICE_XDP_CFG_FULL);
if (xdp_ring_err)
NL_SET_ERR_MSG_MOD(extack, "Freeing XDP Tx resources failed");
/* reallocate Rx queues that were used for zero-copy */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 047/267] net: wwan: iosm: Fix tainted pointer delete is case of region creation fail
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (45 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 046/267] ice: add flag to distinguish reset from .ndo_bpf in XDP rings config Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 048/267] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer Greg Kroah-Hartman
` (231 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Aleksandr Mishin, Sergey Ryazanov,
Simon Horman, Paolo Abeni, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin <amishin@t-argos.ru>
[ Upstream commit b0c9a26435413b81799047a7be53255640432547 ]
In case of region creation fail in ipc_devlink_create_region(), previously
created regions delete process starts from tainted pointer which actually
holds error code value.
Fix this bug by decreasing region index before delete.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 4dcd183fbd67 ("net: wwan: iosm: devlink registration")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240604082500.20769-1-amishin@t-argos.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/wwan/iosm/iosm_ipc_devlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_devlink.c b/drivers/net/wwan/iosm/iosm_ipc_devlink.c
index 2fe724d623c06..33c5a46f1b922 100644
--- a/drivers/net/wwan/iosm/iosm_ipc_devlink.c
+++ b/drivers/net/wwan/iosm/iosm_ipc_devlink.c
@@ -210,7 +210,7 @@ static int ipc_devlink_create_region(struct iosm_devlink *devlink)
rc = PTR_ERR(devlink->cd_regions[i]);
dev_err(devlink->dev, "Devlink region fail,err %d", rc);
/* Delete previously created regions */
- for ( ; i >= 0; i--)
+ for (i--; i >= 0; i--)
devlink_region_destroy(devlink->cd_regions[i]);
goto region_create_fail;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 048/267] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (46 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 047/267] net: wwan: iosm: Fix tainted pointer delete is case of region creation fail Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 049/267] af_unix: Annodate data-races around sk->sk_state for writers Greg Kroah-Hartman
` (230 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 26bfb8b57063f52b867f9b6c8d1742fcb5bd656c ]
When a SOCK_DGRAM socket connect()s to another socket, the both sockets'
sk->sk_state are changed to TCP_ESTABLISHED so that we can register them
to BPF SOCKMAP.
When the socket disconnects from the peer by connect(AF_UNSPEC), the state
is set back to TCP_CLOSE.
Then, the peer's state is also set to TCP_CLOSE, but the update is done
locklessly and unconditionally.
Let's say socket A connect()ed to B, B connect()ed to C, and A disconnects
from B.
After the first two connect()s, all three sockets' sk->sk_state are
TCP_ESTABLISHED:
$ ss -xa
Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
u_dgr ESTAB 0 0 @A 641 * 642
u_dgr ESTAB 0 0 @B 642 * 643
u_dgr ESTAB 0 0 @C 643 * 0
And after the disconnect, B's state is TCP_CLOSE even though it's still
connected to C and C's state is TCP_ESTABLISHED.
$ ss -xa
Netid State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess
u_dgr UNCONN 0 0 @A 641 * 0
u_dgr UNCONN 0 0 @B 642 * 643
u_dgr ESTAB 0 0 @C 643 * 0
In this case, we cannot register B to SOCKMAP.
So, when a socket disconnects from the peer, we should not set TCP_CLOSE to
the peer if the peer is connected to yet another socket, and this must be
done under unix_state_lock().
Note that we use WRITE_ONCE() for sk->sk_state as there are many lockless
readers. These data-races will be fixed in the following patches.
Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index d01314dc86ecb..348f9e34f6696 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -571,7 +571,6 @@ static void unix_dgram_disconnected(struct sock *sk, struct sock *other)
sk_error_report(other);
}
}
- other->sk_state = TCP_CLOSE;
}
static void unix_sock_destructor(struct sock *sk)
@@ -1434,8 +1433,15 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
unix_state_double_unlock(sk, other);
- if (other != old_peer)
+ if (other != old_peer) {
unix_dgram_disconnected(sk, old_peer);
+
+ unix_state_lock(old_peer);
+ if (!unix_peer(old_peer))
+ WRITE_ONCE(old_peer->sk_state, TCP_CLOSE);
+ unix_state_unlock(old_peer);
+ }
+
sock_put(old_peer);
} else {
unix_peer(sk) = other;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 049/267] af_unix: Annodate data-races around sk->sk_state for writers.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (47 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 048/267] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 050/267] af_unix: Annotate data-race of sk->sk_state in unix_inq_len() Greg Kroah-Hartman
` (229 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 942238f9735a4a4ebf8274b218d9a910158941d1 ]
sk->sk_state is changed under unix_state_lock(), but it's read locklessly
in many places.
This patch adds WRITE_ONCE() on the writer side.
We will add READ_ONCE() to the lockless readers in the following patches.
Fixes: 83301b5367a9 ("af_unix: Set TCP_ESTABLISHED for datagram sockets too")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 348f9e34f6696..bd2af62f58605 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -617,7 +617,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
u->path.dentry = NULL;
u->path.mnt = NULL;
state = sk->sk_state;
- sk->sk_state = TCP_CLOSE;
+ WRITE_ONCE(sk->sk_state, TCP_CLOSE);
skpair = unix_peer(sk);
unix_peer(sk) = NULL;
@@ -739,7 +739,8 @@ static int unix_listen(struct socket *sock, int backlog)
if (backlog > sk->sk_max_ack_backlog)
wake_up_interruptible_all(&u->peer_wait);
sk->sk_max_ack_backlog = backlog;
- sk->sk_state = TCP_LISTEN;
+ WRITE_ONCE(sk->sk_state, TCP_LISTEN);
+
/* set credentials so connect can copy them */
init_peercred(sk);
err = 0;
@@ -1411,7 +1412,8 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
if (err)
goto out_unlock;
- sk->sk_state = other->sk_state = TCP_ESTABLISHED;
+ WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED);
+ WRITE_ONCE(other->sk_state, TCP_ESTABLISHED);
} else {
/*
* 1003.1g breaking connected state with AF_UNSPEC
@@ -1428,7 +1430,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
unix_peer(sk) = other;
if (!other)
- sk->sk_state = TCP_CLOSE;
+ WRITE_ONCE(sk->sk_state, TCP_CLOSE);
unix_dgram_peer_wake_disconnect_wakeup(sk, old_peer);
unix_state_double_unlock(sk, other);
@@ -1644,7 +1646,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
copy_peercred(sk, other);
sock->state = SS_CONNECTED;
- sk->sk_state = TCP_ESTABLISHED;
+ WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED);
sock_hold(newsk);
smp_mb__after_atomic(); /* sock_hold() does an atomic_inc() */
@@ -2027,7 +2029,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
unix_peer(sk) = NULL;
unix_dgram_peer_wake_disconnect_wakeup(sk, other);
- sk->sk_state = TCP_CLOSE;
+ WRITE_ONCE(sk->sk_state, TCP_CLOSE);
unix_state_unlock(sk);
unix_dgram_disconnected(sk, other);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 050/267] af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (48 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 049/267] af_unix: Annodate data-races around sk->sk_state for writers Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 051/267] af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll() Greg Kroah-Hartman
` (228 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 3a0f38eb285c8c2eead4b3230c7ac2983707599d ]
ioctl(SIOCINQ) calls unix_inq_len() that checks sk->sk_state first
and returns -EINVAL if it's TCP_LISTEN.
Then, for SOCK_STREAM sockets, unix_inq_len() returns the number of
bytes in recvq.
However, unix_inq_len() does not hold unix_state_lock(), and the
concurrent listen() might change the state after checking sk->sk_state.
If the race occurs, 0 is returned for the listener, instead of -EINVAL,
because the length of skb with embryo is 0.
We could hold unix_state_lock() in unix_inq_len(), but it's overkill
given the result is true for pre-listen() TCP_CLOSE state.
So, let's use READ_ONCE() for sk->sk_state in unix_inq_len().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index bd2af62f58605..8d0918a112a9d 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2983,7 +2983,7 @@ long unix_inq_len(struct sock *sk)
struct sk_buff *skb;
long amount = 0;
- if (sk->sk_state == TCP_LISTEN)
+ if (READ_ONCE(sk->sk_state) == TCP_LISTEN)
return -EINVAL;
spin_lock(&sk->sk_receive_queue.lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 051/267] af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (49 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 050/267] af_unix: Annotate data-race of sk->sk_state in unix_inq_len() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 052/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_connect() Greg Kroah-Hartman
` (227 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit eb0718fb3e97ad0d6f4529b810103451c90adf94 ]
unix_poll() and unix_dgram_poll() read sk->sk_state locklessly and
calls unix_writable() which also reads sk->sk_state without holding
unix_state_lock().
Let's use READ_ONCE() in unix_poll() and unix_dgram_poll() and pass
it to unix_writable().
While at it, we remove TCP_SYN_SENT check in unix_dgram_poll() as
that state does not exist for AF_UNIX socket since the code was added.
Fixes: 1586a5877db9 ("af_unix: do not report POLLOUT on listeners")
Fixes: 3c73419c09a5 ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 8d0918a112a9d..4a43091c95419 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -531,9 +531,9 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
return 0;
}
-static int unix_writable(const struct sock *sk)
+static int unix_writable(const struct sock *sk, unsigned char state)
{
- return sk->sk_state != TCP_LISTEN &&
+ return state != TCP_LISTEN &&
(refcount_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
}
@@ -542,7 +542,7 @@ static void unix_write_space(struct sock *sk)
struct socket_wq *wq;
rcu_read_lock();
- if (unix_writable(sk)) {
+ if (unix_writable(sk, READ_ONCE(sk->sk_state))) {
wq = rcu_dereference(sk->sk_wq);
if (skwq_has_sleeper(wq))
wake_up_interruptible_sync_poll(&wq->wait,
@@ -3095,12 +3095,14 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon
static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wait)
{
struct sock *sk = sock->sk;
+ unsigned char state;
__poll_t mask;
u8 shutdown;
sock_poll_wait(file, sock, wait);
mask = 0;
shutdown = READ_ONCE(sk->sk_shutdown);
+ state = READ_ONCE(sk->sk_state);
/* exceptional events? */
if (READ_ONCE(sk->sk_err))
@@ -3122,14 +3124,14 @@ static __poll_t unix_poll(struct file *file, struct socket *sock, poll_table *wa
/* Connection-based need to check for termination and startup */
if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) &&
- sk->sk_state == TCP_CLOSE)
+ state == TCP_CLOSE)
mask |= EPOLLHUP;
/*
* we set writable also when the other side has shut down the
* connection. This prevents stuck sockets.
*/
- if (unix_writable(sk))
+ if (unix_writable(sk, state))
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;
return mask;
@@ -3140,12 +3142,14 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock,
{
struct sock *sk = sock->sk, *other;
unsigned int writable;
+ unsigned char state;
__poll_t mask;
u8 shutdown;
sock_poll_wait(file, sock, wait);
mask = 0;
shutdown = READ_ONCE(sk->sk_shutdown);
+ state = READ_ONCE(sk->sk_state);
/* exceptional events? */
if (READ_ONCE(sk->sk_err) ||
@@ -3165,19 +3169,14 @@ static __poll_t unix_dgram_poll(struct file *file, struct socket *sock,
mask |= EPOLLIN | EPOLLRDNORM;
/* Connection-based need to check for termination and startup */
- if (sk->sk_type == SOCK_SEQPACKET) {
- if (sk->sk_state == TCP_CLOSE)
- mask |= EPOLLHUP;
- /* connection hasn't started yet? */
- if (sk->sk_state == TCP_SYN_SENT)
- return mask;
- }
+ if (sk->sk_type == SOCK_SEQPACKET && state == TCP_CLOSE)
+ mask |= EPOLLHUP;
/* No write status requested, avoid expensive OUT tests. */
if (!(poll_requested_events(wait) & (EPOLLWRBAND|EPOLLWRNORM|EPOLLOUT)))
return mask;
- writable = unix_writable(sk);
+ writable = unix_writable(sk, state);
if (writable) {
unix_state_lock(sk);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 052/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (50 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 051/267] af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 053/267] af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg() Greg Kroah-Hartman
` (226 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit a9bf9c7dc6a5899c01cb8f6e773a66315a5cd4b7 ]
As small optimisation, unix_stream_connect() prefetches the client's
sk->sk_state without unix_state_lock() and checks if it's TCP_CLOSE.
Later, sk->sk_state is checked again under unix_state_lock().
Let's use READ_ONCE() for the first check and TCP_CLOSE directly for
the second check.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4a43091c95419..53d67d540a574 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1491,7 +1491,6 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
struct sk_buff *skb = NULL;
long timeo;
int err;
- int st;
err = unix_validate_addr(sunaddr, addr_len);
if (err)
@@ -1577,9 +1576,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
Well, and we have to recheck the state after socket locked.
*/
- st = sk->sk_state;
-
- switch (st) {
+ switch (READ_ONCE(sk->sk_state)) {
case TCP_CLOSE:
/* This is ok... continue with connect */
break;
@@ -1594,7 +1591,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
unix_state_lock_nested(sk, U_LOCK_SECOND);
- if (sk->sk_state != st) {
+ if (sk->sk_state != TCP_CLOSE) {
unix_state_unlock(sk);
unix_state_unlock(other);
sock_put(other);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 053/267] af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (51 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 052/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_connect() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 054/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb() Greg Kroah-Hartman
` (225 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 8a34d4e8d9742a24f74998f45a6a98edd923319b ]
The following functions read sk->sk_state locklessly and proceed only if
the state is TCP_ESTABLISHED.
* unix_stream_sendmsg
* unix_stream_read_generic
* unix_seqpacket_sendmsg
* unix_seqpacket_recvmsg
Let's use READ_ONCE() there.
Fixes: a05d2ad1c1f3 ("af_unix: Only allow recv on connected seqpacket sockets.")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 53d67d540a574..dfa013283f478 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2202,7 +2202,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
}
if (msg->msg_namelen) {
- err = sk->sk_state == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
+ err = READ_ONCE(sk->sk_state) == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
goto out_err;
} else {
err = -ENOTCONN;
@@ -2316,7 +2316,7 @@ static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
if (err)
return err;
- if (sk->sk_state != TCP_ESTABLISHED)
+ if (READ_ONCE(sk->sk_state) != TCP_ESTABLISHED)
return -ENOTCONN;
if (msg->msg_namelen)
@@ -2330,7 +2330,7 @@ static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
{
struct sock *sk = sock->sk;
- if (sk->sk_state != TCP_ESTABLISHED)
+ if (READ_ONCE(sk->sk_state) != TCP_ESTABLISHED)
return -ENOTCONN;
return unix_dgram_recvmsg(sock, msg, size, flags);
@@ -2654,7 +2654,7 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state,
size_t size = state->size;
unsigned int last_len;
- if (unlikely(sk->sk_state != TCP_ESTABLISHED)) {
+ if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED)) {
err = -EINVAL;
goto out;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 054/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (52 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 053/267] af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 055/267] af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG Greg Kroah-Hartman
` (224 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit af4c733b6b1aded4dc808fafece7dfe6e9d2ebb3 ]
unix_stream_read_skb() is called from sk->sk_data_ready() context
where unix_state_lock() is not held.
Let's use READ_ONCE() there.
Fixes: 77462de14a43 ("af_unix: Add read_sock for stream socket types")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index dfa013283f478..2299a464c602e 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2630,7 +2630,7 @@ static struct sk_buff *manage_oob(struct sk_buff *skb, struct sock *sk,
static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
{
- if (unlikely(sk->sk_state != TCP_ESTABLISHED))
+ if (unlikely(READ_ONCE(sk->sk_state) != TCP_ESTABLISHED))
return -ENOTCONN;
return unix_read_skb(sk, recv_actor);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 055/267] af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (53 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 054/267] af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 056/267] af_unix: Annotate data-races around sk->sk_sndbuf Greg Kroah-Hartman
` (223 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 0aa3be7b3e1f8f997312cc4705f8165e02806f8f ]
While dumping AF_UNIX sockets via UNIX_DIAG, sk->sk_state is read
locklessly.
Let's use READ_ONCE() there.
Note that the result could be inconsistent if the socket is dumped
during the state change. This is common for other SOCK_DIAG and
similar interfaces.
Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
Fixes: 2aac7a2cb0d9 ("unix_diag: Pending connections IDs NLA")
Fixes: 45a96b9be6ec ("unix_diag: Dumping all sockets core")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/diag.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c
index 3438b7af09af5..9151c72e742fc 100644
--- a/net/unix/diag.c
+++ b/net/unix/diag.c
@@ -65,7 +65,7 @@ static int sk_diag_dump_icons(struct sock *sk, struct sk_buff *nlskb)
u32 *buf;
int i;
- if (sk->sk_state == TCP_LISTEN) {
+ if (READ_ONCE(sk->sk_state) == TCP_LISTEN) {
spin_lock(&sk->sk_receive_queue.lock);
attr = nla_reserve(nlskb, UNIX_DIAG_ICONS,
@@ -103,7 +103,7 @@ static int sk_diag_show_rqlen(struct sock *sk, struct sk_buff *nlskb)
{
struct unix_diag_rqlen rql;
- if (sk->sk_state == TCP_LISTEN) {
+ if (READ_ONCE(sk->sk_state) == TCP_LISTEN) {
rql.udiag_rqueue = sk->sk_receive_queue.qlen;
rql.udiag_wqueue = sk->sk_max_ack_backlog;
} else {
@@ -136,7 +136,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff *skb, struct unix_diag_r
rep = nlmsg_data(nlh);
rep->udiag_family = AF_UNIX;
rep->udiag_type = sk->sk_type;
- rep->udiag_state = sk->sk_state;
+ rep->udiag_state = READ_ONCE(sk->sk_state);
rep->pad = 0;
rep->udiag_ino = sk_ino;
sock_diag_save_cookie(sk, rep->udiag_cookie);
@@ -215,7 +215,7 @@ static int unix_diag_dump(struct sk_buff *skb, struct netlink_callback *cb)
sk_for_each(sk, &net->unx.table.buckets[slot]) {
if (num < s_num)
goto next;
- if (!(req->udiag_states & (1 << sk->sk_state)))
+ if (!(req->udiag_states & (1 << READ_ONCE(sk->sk_state))))
goto next;
if (sk_diag_dump(sk, skb, req, sk_user_ns(skb->sk),
NETLINK_CB(cb->skb).portid,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 056/267] af_unix: Annotate data-races around sk->sk_sndbuf.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (54 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 055/267] af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 057/267] af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen Greg Kroah-Hartman
` (222 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit b0632e53e0da8054e36bc973f0eec69d30f1b7c6 ]
sk_setsockopt() changes sk->sk_sndbuf under lock_sock(), but it's
not used in af_unix.c.
Let's use READ_ONCE() to read sk->sk_sndbuf in unix_writable(),
unix_dgram_sendmsg(), and unix_stream_sendmsg().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 2299a464c602e..4640497c29da4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -534,7 +534,7 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
static int unix_writable(const struct sock *sk, unsigned char state)
{
return state != TCP_LISTEN &&
- (refcount_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
+ (refcount_read(&sk->sk_wmem_alloc) << 2) <= READ_ONCE(sk->sk_sndbuf);
}
static void unix_write_space(struct sock *sk)
@@ -1944,7 +1944,7 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
}
err = -EMSGSIZE;
- if (len > sk->sk_sndbuf - 32)
+ if (len > READ_ONCE(sk->sk_sndbuf) - 32)
goto out;
if (len > SKB_MAX_ALLOC) {
@@ -2223,7 +2223,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
&err, 0);
} else {
/* Keep two messages in the pipe so it schedules better */
- size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
+ size = min_t(int, size, (READ_ONCE(sk->sk_sndbuf) >> 1) - 64);
/* allow fallback to order-0 allocations */
size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 057/267] af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (55 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 056/267] af_unix: Annotate data-races around sk->sk_sndbuf Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 058/267] af_unix: Use unix_recvq_full_lockless() in unix_stream_connect() Greg Kroah-Hartman
` (221 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit bd9f2d05731f6a112d0c7391a0d537bfc588dbe6 ]
net->unx.sysctl_max_dgram_qlen is exposed as a sysctl knob and can be
changed concurrently.
Let's use READ_ONCE() in unix_create1().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4640497c29da4..2b35c517be718 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -990,7 +990,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern,
sk->sk_hash = unix_unbound_hash(sk);
sk->sk_allocation = GFP_KERNEL_ACCOUNT;
sk->sk_write_space = unix_write_space;
- sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen;
+ sk->sk_max_ack_backlog = READ_ONCE(net->unx.sysctl_max_dgram_qlen);
sk->sk_destruct = unix_sock_destructor;
u = unix_sk(sk);
u->inflight = 0;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 058/267] af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (56 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 057/267] af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 059/267] af_unix: Use skb_queue_empty_lockless() in unix_release_sock() Greg Kroah-Hartman
` (220 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 45d872f0e65593176d880ec148f41ad7c02e40a7 ]
Once sk->sk_state is changed to TCP_LISTEN, it never changes.
unix_accept() takes advantage of this characteristics; it does not
hold the listener's unix_state_lock() and only acquires recvq lock
to pop one skb.
It means unix_state_lock() does not prevent the queue length from
changing in unix_stream_connect().
Thus, we need to use unix_recvq_full_lockless() to avoid data-race.
Now we remove unix_recvq_full() as no one uses it.
Note that we can remove READ_ONCE() for sk->sk_max_ack_backlog in
unix_recvq_full_lockless() because of the following reasons:
(1) For SOCK_DGRAM, it is a written-once field in unix_create1()
(2) For SOCK_STREAM and SOCK_SEQPACKET, it is changed under the
listener's unix_state_lock() in unix_listen(), and we hold
the lock in unix_stream_connect()
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 2b35c517be718..ea68472847cae 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -222,15 +222,9 @@ static inline int unix_may_send(struct sock *sk, struct sock *osk)
return unix_peer(osk) == NULL || unix_our_peer(sk, osk);
}
-static inline int unix_recvq_full(const struct sock *sk)
-{
- return skb_queue_len(&sk->sk_receive_queue) > sk->sk_max_ack_backlog;
-}
-
static inline int unix_recvq_full_lockless(const struct sock *sk)
{
- return skb_queue_len_lockless(&sk->sk_receive_queue) >
- READ_ONCE(sk->sk_max_ack_backlog);
+ return skb_queue_len_lockless(&sk->sk_receive_queue) > sk->sk_max_ack_backlog;
}
struct sock *unix_peer_get(struct sock *s)
@@ -1551,7 +1545,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
if (other->sk_shutdown & RCV_SHUTDOWN)
goto out_unlock;
- if (unix_recvq_full(other)) {
+ if (unix_recvq_full_lockless(other)) {
err = -EAGAIN;
if (!timeo)
goto out_unlock;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 059/267] af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (57 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 058/267] af_unix: Use unix_recvq_full_lockless() in unix_stream_connect() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 060/267] af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen() Greg Kroah-Hartman
` (219 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 83690b82d228b3570565ebd0b41873933238b97f ]
If the socket type is SOCK_STREAM or SOCK_SEQPACKET, unix_release_sock()
checks the length of the peer socket's recvq under unix_state_lock().
However, unix_stream_read_generic() calls skb_unlink() after releasing
the lock. Also, for SOCK_SEQPACKET, __skb_try_recv_datagram() unlinks
skb without unix_state_lock().
Thues, unix_state_lock() does not protect qlen.
Let's use skb_queue_empty_lockless() in unix_release_sock().
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index ea68472847cae..e6395647558af 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -632,7 +632,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
unix_state_lock(skpair);
/* No more writes */
WRITE_ONCE(skpair->sk_shutdown, SHUTDOWN_MASK);
- if (!skb_queue_empty(&sk->sk_receive_queue) || embrion)
+ if (!skb_queue_empty_lockless(&sk->sk_receive_queue) || embrion)
WRITE_ONCE(skpair->sk_err, ECONNRESET);
unix_state_unlock(skpair);
skpair->sk_state_change(skpair);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 060/267] af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (58 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 059/267] af_unix: Use skb_queue_empty_lockless() in unix_release_sock() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 061/267] af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill() Greg Kroah-Hartman
` (218 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 5d915e584d8408211d4567c22685aae8820bfc55 ]
We can dump the socket queue length via UNIX_DIAG by specifying
UDIAG_SHOW_RQLEN.
If sk->sk_state is TCP_LISTEN, we return the recv queue length,
but here we do not hold recvq lock.
Let's use skb_queue_len_lockless() in sk_diag_show_rqlen().
Fixes: c9da99e6475f ("unix_diag: Fixup RQLEN extension report")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/diag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c
index 9151c72e742fc..fc56244214c30 100644
--- a/net/unix/diag.c
+++ b/net/unix/diag.c
@@ -104,7 +104,7 @@ static int sk_diag_show_rqlen(struct sock *sk, struct sk_buff *nlskb)
struct unix_diag_rqlen rql;
if (READ_ONCE(sk->sk_state) == TCP_LISTEN) {
- rql.udiag_rqueue = sk->sk_receive_queue.qlen;
+ rql.udiag_rqueue = skb_queue_len_lockless(&sk->sk_receive_queue);
rql.udiag_wqueue = sk->sk_max_ack_backlog;
} else {
rql.udiag_rqueue = (u32) unix_inq_len(sk);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 061/267] af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (59 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 060/267] af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 062/267] ipv6: fix possible race in __fib6_drop_pcpu_from() Greg Kroah-Hartman
` (217 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit efaf24e30ec39ebbea9112227485805a48b0ceb1 ]
While dumping sockets via UNIX_DIAG, we do not hold unix_state_lock().
Let's use READ_ONCE() to read sk->sk_shutdown.
Fixes: e4e541a84863 ("sock-diag: Report shutdown for inet and unix sockets (v2)")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/diag.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/unix/diag.c b/net/unix/diag.c
index fc56244214c30..1de7500b41b61 100644
--- a/net/unix/diag.c
+++ b/net/unix/diag.c
@@ -165,7 +165,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff *skb, struct unix_diag_r
sock_diag_put_meminfo(sk, skb, UNIX_DIAG_MEMINFO))
goto out_nlmsg_trim;
- if (nla_put_u8(skb, UNIX_DIAG_SHUTDOWN, sk->sk_shutdown))
+ if (nla_put_u8(skb, UNIX_DIAG_SHUTDOWN, READ_ONCE(sk->sk_shutdown)))
goto out_nlmsg_trim;
if ((req->udiag_show & UDIAG_SHOW_UID) &&
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 062/267] ipv6: fix possible race in __fib6_drop_pcpu_from()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (60 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 061/267] af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 063/267] net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() Greg Kroah-Hartman
` (216 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Martin KaFai Lau,
Paolo Abeni, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit b01e1c030770ff3b4fe37fc7cc6bca03f594133f ]
syzbot found a race in __fib6_drop_pcpu_from() [1]
If compiler reads more than once (*ppcpu_rt),
second read could read NULL, if another cpu clears
the value in rt6_get_pcpu_route().
Add a READ_ONCE() to prevent this race.
Also add rcu_read_lock()/rcu_read_unlock() because
we rely on RCU protection while dereferencing pcpu_rt.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000012: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
CPU: 0 PID: 7543 Comm: kworker/u8:17 Not tainted 6.10.0-rc1-syzkaller-00013-g2bfcfd584ff5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Workqueue: netns cleanup_net
RIP: 0010:__fib6_drop_pcpu_from.part.0+0x10a/0x370 net/ipv6/ip6_fib.c:984
Code: f8 48 c1 e8 03 80 3c 28 00 0f 85 16 02 00 00 4d 8b 3f 4d 85 ff 74 31 e8 74 a7 fa f7 49 8d bf 90 00 00 00 48 89 f8 48 c1 e8 03 <80> 3c 28 00 0f 85 1e 02 00 00 49 8b 87 90 00 00 00 48 8b 0c 24 48
RSP: 0018:ffffc900040df070 EFLAGS: 00010206
RAX: 0000000000000012 RBX: 0000000000000001 RCX: ffffffff89932e16
RDX: ffff888049dd1e00 RSI: ffffffff89932d7c RDI: 0000000000000091
RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000007
R10: 0000000000000001 R11: 0000000000000006 R12: ffff88807fa080b8
R13: fffffbfff1a9a07d R14: ffffed100ff41022 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff8880b9200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32c26000 CR3: 000000005d56e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__fib6_drop_pcpu_from net/ipv6/ip6_fib.c:966 [inline]
fib6_drop_pcpu_from net/ipv6/ip6_fib.c:1027 [inline]
fib6_purge_rt+0x7f2/0x9f0 net/ipv6/ip6_fib.c:1038
fib6_del_route net/ipv6/ip6_fib.c:1998 [inline]
fib6_del+0xa70/0x17b0 net/ipv6/ip6_fib.c:2043
fib6_clean_node+0x426/0x5b0 net/ipv6/ip6_fib.c:2205
fib6_walk_continue+0x44f/0x8d0 net/ipv6/ip6_fib.c:2127
fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2175
fib6_clean_tree+0xd7/0x120 net/ipv6/ip6_fib.c:2255
__fib6_clean_all+0x100/0x2d0 net/ipv6/ip6_fib.c:2271
rt6_sync_down_dev net/ipv6/route.c:4906 [inline]
rt6_disable_ip+0x7ed/0xa00 net/ipv6/route.c:4911
addrconf_ifdown.isra.0+0x117/0x1b40 net/ipv6/addrconf.c:3855
addrconf_notify+0x223/0x19e0 net/ipv6/addrconf.c:3778
notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1992
call_netdevice_notifiers_extack net/core/dev.c:2030 [inline]
call_netdevice_notifiers net/core/dev.c:2044 [inline]
dev_close_many+0x333/0x6a0 net/core/dev.c:1585
unregister_netdevice_many_notify+0x46d/0x19f0 net/core/dev.c:11193
unregister_netdevice_many net/core/dev.c:11276 [inline]
default_device_exit_batch+0x85b/0xae0 net/core/dev.c:11759
ops_exit_list+0x128/0x180 net/core/net_namespace.c:178
cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/20240604193549.981839-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/ip6_fib.c | 6 +++++-
net/ipv6/route.c | 1 +
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 8184076a3924e..4356806b52bd5 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -961,6 +961,7 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh,
if (!fib6_nh->rt6i_pcpu)
return;
+ rcu_read_lock();
/* release the reference to this fib entry from
* all of its cached pcpu routes
*/
@@ -969,7 +970,9 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh,
struct rt6_info *pcpu_rt;
ppcpu_rt = per_cpu_ptr(fib6_nh->rt6i_pcpu, cpu);
- pcpu_rt = *ppcpu_rt;
+
+ /* Paired with xchg() in rt6_get_pcpu_route() */
+ pcpu_rt = READ_ONCE(*ppcpu_rt);
/* only dropping the 'from' reference if the cached route
* is using 'match'. The cached pcpu_rt->from only changes
@@ -983,6 +986,7 @@ static void __fib6_drop_pcpu_from(struct fib6_nh *fib6_nh,
fib6_info_release(from);
}
}
+ rcu_read_unlock();
}
struct fib6_nh_pcpu_arg {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c48eaa7c23401..0a37f04177337 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1399,6 +1399,7 @@ static struct rt6_info *rt6_get_pcpu_route(const struct fib6_result *res)
struct rt6_info *prev, **p;
p = this_cpu_ptr(res->nh->rt6i_pcpu);
+ /* Paired with READ_ONCE() in __fib6_drop_pcpu_from() */
prev = xchg(p, NULL);
if (prev) {
dst_dev_put(&prev->dst);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 063/267] net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (61 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 062/267] ipv6: fix possible race in __fib6_drop_pcpu_from() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 064/267] ksmbd: use rwsem instead of rwlock for lease break Greg Kroah-Hartman
` (215 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Su Hui, Przemek Kitszel,
Hariprasad Kelam, Paolo Abeni, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui <suhui@nfschina.com>
[ Upstream commit 0dcc53abf58d572d34c5313de85f607cd33fc691 ]
Clang static checker (scan-build) warning:
net/ethtool/ioctl.c:line 2233, column 2
Called function pointer is null (null dereference).
Return '-EOPNOTSUPP' when 'ops->get_ethtool_phy_stats' is NULL to fix
this typo error.
Fixes: 201ed315f967 ("net/ethtool/ioctl: split ethtool_get_phy_stats into multiple helpers")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://lore.kernel.org/r/20240605034742.921751-1-suhui@nfschina.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ethtool/ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 0b0ce4f81c017..7cb23bcf8ef7a 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -2134,7 +2134,7 @@ static int ethtool_get_phy_stats_ethtool(struct net_device *dev,
const struct ethtool_ops *ops = dev->ethtool_ops;
int n_stats, ret;
- if (!ops || !ops->get_sset_count || ops->get_ethtool_phy_stats)
+ if (!ops || !ops->get_sset_count || !ops->get_ethtool_phy_stats)
return -EOPNOTSUPP;
n_stats = ops->get_sset_count(dev, ETH_SS_PHY_STATS);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 064/267] ksmbd: use rwsem instead of rwlock for lease break
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (62 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 063/267] net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 065/267] firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails Greg Kroah-Hartman
` (214 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Namjae Jeon, Steve French,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon <linkinjeon@kernel.org>
[ Upstream commit d1c189c6cb8b0fb7b5ee549237d27889c40c2f8b ]
lease break wait for lease break acknowledgment.
rwsem is more suitable than unlock while traversing the list for parent
lease break in ->m_op_list.
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/smb/server/oplock.c | 30 +++++++++++-------------------
fs/smb/server/smb2pdu.c | 4 ++--
fs/smb/server/smb_common.c | 4 ++--
fs/smb/server/vfs_cache.c | 28 ++++++++++++++--------------
fs/smb/server/vfs_cache.h | 2 +-
5 files changed, 30 insertions(+), 38 deletions(-)
diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c
index 7d17a14378e33..a8f52c4ebbdad 100644
--- a/fs/smb/server/oplock.c
+++ b/fs/smb/server/oplock.c
@@ -207,9 +207,9 @@ static void opinfo_add(struct oplock_info *opinfo)
{
struct ksmbd_inode *ci = opinfo->o_fp->f_ci;
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
list_add_rcu(&opinfo->op_entry, &ci->m_op_list);
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
}
static void opinfo_del(struct oplock_info *opinfo)
@@ -221,9 +221,9 @@ static void opinfo_del(struct oplock_info *opinfo)
lease_del_list(opinfo);
write_unlock(&lease_list_lock);
}
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
list_del_rcu(&opinfo->op_entry);
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
}
static unsigned long opinfo_count(struct ksmbd_file *fp)
@@ -526,21 +526,18 @@ static struct oplock_info *same_client_has_lease(struct ksmbd_inode *ci,
* Compare lease key and client_guid to know request from same owner
* of same client
*/
- read_lock(&ci->m_lock);
+ down_read(&ci->m_lock);
list_for_each_entry(opinfo, &ci->m_op_list, op_entry) {
if (!opinfo->is_lease || !opinfo->conn)
continue;
- read_unlock(&ci->m_lock);
lease = opinfo->o_lease;
ret = compare_guid_key(opinfo, client_guid, lctx->lease_key);
if (ret) {
m_opinfo = opinfo;
/* skip upgrading lease about breaking lease */
- if (atomic_read(&opinfo->breaking_cnt)) {
- read_lock(&ci->m_lock);
+ if (atomic_read(&opinfo->breaking_cnt))
continue;
- }
/* upgrading lease */
if ((atomic_read(&ci->op_count) +
@@ -570,9 +567,8 @@ static struct oplock_info *same_client_has_lease(struct ksmbd_inode *ci,
lease_none_upgrade(opinfo, lctx->req_state);
}
}
- read_lock(&ci->m_lock);
}
- read_unlock(&ci->m_lock);
+ up_read(&ci->m_lock);
return m_opinfo;
}
@@ -1119,7 +1115,7 @@ void smb_send_parent_lease_break_noti(struct ksmbd_file *fp,
if (!p_ci)
return;
- read_lock(&p_ci->m_lock);
+ down_read(&p_ci->m_lock);
list_for_each_entry(opinfo, &p_ci->m_op_list, op_entry) {
if (opinfo->conn == NULL || !opinfo->is_lease)
continue;
@@ -1137,13 +1133,11 @@ void smb_send_parent_lease_break_noti(struct ksmbd_file *fp,
continue;
}
- read_unlock(&p_ci->m_lock);
oplock_break(opinfo, SMB2_OPLOCK_LEVEL_NONE);
opinfo_conn_put(opinfo);
- read_lock(&p_ci->m_lock);
}
}
- read_unlock(&p_ci->m_lock);
+ up_read(&p_ci->m_lock);
ksmbd_inode_put(p_ci);
}
@@ -1164,7 +1158,7 @@ void smb_lazy_parent_lease_break_close(struct ksmbd_file *fp)
if (!p_ci)
return;
- read_lock(&p_ci->m_lock);
+ down_read(&p_ci->m_lock);
list_for_each_entry(opinfo, &p_ci->m_op_list, op_entry) {
if (opinfo->conn == NULL || !opinfo->is_lease)
continue;
@@ -1178,13 +1172,11 @@ void smb_lazy_parent_lease_break_close(struct ksmbd_file *fp)
atomic_dec(&opinfo->conn->r_count);
continue;
}
- read_unlock(&p_ci->m_lock);
oplock_break(opinfo, SMB2_OPLOCK_LEVEL_NONE);
opinfo_conn_put(opinfo);
- read_lock(&p_ci->m_lock);
}
}
- read_unlock(&p_ci->m_lock);
+ up_read(&p_ci->m_lock);
ksmbd_inode_put(p_ci);
}
diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c
index 6a15c5d64f415..8df93c9d4ee41 100644
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -3376,9 +3376,9 @@ int smb2_open(struct ksmbd_work *work)
* after daccess, saccess, attrib_only, and stream are
* initialized.
*/
- write_lock(&fp->f_ci->m_lock);
+ down_write(&fp->f_ci->m_lock);
list_add(&fp->node, &fp->f_ci->m_fp_list);
- write_unlock(&fp->f_ci->m_lock);
+ up_write(&fp->f_ci->m_lock);
/* Check delete pending among previous fp before oplock break */
if (ksmbd_inode_pending_delete(fp)) {
diff --git a/fs/smb/server/smb_common.c b/fs/smb/server/smb_common.c
index fcaf373cc0080..474dadf6b7b8b 100644
--- a/fs/smb/server/smb_common.c
+++ b/fs/smb/server/smb_common.c
@@ -646,7 +646,7 @@ int ksmbd_smb_check_shared_mode(struct file *filp, struct ksmbd_file *curr_fp)
* Lookup fp in master fp list, and check desired access and
* shared mode between previous open and current open.
*/
- read_lock(&curr_fp->f_ci->m_lock);
+ down_read(&curr_fp->f_ci->m_lock);
list_for_each_entry(prev_fp, &curr_fp->f_ci->m_fp_list, node) {
if (file_inode(filp) != file_inode(prev_fp->filp))
continue;
@@ -722,7 +722,7 @@ int ksmbd_smb_check_shared_mode(struct file *filp, struct ksmbd_file *curr_fp)
break;
}
}
- read_unlock(&curr_fp->f_ci->m_lock);
+ up_read(&curr_fp->f_ci->m_lock);
return rc;
}
diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c
index 030f70700036c..6cb599cd287ee 100644
--- a/fs/smb/server/vfs_cache.c
+++ b/fs/smb/server/vfs_cache.c
@@ -165,7 +165,7 @@ static int ksmbd_inode_init(struct ksmbd_inode *ci, struct ksmbd_file *fp)
ci->m_fattr = 0;
INIT_LIST_HEAD(&ci->m_fp_list);
INIT_LIST_HEAD(&ci->m_op_list);
- rwlock_init(&ci->m_lock);
+ init_rwsem(&ci->m_lock);
ci->m_de = fp->filp->f_path.dentry;
return 0;
}
@@ -261,14 +261,14 @@ static void __ksmbd_inode_close(struct ksmbd_file *fp)
}
if (atomic_dec_and_test(&ci->m_count)) {
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
if (ci->m_flags & (S_DEL_ON_CLS | S_DEL_PENDING)) {
ci->m_flags &= ~(S_DEL_ON_CLS | S_DEL_PENDING);
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
ksmbd_vfs_unlink(filp);
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
}
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
ksmbd_inode_free(ci);
}
@@ -289,9 +289,9 @@ static void __ksmbd_remove_fd(struct ksmbd_file_table *ft, struct ksmbd_file *fp
if (!has_file_id(fp->volatile_id))
return;
- write_lock(&fp->f_ci->m_lock);
+ down_write(&fp->f_ci->m_lock);
list_del_init(&fp->node);
- write_unlock(&fp->f_ci->m_lock);
+ up_write(&fp->f_ci->m_lock);
write_lock(&ft->lock);
idr_remove(ft->idr, fp->volatile_id);
@@ -523,17 +523,17 @@ struct ksmbd_file *ksmbd_lookup_fd_inode(struct dentry *dentry)
if (!ci)
return NULL;
- read_lock(&ci->m_lock);
+ down_read(&ci->m_lock);
list_for_each_entry(lfp, &ci->m_fp_list, node) {
if (inode == file_inode(lfp->filp)) {
atomic_dec(&ci->m_count);
lfp = ksmbd_fp_get(lfp);
- read_unlock(&ci->m_lock);
+ up_read(&ci->m_lock);
return lfp;
}
}
atomic_dec(&ci->m_count);
- read_unlock(&ci->m_lock);
+ up_read(&ci->m_lock);
return NULL;
}
@@ -705,13 +705,13 @@ static bool session_fd_check(struct ksmbd_tree_connect *tcon,
conn = fp->conn;
ci = fp->f_ci;
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) {
if (op->conn != conn)
continue;
op->conn = NULL;
}
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
fp->conn = NULL;
fp->tcon = NULL;
@@ -801,13 +801,13 @@ int ksmbd_reopen_durable_fd(struct ksmbd_work *work, struct ksmbd_file *fp)
fp->tcon = work->tcon;
ci = fp->f_ci;
- write_lock(&ci->m_lock);
+ down_write(&ci->m_lock);
list_for_each_entry_rcu(op, &ci->m_op_list, op_entry) {
if (op->conn)
continue;
op->conn = fp->conn;
}
- write_unlock(&ci->m_lock);
+ up_write(&ci->m_lock);
__open_id(&work->sess->file_table, fp, OPEN_ID_TYPE_VOLATILE_ID);
if (!has_file_id(fp->volatile_id)) {
diff --git a/fs/smb/server/vfs_cache.h b/fs/smb/server/vfs_cache.h
index ed44fb4e18e79..5a225e7055f19 100644
--- a/fs/smb/server/vfs_cache.h
+++ b/fs/smb/server/vfs_cache.h
@@ -47,7 +47,7 @@ struct stream {
};
struct ksmbd_inode {
- rwlock_t m_lock;
+ struct rw_semaphore m_lock;
atomic_t m_count;
atomic_t op_count;
/* opinfo count for streams */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 065/267] firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (63 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 064/267] ksmbd: use rwsem instead of rwlock for lease break Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 066/267] memory-failure: use a folio in me_huge_page() Greg Kroah-Hartman
` (213 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gabor Juhos, Mukesh Ojha,
Bjorn Andersson, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gabor Juhos <j4g8y7@gmail.com>
[ Upstream commit 0c50b7fcf2773b4853e83fc15aba1a196ba95966 ]
There are several functions which are calling qcom_scm_bw_enable()
then returns immediately if the call fails and leaves the clocks
enabled.
Change the code of these functions to disable clocks when the
qcom_scm_bw_enable() call fails. This also fixes a possible dma
buffer leak in the qcom_scm_pas_init_image() function.
Compile tested only due to lack of hardware with interconnect
support.
Cc: stable@vger.kernel.org
Fixes: 65b7ebda5028 ("firmware: qcom_scm: Add bw voting support to the SCM interface")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/20240304-qcom-scm-disable-clk-v1-1-b36e51577ca1@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/firmware/qcom_scm.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index ff7c155239e31..7af59985f1c1f 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -498,13 +498,14 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
ret = qcom_scm_bw_enable();
if (ret)
- return ret;
+ goto disable_clk;
desc.args[1] = mdata_phys;
ret = qcom_scm_call(__scm->dev, &desc, &res);
-
qcom_scm_bw_disable();
+
+disable_clk:
qcom_scm_clk_disable();
out:
@@ -566,10 +567,12 @@ int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr, phys_addr_t size)
ret = qcom_scm_bw_enable();
if (ret)
- return ret;
+ goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res);
qcom_scm_bw_disable();
+
+disable_clk:
qcom_scm_clk_disable();
return ret ? : res.result[0];
@@ -601,10 +604,12 @@ int qcom_scm_pas_auth_and_reset(u32 peripheral)
ret = qcom_scm_bw_enable();
if (ret)
- return ret;
+ goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res);
qcom_scm_bw_disable();
+
+disable_clk:
qcom_scm_clk_disable();
return ret ? : res.result[0];
@@ -635,11 +640,12 @@ int qcom_scm_pas_shutdown(u32 peripheral)
ret = qcom_scm_bw_enable();
if (ret)
- return ret;
+ goto disable_clk;
ret = qcom_scm_call(__scm->dev, &desc, &res);
-
qcom_scm_bw_disable();
+
+disable_clk:
qcom_scm_clk_disable();
return ret ? : res.result[0];
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 066/267] memory-failure: use a folio in me_huge_page()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (64 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 065/267] firmware: qcom_scm: disable clocks if qcom_scm_bw_enable() fails Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 067/267] mm/memory-failure: fix handling of dissolved but not taken off from buddy pages Greg Kroah-Hartman
` (212 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthew Wilcox (Oracle),
Naoya Horiguchi, Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) <willy@infradead.org>
[ Upstream commit b6fd410c32f1a66a52a42d6aae1ab7b011b74547 ]
This function was already explicitly calling compound_head();
unfortunately the compiler can't know that and elide the redundant calls
to compound_head() buried in page_mapping(), unlock_page(), etc. Switch
to using a folio, which does let us elide these calls.
Link: https://lkml.kernel.org/r/20231117161447.2461643-5-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 8cf360b9d6a8 ("mm/memory-failure: fix handling of dissolved but not taken off from buddy pages")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/memory-failure.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5378edad9df8f..9c27ec0a27a30 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1193,25 +1193,25 @@ static int me_swapcache_clean(struct page_state *ps, struct page *p)
*/
static int me_huge_page(struct page_state *ps, struct page *p)
{
+ struct folio *folio = page_folio(p);
int res;
- struct page *hpage = compound_head(p);
struct address_space *mapping;
bool extra_pins = false;
- mapping = page_mapping(hpage);
+ mapping = folio_mapping(folio);
if (mapping) {
- res = truncate_error_page(hpage, page_to_pfn(p), mapping);
+ res = truncate_error_page(&folio->page, page_to_pfn(p), mapping);
/* The page is kept in page cache. */
extra_pins = true;
- unlock_page(hpage);
+ folio_unlock(folio);
} else {
- unlock_page(hpage);
+ folio_unlock(folio);
/*
* migration entry prevents later access on error hugepage,
* so we can free and dissolve it into buddy to save healthy
* subpages.
*/
- put_page(hpage);
+ folio_put(folio);
if (__page_handle_poison(p) >= 0) {
page_ref_inc(p);
res = MF_RECOVERED;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 067/267] mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (65 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 066/267] memory-failure: use a folio in me_huge_page() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 068/267] selftests/mm: conform test to TAP format output Greg Kroah-Hartman
` (211 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Miaohe Lin, Naoya Horiguchi,
Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaohe Lin <linmiaohe@huawei.com>
[ Upstream commit 8cf360b9d6a840700e06864236a01a883b34bbad ]
When I did memory failure tests recently, below panic occurs:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
------------[ cut here ]------------
kernel BUG at include/linux/page-flags.h:1009!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:__del_page_from_free_list+0x151/0x180
RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
Call Trace:
<TASK>
__rmqueue_pcplist+0x23b/0x520
get_page_from_freelist+0x26b/0xe40
__alloc_pages_noprof+0x113/0x1120
__folio_alloc_noprof+0x11/0xb0
alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
__alloc_fresh_hugetlb_folio+0xe7/0x140
alloc_pool_huge_folio+0x68/0x100
set_max_huge_pages+0x13d/0x340
hugetlb_sysctl_handler_common+0xe8/0x110
proc_sys_call_handler+0x194/0x280
vfs_write+0x387/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff916114887
RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
</TASK>
Modules linked in: mce_inject hwpoison_inject
---[ end trace 0000000000000000 ]---
And before the panic, there had an warning about bad page state:
BUG: Bad page state in process page-types pfn:8cee00
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
page_type: 0xffffff7f(buddy)
raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
Modules linked in: mce_inject hwpoison_inject
CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
Call Trace:
<TASK>
dump_stack_lvl+0x83/0xa0
bad_page+0x63/0xf0
free_unref_page+0x36e/0x5c0
unpoison_memory+0x50b/0x630
simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
debugfs_attr_write+0x42/0x60
full_proxy_write+0x5b/0x80
vfs_write+0xcd/0x550
ksys_write+0x64/0xe0
do_syscall_64+0xc2/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f189a514887
RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
</TASK>
The root cause should be the below race:
memory_failure
try_memory_failure_hugetlb
me_huge_page
__page_handle_poison
dissolve_free_hugetlb_folio
drain_all_pages -- Buddy page can be isolated e.g. for compaction.
take_page_off_buddy -- Failed as page is not in the buddy list.
-- Page can be putback into buddy after compaction.
page_ref_inc -- Leads to buddy page with refcnt = 1.
Then unpoison_memory() can unpoison the page and send the buddy page back
into buddy list again leading to the above bad page state warning. And
bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy
page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to
allocate this page.
Fix this issue by only treating __page_handle_poison() as successful when
it returns 1.
Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
Fixes: ceaf8fbea79a ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/memory-failure.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 9c27ec0a27a30..c7e2b609184b6 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1212,7 +1212,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
* subpages.
*/
folio_put(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
@@ -2082,7 +2082,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
*/
if (res == 0) {
folio_unlock(folio);
- if (__page_handle_poison(p) >= 0) {
+ if (__page_handle_poison(p) > 0) {
page_ref_inc(p);
res = MF_RECOVERED;
} else {
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 068/267] selftests/mm: conform test to TAP format output
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (66 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 067/267] mm/memory-failure: fix handling of dissolved but not taken off from buddy pages Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 069/267] selftests/mm: log a consistent test name for check_compaction Greg Kroah-Hartman
` (210 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Muhammad Usama Anjum, Shuah Khan,
Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum <usama.anjum@collabora.com>
[ Upstream commit 9a21701edc41465de56f97914741bfb7bfc2517d ]
Conform the layout, informational and status messages to TAP. No
functional change is intended other than the layout of output messages.
Link: https://lkml.kernel.org/r/20240101083614.1076768-1-usama.anjum@collabora.com
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/testing/selftests/mm/compaction_test.c | 91 ++++++++++----------
1 file changed, 44 insertions(+), 47 deletions(-)
diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
index 55dec92e1e58c..f81931c1f8386 100644
--- a/tools/testing/selftests/mm/compaction_test.c
+++ b/tools/testing/selftests/mm/compaction_test.c
@@ -33,7 +33,7 @@ int read_memory_info(unsigned long *memfree, unsigned long *hugepagesize)
FILE *cmdfile = popen(cmd, "r");
if (!(fgets(buffer, sizeof(buffer), cmdfile))) {
- perror("Failed to read meminfo\n");
+ ksft_print_msg("Failed to read meminfo: %s\n", strerror(errno));
return -1;
}
@@ -44,7 +44,7 @@ int read_memory_info(unsigned long *memfree, unsigned long *hugepagesize)
cmdfile = popen(cmd, "r");
if (!(fgets(buffer, sizeof(buffer), cmdfile))) {
- perror("Failed to read meminfo\n");
+ ksft_print_msg("Failed to read meminfo: %s\n", strerror(errno));
return -1;
}
@@ -62,14 +62,14 @@ int prereq(void)
fd = open("/proc/sys/vm/compact_unevictable_allowed",
O_RDONLY | O_NONBLOCK);
if (fd < 0) {
- perror("Failed to open\n"
- "/proc/sys/vm/compact_unevictable_allowed\n");
+ ksft_print_msg("Failed to open /proc/sys/vm/compact_unevictable_allowed: %s\n",
+ strerror(errno));
return -1;
}
if (read(fd, &allowed, sizeof(char)) != sizeof(char)) {
- perror("Failed to read from\n"
- "/proc/sys/vm/compact_unevictable_allowed\n");
+ ksft_print_msg("Failed to read from /proc/sys/vm/compact_unevictable_allowed: %s\n",
+ strerror(errno));
close(fd);
return -1;
}
@@ -78,12 +78,13 @@ int prereq(void)
if (allowed == '1')
return 0;
+ ksft_print_msg("Compaction isn't allowed\n");
return -1;
}
int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
{
- int fd;
+ int fd, ret = -1;
int compaction_index = 0;
char initial_nr_hugepages[10] = {0};
char nr_hugepages[10] = {0};
@@ -94,12 +95,14 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);
if (fd < 0) {
- perror("Failed to open /proc/sys/vm/nr_hugepages");
+ ksft_test_result_fail("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
return -1;
}
if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) {
- perror("Failed to read from /proc/sys/vm/nr_hugepages");
+ ksft_test_result_fail("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -107,7 +110,8 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Start with the initial condition of 0 huge pages*/
if (write(fd, "0", sizeof(char)) != sizeof(char)) {
- perror("Failed to write 0 to /proc/sys/vm/nr_hugepages\n");
+ ksft_test_result_fail("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -116,14 +120,16 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Request a large number of huge pages. The Kernel will allocate
as much as it can */
if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) {
- perror("Failed to write 100000 to /proc/sys/vm/nr_hugepages\n");
+ ksft_test_result_fail("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
lseek(fd, 0, SEEK_SET);
if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
- perror("Failed to re-read from /proc/sys/vm/nr_hugepages\n");
+ ksft_test_result_fail("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -131,67 +137,58 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
huge pages */
compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size);
- if (compaction_index > 3) {
- printf("No of huge pages allocated = %d\n",
- (atoi(nr_hugepages)));
- fprintf(stderr, "ERROR: Less that 1/%d of memory is available\n"
- "as huge pages\n", compaction_index);
- goto close_fd;
- }
-
- printf("No of huge pages allocated = %d\n",
- (atoi(nr_hugepages)));
-
lseek(fd, 0, SEEK_SET);
if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages))
!= strlen(initial_nr_hugepages)) {
- perror("Failed to write value to /proc/sys/vm/nr_hugepages\n");
+ ksft_test_result_fail("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
- close(fd);
- return 0;
+ if (compaction_index > 3) {
+ ksft_print_msg("ERROR: Less that 1/%d of memory is available\n"
+ "as huge pages\n", compaction_index);
+ ksft_test_result_fail("No of huge pages allocated = %d\n", (atoi(nr_hugepages)));
+ goto close_fd;
+ }
+
+ ksft_test_result_pass("Memory compaction succeeded. No of huge pages allocated = %d\n",
+ (atoi(nr_hugepages)));
+ ret = 0;
close_fd:
close(fd);
- printf("Not OK. Compaction test failed.");
- return -1;
+ return ret;
}
int main(int argc, char **argv)
{
struct rlimit lim;
- struct map_list *list, *entry;
+ struct map_list *list = NULL, *entry;
size_t page_size, i;
void *map = NULL;
unsigned long mem_free = 0;
unsigned long hugepage_size = 0;
long mem_fragmentable_MB = 0;
- if (prereq() != 0) {
- printf("Either the sysctl compact_unevictable_allowed is not\n"
- "set to 1 or couldn't read the proc file.\n"
- "Skipping the test\n");
- return KSFT_SKIP;
- }
+ ksft_print_header();
+
+ if (prereq() != 0)
+ return ksft_exit_pass();
+
+ ksft_set_plan(1);
lim.rlim_cur = RLIM_INFINITY;
lim.rlim_max = RLIM_INFINITY;
- if (setrlimit(RLIMIT_MEMLOCK, &lim)) {
- perror("Failed to set rlimit:\n");
- return -1;
- }
+ if (setrlimit(RLIMIT_MEMLOCK, &lim))
+ ksft_exit_fail_msg("Failed to set rlimit: %s\n", strerror(errno));
page_size = getpagesize();
- list = NULL;
-
- if (read_memory_info(&mem_free, &hugepage_size) != 0) {
- printf("ERROR: Cannot read meminfo\n");
- return -1;
- }
+ if (read_memory_info(&mem_free, &hugepage_size) != 0)
+ ksft_exit_fail_msg("Failed to get meminfo\n");
mem_fragmentable_MB = mem_free * 0.8 / 1024;
@@ -227,7 +224,7 @@ int main(int argc, char **argv)
}
if (check_compaction(mem_free, hugepage_size) == 0)
- return 0;
+ return ksft_exit_pass();
- return -1;
+ return ksft_exit_fail();
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 069/267] selftests/mm: log a consistent test name for check_compaction
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (67 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 068/267] selftests/mm: conform test to TAP format output Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 13:05 ` Mark Brown
2024-06-19 12:53 ` [PATCH 6.6 070/267] selftests/mm: compaction_test: fix bogus test success on Aarch64 Greg Kroah-Hartman
` (209 subsequent siblings)
278 siblings, 1 reply; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Mark Brown, Muhammad Usama Anjum,
Ryan Roberts, Shuah Khan, Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Brown <broonie@kernel.org>
[ Upstream commit f3b7568c49420d2dcd251032c9ca1e069ec8a6c9 ]
Every test result report in the compaction test prints a distinct log
messae, and some of the reports print a name that varies at runtime. This
causes problems for automation since a lot of automation software uses the
printed string as the name of the test, if the name varies from run to run
and from pass to fail then the automation software can't identify that a
test changed result or that the same tests are being run.
Refactor the logging to use a consistent name when printing the result of
the test, printing the existing messages as diagnostic information instead
so they are still available for people trying to interpret the results.
Link: https://lkml.kernel.org/r/20240209-kselftest-mm-cleanup-v1-2-a3c0386496b5@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/testing/selftests/mm/compaction_test.c | 35 +++++++++++---------
1 file changed, 19 insertions(+), 16 deletions(-)
diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
index f81931c1f8386..6aa6460b854ea 100644
--- a/tools/testing/selftests/mm/compaction_test.c
+++ b/tools/testing/selftests/mm/compaction_test.c
@@ -95,14 +95,15 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
fd = open("/proc/sys/vm/nr_hugepages", O_RDWR | O_NONBLOCK);
if (fd < 0) {
- ksft_test_result_fail("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
- return -1;
+ ksft_print_msg("Failed to open /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
+ ret = -1;
+ goto out;
}
if (read(fd, initial_nr_hugepages, sizeof(initial_nr_hugepages)) <= 0) {
- ksft_test_result_fail("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
+ ksft_print_msg("Failed to read from /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -110,8 +111,8 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Start with the initial condition of 0 huge pages*/
if (write(fd, "0", sizeof(char)) != sizeof(char)) {
- ksft_test_result_fail("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
+ ksft_print_msg("Failed to write 0 to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -120,16 +121,16 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* Request a large number of huge pages. The Kernel will allocate
as much as it can */
if (write(fd, "100000", (6*sizeof(char))) != (6*sizeof(char))) {
- ksft_test_result_fail("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
+ ksft_print_msg("Failed to write 100000 to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
lseek(fd, 0, SEEK_SET);
if (read(fd, nr_hugepages, sizeof(nr_hugepages)) <= 0) {
- ksft_test_result_fail("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
+ ksft_print_msg("Failed to re-read from /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
@@ -141,24 +142,26 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
if (write(fd, initial_nr_hugepages, strlen(initial_nr_hugepages))
!= strlen(initial_nr_hugepages)) {
- ksft_test_result_fail("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n",
- strerror(errno));
+ ksft_print_msg("Failed to write value to /proc/sys/vm/nr_hugepages: %s\n",
+ strerror(errno));
goto close_fd;
}
+ ksft_print_msg("Number of huge pages allocated = %d\n",
+ atoi(nr_hugepages));
+
if (compaction_index > 3) {
ksft_print_msg("ERROR: Less that 1/%d of memory is available\n"
"as huge pages\n", compaction_index);
- ksft_test_result_fail("No of huge pages allocated = %d\n", (atoi(nr_hugepages)));
goto close_fd;
}
- ksft_test_result_pass("Memory compaction succeeded. No of huge pages allocated = %d\n",
- (atoi(nr_hugepages)));
ret = 0;
close_fd:
close(fd);
+ out:
+ ksft_test_result(ret == 0, "check_compaction\n");
return ret;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 069/267] selftests/mm: log a consistent test name for check_compaction
2024-06-19 12:53 ` [PATCH 6.6 069/267] selftests/mm: log a consistent test name for check_compaction Greg Kroah-Hartman
@ 2024-06-19 13:05 ` Mark Brown
0 siblings, 0 replies; 282+ messages in thread
From: Mark Brown @ 2024-06-19 13:05 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, Muhammad Usama Anjum, Ryan Roberts, Shuah Khan,
Andrew Morton, Sasha Levin
[-- Attachment #1: Type: text/plain, Size: 1054 bytes --]
On Wed, Jun 19, 2024 at 02:53:40PM +0200, Greg Kroah-Hartman wrote:
> 6.6-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Mark Brown <broonie@kernel.org>
>
> [ Upstream commit f3b7568c49420d2dcd251032c9ca1e069ec8a6c9 ]
>
> Every test result report in the compaction test prints a distinct log
> messae, and some of the reports print a name that varies at runtime. This
> causes problems for automation since a lot of automation software uses the
> printed string as the name of the test, if the name varies from run to run
> and from pass to fail then the automation software can't identify that a
> test changed result or that the same tests are being run.
>
> Refactor the logging to use a consistent name when printing the result of
> the test, printing the existing messages as diagnostic information instead
> so they are still available for people trying to interpret the results.
This will change the test names for these tests which might disrupt
people's CI.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 282+ messages in thread
* [PATCH 6.6 070/267] selftests/mm: compaction_test: fix bogus test success on Aarch64
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (68 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 069/267] selftests/mm: log a consistent test name for check_compaction Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 071/267] irqchip/riscv-intc: Allow large non-standard interrupt number Greg Kroah-Hartman
` (208 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Dev Jain, Anshuman Khandual,
Shuah Khan, Sri Jayaramappa, Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dev Jain <dev.jain@arm.com>
[ Upstream commit d4202e66a4b1fe6968f17f9f09bbc30d08f028a1 ]
Patch series "Fixes for compaction_test", v2.
The compaction_test memory selftest introduces fragmentation in memory
and then tries to allocate as many hugepages as possible. This series
addresses some problems.
On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since
compaction_index becomes 0, which is less than 3, due to no division by
zero exception being raised. We fix that by checking for division by
zero.
Secondly, correctly set the number of hugepages to zero before trying
to set a large number of them.
Now, consider a situation in which, at the start of the test, a non-zero
number of hugepages have been already set (while running the entire
selftests/mm suite, or manually by the admin). The test operates on 80%
of memory to avoid OOM-killer invocation, and because some memory is
already blocked by hugepages, it would increase the chance of OOM-killing.
Also, since mem_free used in check_compaction() is the value before we
set nr_hugepages to zero, the chance that the compaction_index will
be small is very high if the preset nr_hugepages was high, leading to a
bogus test success.
This patch (of 3):
Currently, if at runtime we are not able to allocate a huge page, the test
will trivially pass on Aarch64 due to no exception being raised on
division by zero while computing compaction_index. Fix that by checking
for nr_hugepages == 0. Anyways, in general, avoid a division by zero by
exiting the program beforehand. While at it, fix a typo, and handle the
case where the number of hugepages may overflow an integer.
Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com
Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/testing/selftests/mm/compaction_test.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/mm/compaction_test.c b/tools/testing/selftests/mm/compaction_test.c
index 6aa6460b854ea..309b3750e57e1 100644
--- a/tools/testing/selftests/mm/compaction_test.c
+++ b/tools/testing/selftests/mm/compaction_test.c
@@ -82,12 +82,13 @@ int prereq(void)
return -1;
}
-int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
+int check_compaction(unsigned long mem_free, unsigned long hugepage_size)
{
+ unsigned long nr_hugepages_ul;
int fd, ret = -1;
int compaction_index = 0;
- char initial_nr_hugepages[10] = {0};
- char nr_hugepages[10] = {0};
+ char initial_nr_hugepages[20] = {0};
+ char nr_hugepages[20] = {0};
/* We want to test with 80% of available memory. Else, OOM killer comes
in to play */
@@ -136,7 +137,12 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
/* We should have been able to request at least 1/3 rd of the memory in
huge pages */
- compaction_index = mem_free/(atoi(nr_hugepages) * hugepage_size);
+ nr_hugepages_ul = strtoul(nr_hugepages, NULL, 10);
+ if (!nr_hugepages_ul) {
+ ksft_print_msg("ERROR: No memory is available as huge pages\n");
+ goto close_fd;
+ }
+ compaction_index = mem_free/(nr_hugepages_ul * hugepage_size);
lseek(fd, 0, SEEK_SET);
@@ -147,11 +153,11 @@ int check_compaction(unsigned long mem_free, unsigned int hugepage_size)
goto close_fd;
}
- ksft_print_msg("Number of huge pages allocated = %d\n",
- atoi(nr_hugepages));
+ ksft_print_msg("Number of huge pages allocated = %lu\n",
+ nr_hugepages_ul);
if (compaction_index > 3) {
- ksft_print_msg("ERROR: Less that 1/%d of memory is available\n"
+ ksft_print_msg("ERROR: Less than 1/%d of memory is available\n"
"as huge pages\n", compaction_index);
goto close_fd;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 071/267] irqchip/riscv-intc: Allow large non-standard interrupt number
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (69 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 070/267] selftests/mm: compaction_test: fix bogus test success on Aarch64 Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 072/267] irqchip/riscv-intc: Introduce Andes hart-level interrupt controller Greg Kroah-Hartman
` (207 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Yu Chien Peter Lin, Thomas Gleixner,
Randolph, Anup Patel, Atish Patra, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Chien Peter Lin <peterlin@andestech.com>
[ Upstream commit 96303bcb401c21dc1426d8d9bb1fc74aae5c02a9 ]
Currently, the implementation of the RISC-V INTC driver uses the
interrupt cause as the hardware interrupt number, with a maximum of
64 interrupts. However, the platform can expand the interrupt number
further for custom local interrupts.
To fully utilize the available local interrupt sources, switch
to using irq_domain_create_tree() that creates the radix tree
map, add global variables (riscv_intc_nr_irqs, riscv_intc_custom_base
and riscv_intc_custom_nr_irqs) to determine the valid range of local
interrupt number (hwirq).
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Randolph <randolph@andestech.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240222083946.3977135-3-peterlin@andestech.com
Stable-dep-of: 0110c4b11047 ("irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/irqchip/irq-riscv-intc.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index e8d01b14ccdde..684875c397280 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -19,15 +19,16 @@
#include <linux/smp.h>
static struct irq_domain *intc_domain;
+static unsigned int riscv_intc_nr_irqs __ro_after_init = BITS_PER_LONG;
+static unsigned int riscv_intc_custom_base __ro_after_init = BITS_PER_LONG;
+static unsigned int riscv_intc_custom_nr_irqs __ro_after_init;
static asmlinkage void riscv_intc_irq(struct pt_regs *regs)
{
unsigned long cause = regs->cause & ~CAUSE_IRQ_FLAG;
- if (unlikely(cause >= BITS_PER_LONG))
- panic("unexpected interrupt cause");
-
- generic_handle_domain_irq(intc_domain, cause);
+ if (generic_handle_domain_irq(intc_domain, cause))
+ pr_warn_ratelimited("Failed to handle interrupt (cause: %ld)\n", cause);
}
/*
@@ -93,6 +94,14 @@ static int riscv_intc_domain_alloc(struct irq_domain *domain,
if (ret)
return ret;
+ /*
+ * Only allow hwirq for which we have corresponding standard or
+ * custom interrupt enable register.
+ */
+ if ((hwirq >= riscv_intc_nr_irqs && hwirq < riscv_intc_custom_base) ||
+ (hwirq >= riscv_intc_custom_base + riscv_intc_custom_nr_irqs))
+ return -EINVAL;
+
for (i = 0; i < nr_irqs; i++) {
ret = riscv_intc_domain_map(domain, virq + i, hwirq + i);
if (ret)
@@ -117,8 +126,7 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)
{
int rc;
- intc_domain = irq_domain_create_linear(fn, BITS_PER_LONG,
- &riscv_intc_domain_ops, NULL);
+ intc_domain = irq_domain_create_tree(fn, &riscv_intc_domain_ops, NULL);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
@@ -132,7 +140,11 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)
riscv_set_intc_hwnode_fn(riscv_intc_hwnode);
- pr_info("%d local interrupts mapped\n", BITS_PER_LONG);
+ pr_info("%d local interrupts mapped\n", riscv_intc_nr_irqs);
+ if (riscv_intc_custom_nr_irqs) {
+ pr_info("%d custom local interrupts mapped\n",
+ riscv_intc_custom_nr_irqs);
+ }
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 072/267] irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (70 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 071/267] irqchip/riscv-intc: Allow large non-standard interrupt number Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 073/267] irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails Greg Kroah-Hartman
` (206 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Yu Chien Peter Lin, Thomas Gleixner,
Randolph, Anup Patel, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Chien Peter Lin <peterlin@andestech.com>
[ Upstream commit f4cc33e78ba8624a79ba8dea98ce5c85aa9ca33c ]
Add support for the Andes hart-level interrupt controller. This
controller provides interrupt mask/unmask functions to access the
custom register (SLIE) where the non-standard S-mode local interrupt
enable bits are located. The base of custom interrupt number is set
to 256.
To share the riscv_intc_domain_map() with the generic RISC-V INTC and
ACPI, add a chip parameter to riscv_intc_init_common(), so it can be
passed to the irq_domain_set_info() as a private data.
Andes hart-level interrupt controller requires the "andestech,cpu-intc"
compatible string to be present in interrupt-controller of cpu node to
enable the use of custom local interrupt source.
e.g.,
cpu0: cpu@0 {
compatible = "andestech,ax45mp", "riscv";
...
cpu0-intc: interrupt-controller {
#interrupt-cells = <0x01>;
compatible = "andestech,cpu-intc", "riscv,cpu-intc";
interrupt-controller;
};
};
Signed-off-by: Yu Chien Peter Lin <peterlin@andestech.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Randolph <randolph@andestech.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240222083946.3977135-4-peterlin@andestech.com
Stable-dep-of: 0110c4b11047 ("irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/irqchip/irq-riscv-intc.c | 58 ++++++++++++++++++++++++++++----
include/linux/soc/andes/irq.h | 18 ++++++++++
2 files changed, 69 insertions(+), 7 deletions(-)
create mode 100644 include/linux/soc/andes/irq.h
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 684875c397280..0cd6b48a5dbf9 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -17,6 +17,7 @@
#include <linux/module.h>
#include <linux/of.h>
#include <linux/smp.h>
+#include <linux/soc/andes/irq.h>
static struct irq_domain *intc_domain;
static unsigned int riscv_intc_nr_irqs __ro_after_init = BITS_PER_LONG;
@@ -48,6 +49,31 @@ static void riscv_intc_irq_unmask(struct irq_data *d)
csr_set(CSR_IE, BIT(d->hwirq));
}
+static void andes_intc_irq_mask(struct irq_data *d)
+{
+ /*
+ * Andes specific S-mode local interrupt causes (hwirq)
+ * are defined as (256 + n) and controlled by n-th bit
+ * of SLIE.
+ */
+ unsigned int mask = BIT(d->hwirq % BITS_PER_LONG);
+
+ if (d->hwirq < ANDES_SLI_CAUSE_BASE)
+ csr_clear(CSR_IE, mask);
+ else
+ csr_clear(ANDES_CSR_SLIE, mask);
+}
+
+static void andes_intc_irq_unmask(struct irq_data *d)
+{
+ unsigned int mask = BIT(d->hwirq % BITS_PER_LONG);
+
+ if (d->hwirq < ANDES_SLI_CAUSE_BASE)
+ csr_set(CSR_IE, mask);
+ else
+ csr_set(ANDES_CSR_SLIE, mask);
+}
+
static void riscv_intc_irq_eoi(struct irq_data *d)
{
/*
@@ -71,12 +97,21 @@ static struct irq_chip riscv_intc_chip = {
.irq_eoi = riscv_intc_irq_eoi,
};
+static struct irq_chip andes_intc_chip = {
+ .name = "RISC-V INTC",
+ .irq_mask = andes_intc_irq_mask,
+ .irq_unmask = andes_intc_irq_unmask,
+ .irq_eoi = riscv_intc_irq_eoi,
+};
+
static int riscv_intc_domain_map(struct irq_domain *d, unsigned int irq,
irq_hw_number_t hwirq)
{
+ struct irq_chip *chip = d->host_data;
+
irq_set_percpu_devid(irq);
- irq_domain_set_info(d, irq, hwirq, &riscv_intc_chip, d->host_data,
- handle_percpu_devid_irq, NULL, NULL);
+ irq_domain_set_info(d, irq, hwirq, chip, NULL, handle_percpu_devid_irq,
+ NULL, NULL);
return 0;
}
@@ -122,11 +157,12 @@ static struct fwnode_handle *riscv_intc_hwnode(void)
return intc_domain->fwnode;
}
-static int __init riscv_intc_init_common(struct fwnode_handle *fn)
+static int __init riscv_intc_init_common(struct fwnode_handle *fn,
+ struct irq_chip *chip)
{
int rc;
- intc_domain = irq_domain_create_tree(fn, &riscv_intc_domain_ops, NULL);
+ intc_domain = irq_domain_create_tree(fn, &riscv_intc_domain_ops, chip);
if (!intc_domain) {
pr_err("unable to add IRQ domain\n");
return -ENXIO;
@@ -152,8 +188,9 @@ static int __init riscv_intc_init_common(struct fwnode_handle *fn)
static int __init riscv_intc_init(struct device_node *node,
struct device_node *parent)
{
- int rc;
+ struct irq_chip *chip = &riscv_intc_chip;
unsigned long hartid;
+ int rc;
rc = riscv_of_parent_hartid(node, &hartid);
if (rc < 0) {
@@ -178,10 +215,17 @@ static int __init riscv_intc_init(struct device_node *node,
return 0;
}
- return riscv_intc_init_common(of_node_to_fwnode(node));
+ if (of_device_is_compatible(node, "andestech,cpu-intc")) {
+ riscv_intc_custom_base = ANDES_SLI_CAUSE_BASE;
+ riscv_intc_custom_nr_irqs = ANDES_RV_IRQ_LAST;
+ chip = &andes_intc_chip;
+ }
+
+ return riscv_intc_init_common(of_node_to_fwnode(node), chip);
}
IRQCHIP_DECLARE(riscv, "riscv,cpu-intc", riscv_intc_init);
+IRQCHIP_DECLARE(andes, "andestech,cpu-intc", riscv_intc_init);
#ifdef CONFIG_ACPI
@@ -208,7 +252,7 @@ static int __init riscv_intc_acpi_init(union acpi_subtable_headers *header,
return -ENOMEM;
}
- return riscv_intc_init_common(fn);
+ return riscv_intc_init_common(fn, &riscv_intc_chip);
}
IRQCHIP_ACPI_DECLARE(riscv_intc, ACPI_MADT_TYPE_RINTC, NULL,
diff --git a/include/linux/soc/andes/irq.h b/include/linux/soc/andes/irq.h
new file mode 100644
index 0000000000000..edc3182d6e661
--- /dev/null
+++ b/include/linux/soc/andes/irq.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 Andes Technology Corporation
+ */
+#ifndef __ANDES_IRQ_H
+#define __ANDES_IRQ_H
+
+/* Andes PMU irq number */
+#define ANDES_RV_IRQ_PMOVI 18
+#define ANDES_RV_IRQ_LAST ANDES_RV_IRQ_PMOVI
+#define ANDES_SLI_CAUSE_BASE 256
+
+/* Andes PMU related registers */
+#define ANDES_CSR_SLIE 0x9c4
+#define ANDES_CSR_SLIP 0x9c5
+#define ANDES_CSR_SCOUNTEROF 0x9d4
+
+#endif /* __ANDES_IRQ_H */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 073/267] irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (71 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 072/267] irqchip/riscv-intc: Introduce Andes hart-level interrupt controller Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 074/267] eventfs: Update all the eventfs_inodes from the events descriptor Greg Kroah-Hartman
` (205 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Sunil V L, Thomas Gleixner,
Anup Patel, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sunil V L <sunilvl@ventanamicro.com>
[ Upstream commit 0110c4b110477bb1f19b0d02361846be7ab08300 ]
When riscv_intc_init_common() fails, the firmware node allocated is not
freed. Add the missing free().
Fixes: 7023b9d83f03 ("irqchip/riscv-intc: Add ACPI support")
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240527081113.616189-1-sunilvl@ventanamicro.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/irqchip/irq-riscv-intc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 0cd6b48a5dbf9..627beae9649a2 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -232,8 +232,9 @@ IRQCHIP_DECLARE(andes, "andestech,cpu-intc", riscv_intc_init);
static int __init riscv_intc_acpi_init(union acpi_subtable_headers *header,
const unsigned long end)
{
- struct fwnode_handle *fn;
struct acpi_madt_rintc *rintc;
+ struct fwnode_handle *fn;
+ int rc;
rintc = (struct acpi_madt_rintc *)header;
@@ -252,7 +253,11 @@ static int __init riscv_intc_acpi_init(union acpi_subtable_headers *header,
return -ENOMEM;
}
- return riscv_intc_init_common(fn, &riscv_intc_chip);
+ rc = riscv_intc_init_common(fn, &riscv_intc_chip);
+ if (rc)
+ irq_domain_free_fwnode(fn);
+
+ return rc;
}
IRQCHIP_ACPI_DECLARE(riscv_intc, ACPI_MADT_TYPE_RINTC, NULL,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 074/267] eventfs: Update all the eventfs_inodes from the events descriptor
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (72 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 073/267] irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 075/267] bpf: fix multi-uprobe PID filtering logic Greg Kroah-Hartman
` (204 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Masami Hiramatsu, Mark Rutland,
Mathieu Desnoyers, Andrew Morton, Masahiro Yamada,
Steven Rostedt (Google), Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) <rostedt@goodmis.org>
[ Upstream commit 340f0c7067a95281ad13734f8225f49c6cf52067 ]
The change to update the permissions of the eventfs_inode had the
misconception that using the tracefs_inode would find all the
eventfs_inodes that have been updated and reset them on remount.
The problem with this approach is that the eventfs_inodes are freed when
they are no longer used (basically the reason the eventfs system exists).
When they are freed, the updated eventfs_inodes are not reset on a remount
because their tracefs_inodes have been freed.
Instead, since the events directory eventfs_inode always has a
tracefs_inode pointing to it (it is not freed when finished), and the
events directory has a link to all its children, have the
eventfs_remount() function only operate on the events eventfs_inode and
have it descend into its children updating their uid and gids.
Link: https://lore.kernel.org/all/CAK7LNARXgaWw3kH9JgrnH4vK6fr8LDkNKf3wq8NhMWJrVwJyVQ@mail.gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.754424703@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: baa23a8d4360d ("tracefs: Reset permissions on remount if permissions are options")
Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/tracefs/event_inode.c | 51 ++++++++++++++++++++++++++++++----------
1 file changed, 39 insertions(+), 12 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index b521e904a7ce9..b406bb3430f3d 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -305,33 +305,60 @@ static const struct file_operations eventfs_file_operations = {
.llseek = generic_file_llseek,
};
-/*
- * On a remount of tracefs, if UID or GID options are set, then
- * the mount point inode permissions should be used.
- * Reset the saved permission flags appropriately.
- */
-void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+static void eventfs_set_attrs(struct eventfs_inode *ei, bool update_uid, kuid_t uid,
+ bool update_gid, kgid_t gid, int level)
{
- struct eventfs_inode *ei = ti->private;
+ struct eventfs_inode *ei_child;
- if (!ei)
+ /* Update events/<system>/<event> */
+ if (WARN_ON_ONCE(level > 3))
return;
- if (update_uid)
+ if (update_uid) {
ei->attr.mode &= ~EVENTFS_SAVE_UID;
+ ei->attr.uid = uid;
+ }
- if (update_gid)
+ if (update_gid) {
ei->attr.mode &= ~EVENTFS_SAVE_GID;
+ ei->attr.gid = gid;
+ }
+
+ list_for_each_entry(ei_child, &ei->children, list) {
+ eventfs_set_attrs(ei_child, update_uid, uid, update_gid, gid, level + 1);
+ }
if (!ei->entry_attrs)
return;
for (int i = 0; i < ei->nr_entries; i++) {
- if (update_uid)
+ if (update_uid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_UID;
- if (update_gid)
+ ei->entry_attrs[i].uid = uid;
+ }
+ if (update_gid) {
ei->entry_attrs[i].mode &= ~EVENTFS_SAVE_GID;
+ ei->entry_attrs[i].gid = gid;
+ }
}
+
+}
+
+/*
+ * On a remount of tracefs, if UID or GID options are set, then
+ * the mount point inode permissions should be used.
+ * Reset the saved permission flags appropriately.
+ */
+void eventfs_remount(struct tracefs_inode *ti, bool update_uid, bool update_gid)
+{
+ struct eventfs_inode *ei = ti->private;
+
+ /* Only the events directory does the updates */
+ if (!ei || !ei->is_events || ei->is_freed)
+ return;
+
+ eventfs_set_attrs(ei, update_uid, ti->vfs_inode.i_uid,
+ update_gid, ti->vfs_inode.i_gid, 0);
}
/* Return the evenfs_inode of the "events" directory */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 075/267] bpf: fix multi-uprobe PID filtering logic
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (73 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 074/267] eventfs: Update all the eventfs_inodes from the events descriptor Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 076/267] nilfs2: return the mapped address from nilfs_get_page() Greg Kroah-Hartman
` (203 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jiri Olsa, Andrii Nakryiko,
Alexei Starovoitov, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrii Nakryiko <andrii@kernel.org>
[ Upstream commit 46ba0e49b64232adac35a2bc892f1710c5b0fb7f ]
Current implementation of PID filtering logic for multi-uprobes in
uprobe_prog_run() is filtering down to exact *thread*, while the intent
for PID filtering it to filter by *process* instead. The check in
uprobe_prog_run() also differs from the analogous one in
uprobe_multi_link_filter() for some reason. The latter is correct,
checking task->mm, not the task itself.
Fix the check in uprobe_prog_run() to perform the same task->mm check.
While doing this, we also update get_pid_task() use to use PIDTYPE_TGID
type of lookup, given the intent is to get a representative task of an
entire process. This doesn't change behavior, but seems more logical. It
would hold task group leader task now, not any random thread task.
Last but not least, given multi-uprobe support is half-broken due to
this PID filtering logic (depending on whether PID filtering is
important or not), we need to make it easy for user space consumers
(including libbpf) to easily detect whether PID filtering logic was
already fixed.
We do it here by adding an early check on passed pid parameter. If it's
negative (and so has no chance of being a valid PID), we return -EINVAL.
Previous behavior would eventually return -ESRCH ("No process found"),
given there can't be any process with negative PID. This subtle change
won't make any practical change in behavior, but will allow applications
to detect PID filtering fixes easily. Libbpf fixes take advantage of
this in the next patch.
Cc: stable@vger.kernel.org
Acked-by: Jiri Olsa <jolsa@kernel.org>
Fixes: b733eeade420 ("bpf: Add pid filter support for uprobe_multi link")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240521163401.3005045-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/trace/bpf_trace.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 8edbafe0d4cdf..cc29bf49f7159 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -3099,7 +3099,7 @@ static int uprobe_prog_run(struct bpf_uprobe *uprobe,
struct bpf_run_ctx *old_run_ctx;
int err = 0;
- if (link->task && current != link->task)
+ if (link->task && current->mm != link->task->mm)
return 0;
if (sleepable)
@@ -3200,8 +3200,9 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
upath = u64_to_user_ptr(attr->link_create.uprobe_multi.path);
uoffsets = u64_to_user_ptr(attr->link_create.uprobe_multi.offsets);
cnt = attr->link_create.uprobe_multi.cnt;
+ pid = attr->link_create.uprobe_multi.pid;
- if (!upath || !uoffsets || !cnt)
+ if (!upath || !uoffsets || !cnt || pid < 0)
return -EINVAL;
if (cnt > MAX_UPROBE_MULTI_CNT)
return -E2BIG;
@@ -3225,10 +3226,9 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
goto error_path_put;
}
- pid = attr->link_create.uprobe_multi.pid;
if (pid) {
rcu_read_lock();
- task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
+ task = get_pid_task(find_vpid(pid), PIDTYPE_TGID);
rcu_read_unlock();
if (!task) {
err = -ESRCH;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 076/267] nilfs2: return the mapped address from nilfs_get_page()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (74 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 075/267] bpf: fix multi-uprobe PID filtering logic Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 077/267] nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors Greg Kroah-Hartman
` (202 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthew Wilcox (Oracle),
Ryusuke Konishi, Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Wilcox (Oracle) <willy@infradead.org>
[ Upstream commit 09a46acb3697e50548bb265afa1d79163659dd85 ]
In prepartion for switching from kmap() to kmap_local(), return the kmap
address from nilfs_get_page() instead of having the caller look up
page_address().
[konishi.ryusuke: fixed a missing blank line after declaration]
Link: https://lkml.kernel.org/r/20231127143036.2425-7-konishi.ryusuke@gmail.com
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 7373a51e7998 ("nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/nilfs2/dir.c | 57 +++++++++++++++++++++++--------------------------
1 file changed, 27 insertions(+), 30 deletions(-)
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index 929edc0b101a0..c6b88be8a9d73 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -186,19 +186,24 @@ static bool nilfs_check_page(struct page *page)
return false;
}
-static struct page *nilfs_get_page(struct inode *dir, unsigned long n)
+static void *nilfs_get_page(struct inode *dir, unsigned long n,
+ struct page **pagep)
{
struct address_space *mapping = dir->i_mapping;
struct page *page = read_mapping_page(mapping, n, NULL);
+ void *kaddr;
- if (!IS_ERR(page)) {
- kmap(page);
- if (unlikely(!PageChecked(page))) {
- if (!nilfs_check_page(page))
- goto fail;
- }
+ if (IS_ERR(page))
+ return page;
+
+ kaddr = kmap(page);
+ if (unlikely(!PageChecked(page))) {
+ if (!nilfs_check_page(page))
+ goto fail;
}
- return page;
+
+ *pagep = page;
+ return kaddr;
fail:
nilfs_put_page(page);
@@ -275,14 +280,14 @@ static int nilfs_readdir(struct file *file, struct dir_context *ctx)
for ( ; n < npages; n++, offset = 0) {
char *kaddr, *limit;
struct nilfs_dir_entry *de;
- struct page *page = nilfs_get_page(inode, n);
+ struct page *page;
- if (IS_ERR(page)) {
+ kaddr = nilfs_get_page(inode, n, &page);
+ if (IS_ERR(kaddr)) {
nilfs_error(sb, "bad page in #%lu", inode->i_ino);
ctx->pos += PAGE_SIZE - offset;
return -EIO;
}
- kaddr = page_address(page);
de = (struct nilfs_dir_entry *)(kaddr + offset);
limit = kaddr + nilfs_last_byte(inode, n) -
NILFS_DIR_REC_LEN(1);
@@ -345,11 +350,9 @@ nilfs_find_entry(struct inode *dir, const struct qstr *qstr,
start = 0;
n = start;
do {
- char *kaddr;
+ char *kaddr = nilfs_get_page(dir, n, &page);
- page = nilfs_get_page(dir, n);
- if (!IS_ERR(page)) {
- kaddr = page_address(page);
+ if (!IS_ERR(kaddr)) {
de = (struct nilfs_dir_entry *)kaddr;
kaddr += nilfs_last_byte(dir, n) - reclen;
while ((char *) de <= kaddr) {
@@ -387,15 +390,11 @@ nilfs_find_entry(struct inode *dir, const struct qstr *qstr,
struct nilfs_dir_entry *nilfs_dotdot(struct inode *dir, struct page **p)
{
- struct page *page = nilfs_get_page(dir, 0);
- struct nilfs_dir_entry *de = NULL;
+ struct nilfs_dir_entry *de = nilfs_get_page(dir, 0, p);
- if (!IS_ERR(page)) {
- de = nilfs_next_entry(
- (struct nilfs_dir_entry *)page_address(page));
- *p = page;
- }
- return de;
+ if (IS_ERR(de))
+ return NULL;
+ return nilfs_next_entry(de);
}
ino_t nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr)
@@ -459,12 +458,11 @@ int nilfs_add_link(struct dentry *dentry, struct inode *inode)
for (n = 0; n <= npages; n++) {
char *dir_end;
- page = nilfs_get_page(dir, n);
- err = PTR_ERR(page);
- if (IS_ERR(page))
+ kaddr = nilfs_get_page(dir, n, &page);
+ err = PTR_ERR(kaddr);
+ if (IS_ERR(kaddr))
goto out;
lock_page(page);
- kaddr = page_address(page);
dir_end = kaddr + nilfs_last_byte(dir, n);
de = (struct nilfs_dir_entry *)kaddr;
kaddr += PAGE_SIZE - reclen;
@@ -627,11 +625,10 @@ int nilfs_empty_dir(struct inode *inode)
char *kaddr;
struct nilfs_dir_entry *de;
- page = nilfs_get_page(inode, i);
- if (IS_ERR(page))
+ kaddr = nilfs_get_page(inode, i, &page);
+ if (IS_ERR(kaddr))
continue;
- kaddr = page_address(page);
de = (struct nilfs_dir_entry *)kaddr;
kaddr += nilfs_last_byte(inode, i) - NILFS_DIR_REC_LEN(1);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 077/267] nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (75 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 076/267] nilfs2: return the mapped address from nilfs_get_page() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 078/267] io_uring/rsrc: dont lock while !TASK_RUNNING Greg Kroah-Hartman
` (201 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Ryusuke Konishi,
syzbot+c8166c541d3971bf6c87, Andrew Morton, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
[ Upstream commit 7373a51e7998b508af7136530f3a997b286ce81c ]
The error handling in nilfs_empty_dir() when a directory folio/page read
fails is incorrect, as in the old ext2 implementation, and if the
folio/page cannot be read or nilfs_check_folio() fails, it will falsely
determine the directory as empty and corrupt the file system.
In addition, since nilfs_empty_dir() does not immediately return on a
failed folio/page read, but continues to loop, this can cause a long loop
with I/O if i_size of the directory's inode is also corrupted, causing the
log writer thread to wait and hang, as reported by syzbot.
Fix these issues by making nilfs_empty_dir() immediately return a false
value (0) if it fails to get a directory folio/page.
Link: https://lkml.kernel.org/r/20240604134255.7165-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+c8166c541d3971bf6c87@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c8166c541d3971bf6c87
Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations")
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/nilfs2/dir.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nilfs2/dir.c b/fs/nilfs2/dir.c
index c6b88be8a9d73..23a8357f127bc 100644
--- a/fs/nilfs2/dir.c
+++ b/fs/nilfs2/dir.c
@@ -627,7 +627,7 @@ int nilfs_empty_dir(struct inode *inode)
kaddr = nilfs_get_page(inode, i, &page);
if (IS_ERR(kaddr))
- continue;
+ return 0;
de = (struct nilfs_dir_entry *)kaddr;
kaddr += nilfs_last_byte(inode, i) - NILFS_DIR_REC_LEN(1);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 078/267] io_uring/rsrc: dont lock while !TASK_RUNNING
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (76 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 077/267] nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 079/267] io_uring: check for non-NULL file pointer in io_file_can_poll() Greg Kroah-Hartman
` (200 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Li Shi, Pavel Begunkov, Jens Axboe
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Begunkov <asml.silence@gmail.com>
commit 54559642b96116b45e4b5ca7fd9f7835b8561272 upstream.
There is a report of io_rsrc_ref_quiesce() locking a mutex while not
TASK_RUNNING, which is due to forgetting restoring the state back after
io_run_task_work_sig() and attempts to break out of the waiting loop.
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff815d2494>] prepare_to_wait+0xa4/0x380
kernel/sched/wait.c:237
WARNING: CPU: 2 PID: 397056 at kernel/sched/core.c:10099
__might_sleep+0x114/0x160 kernel/sched/core.c:10099
RIP: 0010:__might_sleep+0x114/0x160 kernel/sched/core.c:10099
Call Trace:
<TASK>
__mutex_lock_common kernel/locking/mutex.c:585 [inline]
__mutex_lock+0xb4/0x940 kernel/locking/mutex.c:752
io_rsrc_ref_quiesce+0x590/0x940 io_uring/rsrc.c:253
io_sqe_buffers_unregister+0xa2/0x340 io_uring/rsrc.c:799
__io_uring_register io_uring/register.c:424 [inline]
__do_sys_io_uring_register+0x5b9/0x2400 io_uring/register.c:613
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd8/0x270 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Reported-by: Li Shi <sl1589472800@gmail.com>
Fixes: 4ea15b56f0810 ("io_uring/rsrc: use wq for quiescing")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/77966bc104e25b0534995d5dbb152332bc8f31c0.1718196953.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
io_uring/rsrc.c | 1 +
1 file changed, 1 insertion(+)
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -250,6 +250,7 @@ __cold static int io_rsrc_ref_quiesce(st
ret = io_run_task_work_sig(ctx);
if (ret < 0) {
+ __set_current_state(TASK_RUNNING);
mutex_lock(&ctx->uring_lock);
if (list_empty(&ctx->rsrc_ref_list))
ret = 0;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 079/267] io_uring: check for non-NULL file pointer in io_file_can_poll()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (77 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 078/267] io_uring/rsrc: dont lock while !TASK_RUNNING Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 080/267] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages Greg Kroah-Hartman
` (199 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Jens Axboe
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe <axboe@kernel.dk>
commit 5fc16fa5f13b3c06fdb959ef262050bd810416a2 upstream.
In earlier kernels, it was possible to trigger a NULL pointer
dereference off the forced async preparation path, if no file had
been assigned. The trace leading to that looks as follows:
BUG: kernel NULL pointer dereference, address: 00000000000000b0
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 67 PID: 1633 Comm: buf-ring-invali Not tainted 6.8.0-rc3+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 2/2/2022
RIP: 0010:io_buffer_select+0xc3/0x210
Code: 00 00 48 39 d1 0f 82 ae 00 00 00 48 81 4b 48 00 00 01 00 48 89 73 70 0f b7 50 0c 66 89 53 42 85 ed 0f 85 d2 00 00 00 48 8b 13 <48> 8b 92 b0 00 00 00 48 83 7a 40 00 0f 84 21 01 00 00 4c 8b 20 5b
RSP: 0018:ffffb7bec38c7d88 EFLAGS: 00010246
RAX: ffff97af2be61000 RBX: ffff97af234f1700 RCX: 0000000000000040
RDX: 0000000000000000 RSI: ffff97aecfb04820 RDI: ffff97af234f1700
RBP: 0000000000000000 R08: 0000000000200030 R09: 0000000000000020
R10: ffffb7bec38c7dc8 R11: 000000000000c000 R12: ffffb7bec38c7db8
R13: ffff97aecfb05800 R14: ffff97aecfb05800 R15: ffff97af2be5e000
FS: 00007f852f74b740(0000) GS:ffff97b1eeec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000b0 CR3: 000000016deab005 CR4: 0000000000370ef0
Call Trace:
<TASK>
? __die+0x1f/0x60
? page_fault_oops+0x14d/0x420
? do_user_addr_fault+0x61/0x6a0
? exc_page_fault+0x6c/0x150
? asm_exc_page_fault+0x22/0x30
? io_buffer_select+0xc3/0x210
__io_import_iovec+0xb5/0x120
io_readv_prep_async+0x36/0x70
io_queue_sqe_fallback+0x20/0x260
io_submit_sqes+0x314/0x630
__do_sys_io_uring_enter+0x339/0xbc0
? __do_sys_io_uring_register+0x11b/0xc50
? vm_mmap_pgoff+0xce/0x160
do_syscall_64+0x5f/0x180
entry_SYSCALL_64_after_hwframe+0x46/0x4e
RIP: 0033:0x55e0a110a67e
Code: ba cc 00 00 00 45 31 c0 44 0f b6 92 d0 00 00 00 31 d2 41 b9 08 00 00 00 41 83 e2 01 41 c1 e2 04 41 09 c2 b8 aa 01 00 00 0f 05 <c3> 90 89 30 eb a9 0f 1f 40 00 48 8b 42 20 8b 00 a8 06 75 af 85 f6
because the request is marked forced ASYNC and has a bad file fd, and
hence takes the forced async prep path.
Current kernels with the request async prep cleaned up can no longer hit
this issue, but for ease of backporting, let's add this safety check in
here too as it really doesn't hurt. For both cases, this will inevitably
end with a CQE posted with -EBADF.
Cc: stable@vger.kernel.org
Fixes: a76c0b31eef5 ("io_uring: commit non-pollable provided mapped buffers upfront")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
io_uring/kbuf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -168,7 +168,8 @@ static void __user *io_ring_buffer_selec
req->buf_list = bl;
req->buf_index = buf->bid;
- if (issue_flags & IO_URING_F_UNLOCKED || !file_can_poll(req->file)) {
+ if (issue_flags & IO_URING_F_UNLOCKED ||
+ (req->file && !file_can_poll(req->file))) {
/*
* If we came in unlocked, we have no choice but to consume the
* buffer here, otherwise nothing ensures that the buffer won't
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 080/267] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (78 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 079/267] io_uring: check for non-NULL file pointer in io_file_can_poll() Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 081/267] USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected Greg Kroah-Hartman
` (198 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alan Stern,
syzbot+5f996b83575ef4058638, syzbot+1b2abad17596ad03dcff
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alan Stern <stern@rowland.harvard.edu>
commit 22f00812862564b314784167a89f27b444f82a46 upstream.
The syzbot fuzzer found that the interrupt-URB completion callback in
the cdc-wdm driver was taking too long, and the driver's immediate
resubmission of interrupt URBs with -EPROTO status combined with the
dummy-hcd emulation to cause a CPU lockup:
cdc_wdm 1-1:1.0: nonzero urb status received: -71
cdc_wdm 1-1:1.0: wdm_int_callback - 0 bytes
watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [syz-executor782:6625]
CPU#0 Utilization every 4s during lockup:
#1: 98% system, 0% softirq, 3% hardirq, 0% idle
#2: 98% system, 0% softirq, 3% hardirq, 0% idle
#3: 98% system, 0% softirq, 3% hardirq, 0% idle
#4: 98% system, 0% softirq, 3% hardirq, 0% idle
#5: 98% system, 1% softirq, 3% hardirq, 0% idle
Modules linked in:
irq event stamp: 73096
hardirqs last enabled at (73095): [<ffff80008037bc00>] console_emit_next_record kernel/printk/printk.c:2935 [inline]
hardirqs last enabled at (73095): [<ffff80008037bc00>] console_flush_all+0x650/0xb74 kernel/printk/printk.c:2994
hardirqs last disabled at (73096): [<ffff80008af10b00>] __el1_irq arch/arm64/kernel/entry-common.c:533 [inline]
hardirqs last disabled at (73096): [<ffff80008af10b00>] el1_interrupt+0x24/0x68 arch/arm64/kernel/entry-common.c:551
softirqs last enabled at (73048): [<ffff8000801ea530>] softirq_handle_end kernel/softirq.c:400 [inline]
softirqs last enabled at (73048): [<ffff8000801ea530>] handle_softirqs+0xa60/0xc34 kernel/softirq.c:582
softirqs last disabled at (73043): [<ffff800080020de8>] __do_softirq+0x14/0x20 kernel/softirq.c:588
CPU: 0 PID: 6625 Comm: syz-executor782 Tainted: G W 6.10.0-rc2-syzkaller-g8867bbd4a056 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Testing showed that the problem did not occur if the two error
messages -- the first two lines above -- were removed; apparently adding
material to the kernel log takes a surprisingly large amount of time.
In any case, the best approach for preventing these lockups and to
avoid spamming the log with thousands of error messages per second is
to ratelimit the two dev_err() calls. Therefore we replace them with
dev_err_ratelimited().
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Reported-and-tested-by: syzbot+5f996b83575ef4058638@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/00000000000073d54b061a6a1c65@google.com/
Reported-and-tested-by: syzbot+1b2abad17596ad03dcff@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-usb/000000000000f45085061aa9b37e@google.com/
Fixes: 9908a32e94de ("USB: remove err() macro from usb class drivers")
Link: https://lore.kernel.org/linux-usb/40dfa45b-5f21-4eef-a8c1-51a2f320e267@rowland.harvard.edu/
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/29855215-52f5-4385-b058-91f42c2bee18@rowland.harvard.edu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/class/cdc-wdm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -266,14 +266,14 @@ static void wdm_int_callback(struct urb
dev_err(&desc->intf->dev, "Stall on int endpoint\n");
goto sw; /* halt is cleared in work */
default:
- dev_err(&desc->intf->dev,
+ dev_err_ratelimited(&desc->intf->dev,
"nonzero urb status received: %d\n", status);
break;
}
}
if (urb->actual_length < sizeof(struct usb_cdc_notification)) {
- dev_err(&desc->intf->dev, "wdm_int_callback - %d bytes\n",
+ dev_err_ratelimited(&desc->intf->dev, "wdm_int_callback - %d bytes\n",
urb->actual_length);
goto exit;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 081/267] USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (79 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 080/267] USB: class: cdc-wdm: Fix CPU lockup caused by excessive log messages Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 082/267] usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps Greg Kroah-Hartman
` (197 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, John Ernberg
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ernberg <john.ernberg@actia.se>
commit 8475ffcfb381a77075562207ce08552414a80326 upstream.
If no other USB HCDs are selected when compiling a small pure virutal
machine, the Xen HCD driver cannot be built.
Fix it by traversing down host/ if CONFIG_USB_XEN_HCD is selected.
Fixes: 494ed3997d75 ("usb: Introduce Xen pvUSB frontend (xen hcd)")
Cc: stable@vger.kernel.org # v5.17+
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Link: https://lore.kernel.org/r/20240517114345.1190755-1-john.ernberg@actia.se
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/Makefile | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/usb/Makefile
+++ b/drivers/usb/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_USB_R8A66597_HCD) += host/
obj-$(CONFIG_USB_FSL_USB2) += host/
obj-$(CONFIG_USB_FOTG210_HCD) += host/
obj-$(CONFIG_USB_MAX3421_HCD) += host/
+obj-$(CONFIG_USB_XEN_HCD) += host/
obj-$(CONFIG_USB_C67X00_HCD) += c67x00/
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 082/267] usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (80 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 081/267] USB: xen-hcd: Traverse host/ when CONFIG_USB_XEN_HCD is selected Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 083/267] usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state Greg Kroah-Hartman
` (196 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Amit Sunil Dhamne, Ondrej Jirman,
Heikki Krogerus, Dmitry Baryshkov
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amit Sunil Dhamne <amitsd@google.com>
commit e7e921918d905544500ca7a95889f898121ba886 upstream.
There could be a potential use-after-free case in
tcpm_register_source_caps(). This could happen when:
* new (say invalid) source caps are advertised
* the existing source caps are unregistered
* tcpm_register_source_caps() returns with an error as
usb_power_delivery_register_capabilities() fails
This causes port->partner_source_caps to hold on to the now freed source
caps.
Reset port->partner_source_caps value to NULL after unregistering
existing source caps.
Fixes: 230ecdf71a64 ("usb: typec: tcpm: unregister existing source caps before re-registration")
Cc: stable@vger.kernel.org
Signed-off-by: Amit Sunil Dhamne <amitsd@google.com>
Reviewed-by: Ondrej Jirman <megi@xff.cz>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20240514220134.2143181-1-amitsd@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/typec/tcpm/tcpm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -2436,8 +2436,10 @@ static int tcpm_register_sink_caps(struc
memcpy(caps.pdo, port->sink_caps, sizeof(u32) * port->nr_sink_caps);
caps.role = TYPEC_SINK;
- if (cap)
+ if (cap) {
usb_power_delivery_unregister_capabilities(cap);
+ port->partner_source_caps = NULL;
+ }
cap = usb_power_delivery_register_capabilities(port->partner_pd, &caps);
if (IS_ERR(cap))
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 083/267] usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (81 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 082/267] usb: typec: tcpm: fix use-after-free case in tcpm_register_source_caps Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 084/267] mei: me: release irq in mei_me_pci_resume error path Greg Kroah-Hartman
` (195 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Kyle Tso, Heikki Krogerus
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kyle Tso <kyletso@google.com>
commit fc8fb9eea94d8f476e15f3a4a7addeb16b3b99d6 upstream.
Similar to what fixed in Commit a6fe37f428c1 ("usb: typec: tcpm: Skip
hard reset when in error recovery"), the handling of the received Hard
Reset has to be skipped during TOGGLING state.
[ 4086.021288] VBUS off
[ 4086.021295] pending state change SNK_READY -> SNK_UNATTACHED @ 650 ms [rev2 NONE_AMS]
[ 4086.022113] VBUS VSAFE0V
[ 4086.022117] state change SNK_READY -> SNK_UNATTACHED [rev2 NONE_AMS]
[ 4086.022447] VBUS off
[ 4086.022450] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS]
[ 4086.023060] VBUS VSAFE0V
[ 4086.023064] state change SNK_UNATTACHED -> SNK_UNATTACHED [rev2 NONE_AMS]
[ 4086.023070] disable BIST MODE TESTDATA
[ 4086.023766] disable vbus discharge ret:0
[ 4086.023911] Setting usb_comm capable false
[ 4086.028874] Setting voltage/current limit 0 mV 0 mA
[ 4086.028888] polarity 0
[ 4086.030305] Requesting mux state 0, usb-role 0, orientation 0
[ 4086.033539] Start toggling
[ 4086.038496] state change SNK_UNATTACHED -> TOGGLING [rev2 NONE_AMS]
// This Hard Reset is unexpected
[ 4086.038499] Received hard reset
[ 4086.038501] state change TOGGLING -> HARD_RESET_START [rev2 HARD_RESET]
Fixes: f0690a25a140 ("staging: typec: USB Type-C Port Manager (tcpm)")
Cc: stable@vger.kernel.org
Signed-off-by: Kyle Tso <kyletso@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20240520154858.1072347-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/typec/tcpm/tcpm.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -5415,6 +5415,7 @@ static void _tcpm_pd_hard_reset(struct t
port->tcpc->set_bist_data(port->tcpc, false);
switch (port->state) {
+ case TOGGLING:
case ERROR_RECOVERY:
case PORT_RESET:
case PORT_RESET_WAIT_OFF:
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 084/267] mei: me: release irq in mei_me_pci_resume error path
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (82 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 083/267] usb: typec: tcpm: Ignore received Hard Reset in TOGGLING state Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 085/267] tty: n_tty: Fix buffer offsets when lookahead is used Greg Kroah-Hartman
` (194 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, stable, Tomas Winkler
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomas Winkler <tomas.winkler@intel.com>
commit 283cb234ef95d94c61f59e1cd070cd9499b51292 upstream.
The mei_me_pci_resume doesn't release irq on the error path,
in case mei_start() fails.
Cc: <stable@kernel.org>
Fixes: 33ec08263147 ("mei: revamp mei reset state machine")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20240604090728.1027307-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/misc/mei/pci-me.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -400,8 +400,10 @@ static int mei_me_pci_resume(struct devi
}
err = mei_restart(dev);
- if (err)
+ if (err) {
+ free_irq(pdev->irq, dev);
return err;
+ }
/* Start timer if stopped in suspend */
schedule_delayed_work(&dev->timer_work, HZ);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 085/267] tty: n_tty: Fix buffer offsets when lookahead is used
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (83 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 084/267] mei: me: release irq in mei_me_pci_resume error path Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 086/267] serial: port: Dont block system suspend even if bytes are left to xmit Greg Kroah-Hartman
` (193 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Vadym Krevs, Ilpo Järvinen
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
commit b19ab7ee2c4c1ec5f27c18413c3ab63907f7d55c upstream.
When lookahead has "consumed" some characters (la_count > 0),
n_tty_receive_buf_standard() and n_tty_receive_buf_closing() for
characters beyond the la_count are given wrong cp/fp offsets which
leads to duplicating and losing some characters.
If la_count > 0, correct buffer pointers and make count consistent too
(the latter is not strictly necessary to fix the issue but seems more
logical to adjust all variables immediately to keep state consistent).
Reported-by: Vadym Krevs <vkrevs@yahoo.com>
Fixes: 6bb6fa6908eb ("tty: Implement lookahead to process XON/XOFF timely")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218834
Tested-by: Vadym Krevs <vkrevs@yahoo.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240514140429.12087-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/tty/n_tty.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -1624,15 +1624,25 @@ static void __receive_buf(struct tty_str
else if (ldata->raw || (L_EXTPROC(tty) && !preops))
n_tty_receive_buf_raw(tty, cp, fp, count);
else if (tty->closing && !L_EXTPROC(tty)) {
- if (la_count > 0)
+ if (la_count > 0) {
n_tty_receive_buf_closing(tty, cp, fp, la_count, true);
- if (count > la_count)
- n_tty_receive_buf_closing(tty, cp, fp, count - la_count, false);
+ cp += la_count;
+ if (fp)
+ fp += la_count;
+ count -= la_count;
+ }
+ if (count > 0)
+ n_tty_receive_buf_closing(tty, cp, fp, count, false);
} else {
- if (la_count > 0)
+ if (la_count > 0) {
n_tty_receive_buf_standard(tty, cp, fp, la_count, true);
- if (count > la_count)
- n_tty_receive_buf_standard(tty, cp, fp, count - la_count, false);
+ cp += la_count;
+ if (fp)
+ fp += la_count;
+ count -= la_count;
+ }
+ if (count > 0)
+ n_tty_receive_buf_standard(tty, cp, fp, count, false);
flush_echoes(tty);
if (tty->ops->flush_chars)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 086/267] serial: port: Dont block system suspend even if bytes are left to xmit
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (84 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 085/267] tty: n_tty: Fix buffer offsets when lookahead is used Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 087/267] landlock: Fix d_parent walk Greg Kroah-Hartman
` (192 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Tony Lindgren, Douglas Anderson
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson <dianders@chromium.org>
commit ca84cd379b45e9b1775b9e026f069a3a886b409d upstream.
Recently, suspend testing on sc7180-trogdor based devices has started
to sometimes fail with messages like this:
port a88000.serial:0.0: PM: calling pm_runtime_force_suspend+0x0/0xf8 @ 28934, parent: a88000.serial:0
port a88000.serial:0.0: PM: dpm_run_callback(): pm_runtime_force_suspend+0x0/0xf8 returns -16
port a88000.serial:0.0: PM: pm_runtime_force_suspend+0x0/0xf8 returned -16 after 33 usecs
port a88000.serial:0.0: PM: failed to suspend: error -16
I could reproduce these problems by logging in via an agetty on the
debug serial port (which was _not_ used for kernel console) and
running:
cat /var/log/messages
...and then (via an SSH session) forcing a few suspend/resume cycles.
Tracing through the code and doing some printf()-based debugging shows
that the -16 (-EBUSY) comes from the recently added
serial_port_runtime_suspend().
The idea of the serial_port_runtime_suspend() function is to prevent
the port from being _runtime_ suspended if it still has bytes left to
transmit. Having bytes left to transmit isn't a reason to block
_system_ suspend, though. If a serdev device in the kernel needs to
block system suspend it should block its own suspend and it can use
serdev_device_wait_until_sent() to ensure bytes are sent.
The DEFINE_RUNTIME_DEV_PM_OPS() used by the serial_port code means
that the system suspend function will be pm_runtime_force_suspend().
In pm_runtime_force_suspend() we can see that before calling the
runtime suspend function we'll call pm_runtime_disable(). This should
be a reliable way to detect that we're called from system suspend and
that we shouldn't look for busyness.
Fixes: 43066e32227e ("serial: port: Don't suspend if the port is still busy")
Cc: stable@vger.kernel.org
Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20240531080914.v3.1.I2395e66cf70c6e67d774c56943825c289b9c13e4@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/tty/serial/serial_port.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/drivers/tty/serial/serial_port.c
+++ b/drivers/tty/serial/serial_port.c
@@ -60,6 +60,13 @@ static int serial_port_runtime_suspend(s
if (port->flags & UPF_DEAD)
return 0;
+ /*
+ * Nothing to do on pm_runtime_force_suspend(), see
+ * DEFINE_RUNTIME_DEV_PM_OPS.
+ */
+ if (!pm_runtime_enabled(dev))
+ return 0;
+
uart_port_lock_irqsave(port, &flags);
if (!port_dev->tx_enabled) {
uart_port_unlock_irqrestore(port, flags);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 087/267] landlock: Fix d_parent walk
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (85 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 086/267] serial: port: Dont block system suspend even if bytes are left to xmit Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:53 ` [PATCH 6.6 088/267] jfs: xattr: fix buffer overflow for invalid xattr Greg Kroah-Hartman
` (191 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Günther Noack, Paul Moore,
syzbot+bf4903dc7e12b18ebc87, Mickaël Salaün
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mickaël Salaün <mic@digikod.net>
commit 88da52ccd66e65f2e63a6c35c9dff55d448ef4dc upstream.
The WARN_ON_ONCE() in collect_domain_accesses() can be triggered when
trying to link a root mount point. This cannot work in practice because
this directory is mounted, but the VFS check is done after the call to
security_path_link().
Do not use source directory's d_parent when the source directory is the
mount point.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: stable@vger.kernel.org
Reported-by: syzbot+bf4903dc7e12b18ebc87@syzkaller.appspotmail.com
Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
Closes: https://lore.kernel.org/r/000000000000553d3f0618198200@google.com
Link: https://lore.kernel.org/r/20240516181935.1645983-2-mic@digikod.net
[mic: Fix commit message]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
security/landlock/fs.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -820,6 +820,7 @@ static int current_check_refer_path(stru
bool allow_parent1, allow_parent2;
access_mask_t access_request_parent1, access_request_parent2;
struct path mnt_dir;
+ struct dentry *old_parent;
layer_mask_t layer_masks_parent1[LANDLOCK_NUM_ACCESS_FS] = {},
layer_masks_parent2[LANDLOCK_NUM_ACCESS_FS] = {};
@@ -867,9 +868,17 @@ static int current_check_refer_path(stru
mnt_dir.mnt = new_dir->mnt;
mnt_dir.dentry = new_dir->mnt->mnt_root;
+ /*
+ * old_dentry may be the root of the common mount point and
+ * !IS_ROOT(old_dentry) at the same time (e.g. with open_tree() and
+ * OPEN_TREE_CLONE). We do not need to call dget(old_parent) because
+ * we keep a reference to old_dentry.
+ */
+ old_parent = (old_dentry == mnt_dir.dentry) ? old_dentry :
+ old_dentry->d_parent;
+
/* new_dir->dentry is equal to new_dentry->d_parent */
- allow_parent1 = collect_domain_accesses(dom, mnt_dir.dentry,
- old_dentry->d_parent,
+ allow_parent1 = collect_domain_accesses(dom, mnt_dir.dentry, old_parent,
&layer_masks_parent1);
allow_parent2 = collect_domain_accesses(
dom, mnt_dir.dentry, new_dir->dentry, &layer_masks_parent2);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 088/267] jfs: xattr: fix buffer overflow for invalid xattr
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (86 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 087/267] landlock: Fix d_parent walk Greg Kroah-Hartman
@ 2024-06-19 12:53 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 089/267] xhci: Set correct transferred length for cancelled bulk transfers Greg Kroah-Hartman
` (190 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:53 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+9dfe490c8176301c1d06,
Dave Kleikamp
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c55b78818cfb732680c4a72ab270cc2d2ee3d0f upstream.
When an xattr size is not what is expected, it is printed out to the
kernel log in hex format as a form of debugging. But when that xattr
size is bigger than the expected size, printing it out can cause an
access off the end of the buffer.
Fix this all up by properly restricting the size of the debug hex dump
in the kernel log.
Reported-by: syzbot+9dfe490c8176301c1d06@syzkaller.appspotmail.com
Cc: Dave Kleikamp <shaggy@kernel.org>
Link: https://lore.kernel.org/r/2024051433-slider-cloning-98f9@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/jfs/xattr.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/jfs/xattr.c
+++ b/fs/jfs/xattr.c
@@ -557,9 +557,11 @@ static int ea_get(struct inode *inode, s
size_check:
if (EALIST_SIZE(ea_buf->xattr) != ea_size) {
+ int size = min_t(int, EALIST_SIZE(ea_buf->xattr), ea_size);
+
printk(KERN_ERR "ea_get: invalid extended attribute\n");
print_hex_dump(KERN_ERR, "", DUMP_PREFIX_ADDRESS, 16, 1,
- ea_buf->xattr, ea_size, 1);
+ ea_buf->xattr, size, 1);
ea_release(inode, ea_buf);
rc = -EIO;
goto clean_up;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 089/267] xhci: Set correct transferred length for cancelled bulk transfers
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (87 preceding siblings ...)
2024-06-19 12:53 ` [PATCH 6.6 088/267] jfs: xattr: fix buffer overflow for invalid xattr Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 090/267] xhci: Apply reset resume quirk to Etron EJ188 xHCI host Greg Kroah-Hartman
` (189 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Pierre Tomon, Alan Stern,
Mathias Nyman
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mathias Nyman <mathias.nyman@linux.intel.com>
commit f0260589b439e2637ad54a2b25f00a516ef28a57 upstream.
The transferred length is set incorrectly for cancelled bulk
transfer TDs in case the bulk transfer ring stops on the last transfer
block with a 'Stop - Length Invalid' completion code.
length essentially ends up being set to the requested length:
urb->actual_length = urb->transfer_buffer_length
Length for 'Stop - Length Invalid' cases should be the sum of all
TRB transfer block lengths up to the one the ring stopped on,
_excluding_ the one stopped on.
Fix this by always summing up TRB lengths for 'Stop - Length Invalid'
bulk cases.
This issue was discovered by Alan Stern while debugging
https://bugzilla.kernel.org/show_bug.cgi?id=218890, but does not
solve that bug. Issue is older than 4.10 kernel but fix won't apply
to those due to major reworks in that area.
Tested-by: Pierre Tomon <pierretom+12@ik.me>
Cc: stable@vger.kernel.org # v4.10+
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/host/xhci-ring.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2525,9 +2525,8 @@ static int process_bulk_intr_td(struct x
goto finish_td;
case COMP_STOPPED_LENGTH_INVALID:
/* stopped on ep trb with invalid length, exclude it */
- ep_trb_len = 0;
- remaining = 0;
- break;
+ td->urb->actual_length = sum_trb_lengths(xhci, ep_ring, ep_trb);
+ goto finish_td;
case COMP_USB_TRANSACTION_ERROR:
if (xhci->quirks & XHCI_NO_SOFT_RETRY ||
(ep->err_count++ > MAX_SOFT_RETRY) ||
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 090/267] xhci: Apply reset resume quirk to Etron EJ188 xHCI host
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (88 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 089/267] xhci: Set correct transferred length for cancelled bulk transfers Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 091/267] xhci: Handle TD clearing for multiple streams case Greg Kroah-Hartman
` (188 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Kuangyi Chiang, Mathias Nyman
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuangyi Chiang <ki.chiang65@gmail.com>
commit 17bd54555c2aaecfdb38e2734149f684a73fa584 upstream.
As described in commit c877b3b2ad5c ("xhci: Add reset on resume quirk for
asrock p67 host"), EJ188 have the same issue as EJ168, where completely
dies on resume. So apply XHCI_RESET_ON_RESUME quirk to EJ188 as well.
Cc: stable@vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -36,6 +36,7 @@
#define PCI_VENDOR_ID_ETRON 0x1b6f
#define PCI_DEVICE_ID_EJ168 0x7023
+#define PCI_DEVICE_ID_EJ188 0x7052
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI 0x8c31
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_LP_XHCI 0x9c31
@@ -461,6 +462,10 @@ static void xhci_pci_quirks(struct devic
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
+ if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
+ pdev->device == PCI_DEVICE_ID_EJ188)
+ xhci->quirks |= XHCI_RESET_ON_RESUME;
+
if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
pdev->device == 0x0014) {
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 091/267] xhci: Handle TD clearing for multiple streams case
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (89 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 090/267] xhci: Apply reset resume quirk to Etron EJ188 xHCI host Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 092/267] xhci: Apply broken streams quirk to Etron EJ188 xHCI host Greg Kroah-Hartman
` (187 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, security, Neal Gompa, Hector Martin,
Mathias Nyman
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hector Martin <marcan@marcan.st>
commit 5ceac4402f5d975e5a01c806438eb4e554771577 upstream.
When multiple streams are in use, multiple TDs might be in flight when
an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
each, to ensure everything is reset properly and the caches cleared.
Change the logic so that any N>1 TDs found active for different streams
are deferred until after the first one is processed, calling
xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
queue another command until we are done with all of them. Also change
the error/"should never happen" paths to ensure we at least clear any
affected TDs, even if we can't issue a command to clear the hardware
cache, and complain loudly with an xhci_warn() if this ever happens.
This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct
assumptions about number of rings per endpoint.") early on in the XHCI
driver's life, when stream support was first added.
It was then identified but not fixed nor made into a warning in commit
674f8438c121 ("xhci: split handling halted endpoints into two steps"),
which added a FIXME comment for the problem case (without materially
changing the behavior as far as I can tell, though the new logic made
the problem more obvious).
Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some
cached cancelled URBs."), it was acknowledged again.
[Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached
cancelled URBs.") was a targeted regression fix to the previously mentioned
patch. Users reported issues with usb stuck after unmounting/disconnecting
UAS devices. This rolled back the TD clearing of multiple streams to its
original state.]
Apparently the commit author was aware of the problem (yet still chose
to submit it): It was still mentioned as a FIXME, an xhci_dbg() was
added to log the problem condition, and the remaining issue was mentioned
in the commit description. The choice of making the log type xhci_dbg()
for what is, at this point, a completely unhandled and known broken
condition is puzzling and unfortunate, as it guarantees that no actual
users would see the log in production, thereby making it nigh
undebuggable (indeed, even if you turn on DEBUG, the message doesn't
really hint at there being a problem at all).
It took me *months* of random xHC crashes to finally find a reliable
repro and be able to do a deep dive debug session, which could all have
been avoided had this unhandled, broken condition been actually reported
with a warning, as it should have been as a bug intentionally left in
unfixed (never mind that it shouldn't have been left in at all).
> Another fix to solve clearing the caches of all stream rings with
> cancelled TDs is needed, but not as urgent.
3 years after that statement and 14 years after the original bug was
introduced, I think it's finally time to fix it. And maybe next time
let's not leave bugs unfixed (that are actually worse than the original
bug), and let's actually get people to review kernel commits please.
Fixes xHC crashes and IOMMU faults with UAS devices when handling
errors/faults. Easiest repro is to use `hdparm` to mark an early sector
(e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
At least in the case of JMicron controllers, the read errors end up
having to cancel two TDs (for two queued requests to different streams)
and the one that didn't get cleared properly ends up faulting the xHC
entirely when it tries to access DMA pages that have since been unmapped,
referred to by the stale TDs. This normally happens quickly (after two
or three loops). After this fix, I left the `cat` in a loop running
overnight and experienced no xHC failures, with all read errors
recovered properly. Repro'd and tested on an Apple M1 Mac Mini
(dwc3 host).
On systems without an IOMMU, this bug would instead silently corrupt
freed memory, making this a security bug (even on systems with IOMMUs
this could silently corrupt memory belonging to other USB devices on the
same controller, so it's still a security bug). Given that the kernel
autoprobes partition tables, I'm pretty sure a malicious USB device
pretending to be a UAS device and reporting an error with the right
timing could deliberately trigger a UAF and write to freed memory, with
no user action.
[Mathias: Commit message and code comment edit, original at:]
https://lore.kernel.org/linux-usb/20240524-xhci-streams-v1-1-6b1f13819bea@marcan.st/
Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.")
Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps")
Cc: stable@vger.kernel.org
Cc: security@kernel.org
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/host/xhci-ring.c | 54 ++++++++++++++++++++++++++++++++++---------
drivers/usb/host/xhci.h | 1
2 files changed, 44 insertions(+), 11 deletions(-)
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -1027,13 +1027,27 @@ static int xhci_invalidate_cancelled_tds
break;
case TD_DIRTY: /* TD is cached, clear it */
case TD_HALTED:
+ case TD_CLEARING_CACHE_DEFERRED:
+ if (cached_td) {
+ if (cached_td->urb->stream_id != td->urb->stream_id) {
+ /* Multiple streams case, defer move dq */
+ xhci_dbg(xhci,
+ "Move dq deferred: stream %u URB %p\n",
+ td->urb->stream_id, td->urb);
+ td->cancel_status = TD_CLEARING_CACHE_DEFERRED;
+ break;
+ }
+
+ /* Should never happen, but clear the TD if it does */
+ xhci_warn(xhci,
+ "Found multiple active URBs %p and %p in stream %u?\n",
+ td->urb, cached_td->urb,
+ td->urb->stream_id);
+ td_to_noop(xhci, ring, cached_td, false);
+ cached_td->cancel_status = TD_CLEARED;
+ }
+
td->cancel_status = TD_CLEARING_CACHE;
- if (cached_td)
- /* FIXME stream case, several stopped rings */
- xhci_dbg(xhci,
- "Move dq past stream %u URB %p instead of stream %u URB %p\n",
- td->urb->stream_id, td->urb,
- cached_td->urb->stream_id, cached_td->urb);
cached_td = td;
break;
}
@@ -1053,10 +1067,16 @@ static int xhci_invalidate_cancelled_tds
if (err) {
/* Failed to move past cached td, just set cached TDs to no-op */
list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list, cancelled_td_list) {
- if (td->cancel_status != TD_CLEARING_CACHE)
+ /*
+ * Deferred TDs need to have the deq pointer set after the above command
+ * completes, so if that failed we just give up on all of them (and
+ * complain loudly since this could cause issues due to caching).
+ */
+ if (td->cancel_status != TD_CLEARING_CACHE &&
+ td->cancel_status != TD_CLEARING_CACHE_DEFERRED)
continue;
- xhci_dbg(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n",
- td->urb);
+ xhci_warn(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n",
+ td->urb);
td_to_noop(xhci, ring, td, false);
td->cancel_status = TD_CLEARED;
}
@@ -1334,6 +1354,7 @@ static void xhci_handle_cmd_set_deq(stru
struct xhci_ep_ctx *ep_ctx;
struct xhci_slot_ctx *slot_ctx;
struct xhci_td *td, *tmp_td;
+ bool deferred = false;
ep_index = TRB_TO_EP_INDEX(le32_to_cpu(trb->generic.field[3]));
stream_id = TRB_TO_STREAM_ID(le32_to_cpu(trb->generic.field[2]));
@@ -1420,6 +1441,8 @@ static void xhci_handle_cmd_set_deq(stru
xhci_dbg(ep->xhci, "%s: Giveback cancelled URB %p TD\n",
__func__, td->urb);
xhci_td_cleanup(ep->xhci, td, ep_ring, td->status);
+ } else if (td->cancel_status == TD_CLEARING_CACHE_DEFERRED) {
+ deferred = true;
} else {
xhci_dbg(ep->xhci, "%s: Keep cancelled URB %p TD as cancel_status is %d\n",
__func__, td->urb, td->cancel_status);
@@ -1429,8 +1452,17 @@ cleanup:
ep->ep_state &= ~SET_DEQ_PENDING;
ep->queued_deq_seg = NULL;
ep->queued_deq_ptr = NULL;
- /* Restart any rings with pending URBs */
- ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+
+ if (deferred) {
+ /* We have more streams to clear */
+ xhci_dbg(ep->xhci, "%s: Pending TDs to clear, continuing with invalidation\n",
+ __func__);
+ xhci_invalidate_cancelled_tds(ep);
+ } else {
+ /* Restart any rings with pending URBs */
+ xhci_dbg(ep->xhci, "%s: All TDs cleared, ring doorbell\n", __func__);
+ ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+ }
}
static void xhci_handle_cmd_reset_ep(struct xhci_hcd *xhci, int slot_id,
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1559,6 +1559,7 @@ enum xhci_cancelled_td_status {
TD_DIRTY = 0,
TD_HALTED,
TD_CLEARING_CACHE,
+ TD_CLEARING_CACHE_DEFERRED,
TD_CLEARED,
};
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 092/267] xhci: Apply broken streams quirk to Etron EJ188 xHCI host
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (90 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 091/267] xhci: Handle TD clearing for multiple streams case Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 093/267] thunderbolt: debugfs: Fix margin debugfs node creation condition Greg Kroah-Hartman
` (186 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Kuangyi Chiang, Mathias Nyman
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuangyi Chiang <ki.chiang65@gmail.com>
commit 91f7a1524a92c70ffe264db8bdfa075f15bbbeb9 upstream.
As described in commit 8f873c1ff4ca ("xhci: Blacklist using streams on the
Etron EJ168 controller"), EJ188 have the same issue as EJ168, where Streams
do not work reliable on EJ188. So apply XHCI_BROKEN_STREAMS quirk to EJ188
as well.
Cc: stable@vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -463,8 +463,10 @@ static void xhci_pci_quirks(struct devic
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
- pdev->device == PCI_DEVICE_ID_EJ188)
+ pdev->device == PCI_DEVICE_ID_EJ188) {
xhci->quirks |= XHCI_RESET_ON_RESUME;
+ xhci->quirks |= XHCI_BROKEN_STREAMS;
+ }
if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
pdev->device == 0x0014) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 093/267] thunderbolt: debugfs: Fix margin debugfs node creation condition
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (91 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 092/267] xhci: Apply broken streams quirk to Etron EJ188 xHCI host Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 094/267] scsi: core: Disable CDL by default Greg Kroah-Hartman
` (185 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Aapo Vienamo, Mika Westerberg
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aapo Vienamo <aapo.vienamo@linux.intel.com>
commit 985cfe501b74f214905ab4817acee0df24627268 upstream.
The margin debugfs node controls the "Enable Margin Test" field of the
lane margining operations. This field selects between either low or high
voltage margin values for voltage margin test or left or right timing
margin values for timing margin test.
According to the USB4 specification, whether or not the "Enable Margin
Test" control applies, depends on the values of the "Independent
High/Low Voltage Margin" or "Independent Left/Right Timing Margin"
capability fields for voltage and timing margin tests respectively. The
pre-existing condition enabled the debugfs node also in the case where
both low/high or left/right margins are returned, which is incorrect.
This change only enables the debugfs node in question, if the specific
required capability values are met.
Signed-off-by: Aapo Vienamo <aapo.vienamo@linux.intel.com>
Fixes: d0f1e0c2a699 ("thunderbolt: Add support for receiver lane margining")
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/thunderbolt/debugfs.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/thunderbolt/debugfs.c
+++ b/drivers/thunderbolt/debugfs.c
@@ -943,8 +943,9 @@ static void margining_port_init(struct t
debugfs_create_file("run", 0600, dir, port, &margining_run_fops);
debugfs_create_file("results", 0600, dir, port, &margining_results_fops);
debugfs_create_file("test", 0600, dir, port, &margining_test_fops);
- if (independent_voltage_margins(usb4) ||
- (supports_time(usb4) && independent_time_margins(usb4)))
+ if (independent_voltage_margins(usb4) == USB4_MARGIN_CAP_0_VOLTAGE_HL ||
+ (supports_time(usb4) &&
+ independent_time_margins(usb4) == USB4_MARGIN_CAP_1_TIME_LR))
debugfs_create_file("margin", 0600, dir, port, &margining_margin_fops);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 094/267] scsi: core: Disable CDL by default
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (92 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 093/267] thunderbolt: debugfs: Fix margin debugfs node creation condition Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 095/267] scsi: mpi3mr: Fix ATA NCQ priority support Greg Kroah-Hartman
` (184 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Scott McCoy, Damien Le Moal,
Niklas Cassel, Igor Pylypiv, Hannes Reinecke, Christoph Hellwig,
Martin K. Petersen
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal <dlemoal@kernel.org>
commit 52912ca87e2b810e5acdcdc452593d30c9187d8f upstream.
For SCSI devices supporting the Command Duration Limits feature set, the
user can enable/disable this feature use through the sysfs device attribute
"cdl_enable". This attribute modification triggers a call to
scsi_cdl_enable() to enable and disable the feature for ATA devices and set
the scsi device cdl_enable field to the user provided bool value. For SCSI
devices supporting CDL, the feature set is always enabled and
scsi_cdl_enable() is reduced to setting the cdl_enable field.
However, for ATA devices, a drive may spin-up with the CDL feature enabled
by default. But the SCSI device cdl_enable field is always initialized to
false (CDL disabled), regardless of the actual device CDL feature
state. For ATA devices managed by libata (or libsas), libata-core always
disables the CDL feature set when the device is attached, thus syncing the
state of the CDL feature on the device and of the SCSI device cdl_enable
field. However, for ATA devices connected to a SAS HBA, the CDL feature is
not disabled on scan for ATA devices that have this feature enabled by
default, leading to an inconsistent state of the feature on the device with
the SCSI device cdl_enable field.
Avoid this inconsistency by adding a call to scsi_cdl_enable() in
scsi_cdl_check() to make sure that the device-side state of the CDL feature
set always matches the scsi device cdl_enable field state. This implies
that CDL will always be disabled for ATA devices connected to SAS HBAs,
which is consistent with libata/libsas initialization of the device.
Reported-by: Scott McCoy <scott.mccoy@wdc.com>
Fixes: 1b22cfb14142 ("scsi: core: Allow enabling and disabling command duration limits")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240607012507.111488-1-dlemoal@kernel.org
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Igor Pylypiv <ipylypiv@google.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/scsi/scsi.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -671,6 +671,13 @@ void scsi_cdl_check(struct scsi_device *
sdev->use_10_for_rw = 0;
sdev->cdl_supported = 1;
+
+ /*
+ * If the device supports CDL, make sure that the current drive
+ * feature status is consistent with the user controlled
+ * cdl_enable state.
+ */
+ scsi_cdl_enable(sdev, sdev->cdl_enable);
} else {
sdev->cdl_supported = 0;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 095/267] scsi: mpi3mr: Fix ATA NCQ priority support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (93 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 094/267] scsi: core: Disable CDL by default Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 096/267] scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory Greg Kroah-Hartman
` (183 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Scott McCoy, Damien Le Moal,
Niklas Cassel, Martin K. Petersen
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal <dlemoal@kernel.org>
commit 90e6f08915ec6efe46570420412a65050ec826b2 upstream.
The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to
the HBA if a read or write command directed at an ATA device should be
translated to an NCQ read/write command with the high prioiryt bit set
when the request uses the RT priority class and the user has enabled NCQ
priority through sysfs.
However, unlike the mpt3sas driver, the mpi3mr driver does not define
the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so
the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never
actually set and NCQ Priority cannot ever be used.
Fix this by defining these missing atributes to allow a user to check if
an ATA device supports NCQ priority and to enable/disable the use of NCQ
priority. To do this, lift the function scsih_ncq_prio_supp() out of the
mpt3sas driver and make it the generic SCSI SAS transport function
sas_ata_ncq_prio_supported(). Nothing in that function is hardware
specific, so this function can be used in both the mpt3sas driver and
the mpi3mr driver.
Reported-by: Scott McCoy <scott.mccoy@wdc.com>
Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240611083435.92961-1-dlemoal@kernel.org
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/scsi/mpi3mr/mpi3mr_app.c | 62 +++++++++++++++++++++++++++++++++++
drivers/scsi/mpt3sas/mpt3sas_base.h | 3 -
drivers/scsi/mpt3sas/mpt3sas_ctl.c | 4 +-
drivers/scsi/mpt3sas/mpt3sas_scsih.c | 23 ------------
drivers/scsi/scsi_transport_sas.c | 23 ++++++++++++
include/scsi/scsi_transport_sas.h | 2 +
6 files changed, 89 insertions(+), 28 deletions(-)
--- a/drivers/scsi/mpi3mr/mpi3mr_app.c
+++ b/drivers/scsi/mpi3mr/mpi3mr_app.c
@@ -1854,10 +1854,72 @@ persistent_id_show(struct device *dev, s
}
static DEVICE_ATTR_RO(persistent_id);
+/**
+ * sas_ncq_prio_supported_show - Indicate if device supports NCQ priority
+ * @dev: pointer to embedded device
+ * @attr: sas_ncq_prio_supported attribute descriptor
+ * @buf: the buffer returned
+ *
+ * A sysfs 'read-only' sdev attribute, only works with SATA devices
+ */
+static ssize_t
+sas_ncq_prio_supported_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev = to_scsi_device(dev);
+
+ return sysfs_emit(buf, "%d\n", sas_ata_ncq_prio_supported(sdev));
+}
+static DEVICE_ATTR_RO(sas_ncq_prio_supported);
+
+/**
+ * sas_ncq_prio_enable_show - send prioritized io commands to device
+ * @dev: pointer to embedded device
+ * @attr: sas_ncq_prio_enable attribute descriptor
+ * @buf: the buffer returned
+ *
+ * A sysfs 'read/write' sdev attribute, only works with SATA devices
+ */
+static ssize_t
+sas_ncq_prio_enable_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct scsi_device *sdev = to_scsi_device(dev);
+ struct mpi3mr_sdev_priv_data *sdev_priv_data = sdev->hostdata;
+
+ if (!sdev_priv_data)
+ return 0;
+
+ return sysfs_emit(buf, "%d\n", sdev_priv_data->ncq_prio_enable);
+}
+
+static ssize_t
+sas_ncq_prio_enable_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct scsi_device *sdev = to_scsi_device(dev);
+ struct mpi3mr_sdev_priv_data *sdev_priv_data = sdev->hostdata;
+ bool ncq_prio_enable = 0;
+
+ if (kstrtobool(buf, &ncq_prio_enable))
+ return -EINVAL;
+
+ if (!sas_ata_ncq_prio_supported(sdev))
+ return -EINVAL;
+
+ sdev_priv_data->ncq_prio_enable = ncq_prio_enable;
+
+ return strlen(buf);
+}
+static DEVICE_ATTR_RW(sas_ncq_prio_enable);
+
static struct attribute *mpi3mr_dev_attrs[] = {
&dev_attr_sas_address.attr,
&dev_attr_device_handle.attr,
&dev_attr_persistent_id.attr,
+ &dev_attr_sas_ncq_prio_supported.attr,
+ &dev_attr_sas_ncq_prio_enable.attr,
NULL,
};
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -2045,9 +2045,6 @@ void
mpt3sas_setup_direct_io(struct MPT3SAS_ADAPTER *ioc, struct scsi_cmnd *scmd,
struct _raid_device *raid_device, Mpi25SCSIIORequest_t *mpi_request);
-/* NCQ Prio Handling Check */
-bool scsih_ncq_prio_supp(struct scsi_device *sdev);
-
void mpt3sas_setup_debugfs(struct MPT3SAS_ADAPTER *ioc);
void mpt3sas_destroy_debugfs(struct MPT3SAS_ADAPTER *ioc);
void mpt3sas_init_debugfs(void);
--- a/drivers/scsi/mpt3sas/mpt3sas_ctl.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_ctl.c
@@ -4034,7 +4034,7 @@ sas_ncq_prio_supported_show(struct devic
{
struct scsi_device *sdev = to_scsi_device(dev);
- return sysfs_emit(buf, "%d\n", scsih_ncq_prio_supp(sdev));
+ return sysfs_emit(buf, "%d\n", sas_ata_ncq_prio_supported(sdev));
}
static DEVICE_ATTR_RO(sas_ncq_prio_supported);
@@ -4069,7 +4069,7 @@ sas_ncq_prio_enable_store(struct device
if (kstrtobool(buf, &ncq_prio_enable))
return -EINVAL;
- if (!scsih_ncq_prio_supp(sdev))
+ if (!sas_ata_ncq_prio_supported(sdev))
return -EINVAL;
sas_device_priv_data->ncq_prio_enable = ncq_prio_enable;
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -12590,29 +12590,6 @@ scsih_pci_mmio_enabled(struct pci_dev *p
return PCI_ERS_RESULT_RECOVERED;
}
-/**
- * scsih_ncq_prio_supp - Check for NCQ command priority support
- * @sdev: scsi device struct
- *
- * This is called when a user indicates they would like to enable
- * ncq command priorities. This works only on SATA devices.
- */
-bool scsih_ncq_prio_supp(struct scsi_device *sdev)
-{
- struct scsi_vpd *vpd;
- bool ncq_prio_supp = false;
-
- rcu_read_lock();
- vpd = rcu_dereference(sdev->vpd_pg89);
- if (!vpd || vpd->len < 214)
- goto out;
-
- ncq_prio_supp = (vpd->data[213] >> 4) & 1;
-out:
- rcu_read_unlock();
-
- return ncq_prio_supp;
-}
/*
* The pci device ids are defined in mpi/mpi2_cnfg.h.
*/
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -416,6 +416,29 @@ unsigned int sas_is_tlr_enabled(struct s
}
EXPORT_SYMBOL_GPL(sas_is_tlr_enabled);
+/**
+ * sas_ata_ncq_prio_supported - Check for ATA NCQ command priority support
+ * @sdev: SCSI device
+ *
+ * Check if an ATA device supports NCQ priority using VPD page 89h (ATA
+ * Information). Since this VPD page is implemented only for ATA devices,
+ * this function always returns false for SCSI devices.
+ */
+bool sas_ata_ncq_prio_supported(struct scsi_device *sdev)
+{
+ struct scsi_vpd *vpd;
+ bool ncq_prio_supported = false;
+
+ rcu_read_lock();
+ vpd = rcu_dereference(sdev->vpd_pg89);
+ if (vpd && vpd->len >= 214)
+ ncq_prio_supported = (vpd->data[213] >> 4) & 1;
+ rcu_read_unlock();
+
+ return ncq_prio_supported;
+}
+EXPORT_SYMBOL_GPL(sas_ata_ncq_prio_supported);
+
/*
* SAS Phy attributes
*/
--- a/include/scsi/scsi_transport_sas.h
+++ b/include/scsi/scsi_transport_sas.h
@@ -200,6 +200,8 @@ unsigned int sas_is_tlr_enabled(struct s
void sas_disable_tlr(struct scsi_device *);
void sas_enable_tlr(struct scsi_device *);
+bool sas_ata_ncq_prio_supported(struct scsi_device *sdev);
+
extern struct sas_rphy *sas_end_device_alloc(struct sas_port *);
extern struct sas_rphy *sas_expander_alloc(struct sas_port *, enum sas_device_type);
void sas_rphy_free(struct sas_rphy *);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 096/267] scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (94 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 095/267] scsi: mpi3mr: Fix ATA NCQ priority support Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 097/267] scsi: sd: Use READ(16) when reading block zero on large capacity disks Greg Kroah-Hartman
` (182 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Keith Busch, Breno Leitao,
Martin K. Petersen
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Breno Leitao <leitao@debian.org>
commit 4254dfeda82f20844299dca6c38cbffcfd499f41 upstream.
There is a potential out-of-bounds access when using test_bit() on a single
word. The test_bit() and set_bit() functions operate on long values, and
when testing or setting a single word, they can exceed the word
boundary. KASAN detects this issue and produces a dump:
BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965
For full log, please look at [1].
Make the allocation at least the size of sizeof(unsigned long) so that
set_bit() and test_bit() have sufficient room for read/write operations
without overwriting unallocated memory.
[1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/
Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path")
Cc: stable@vger.kernel.org
Suggested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240605085530.499432-1-leitao@debian.org
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/scsi/mpt3sas/mpt3sas_base.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -8486,6 +8486,12 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPT
ioc->pd_handles_sz = (ioc->facts.MaxDevHandle / 8);
if (ioc->facts.MaxDevHandle % 8)
ioc->pd_handles_sz++;
+ /*
+ * pd_handles_sz should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory touch may occur.
+ */
+ ioc->pd_handles_sz = ALIGN(ioc->pd_handles_sz, sizeof(unsigned long));
+
ioc->pd_handles = kzalloc(ioc->pd_handles_sz,
GFP_KERNEL);
if (!ioc->pd_handles) {
@@ -8503,6 +8509,13 @@ mpt3sas_base_attach(struct MPT3SAS_ADAPT
ioc->pend_os_device_add_sz = (ioc->facts.MaxDevHandle / 8);
if (ioc->facts.MaxDevHandle % 8)
ioc->pend_os_device_add_sz++;
+
+ /*
+ * pend_os_device_add_sz should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory may occur.
+ */
+ ioc->pend_os_device_add_sz = ALIGN(ioc->pend_os_device_add_sz,
+ sizeof(unsigned long));
ioc->pend_os_device_add = kzalloc(ioc->pend_os_device_add_sz,
GFP_KERNEL);
if (!ioc->pend_os_device_add) {
@@ -8794,6 +8807,12 @@ _base_check_ioc_facts_changes(struct MPT
if (ioc->facts.MaxDevHandle % 8)
pd_handles_sz++;
+ /*
+ * pd_handles should have, at least, the minimal room for
+ * set_bit()/test_bit(), otherwise out-of-memory touch may
+ * occur.
+ */
+ pd_handles_sz = ALIGN(pd_handles_sz, sizeof(unsigned long));
pd_handles = krealloc(ioc->pd_handles, pd_handles_sz,
GFP_KERNEL);
if (!pd_handles) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 097/267] scsi: sd: Use READ(16) when reading block zero on large capacity disks
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (95 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 096/267] scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 098/267] gve: Clear napi->skb before dev_kfree_skb_any() Greg Kroah-Hartman
` (181 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Pierre Tomon, Alan Stern,
Bart Van Assche, Martin K. Petersen
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin K. Petersen <martin.petersen@oracle.com>
commit 7926d51f73e0434a6250c2fd1a0555f98d9a62da upstream.
Commit 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior
to querying device properties") triggered a read to LBA 0 before
attempting to inquire about device characteristics. This was done
because some protocol bridge devices will return generic values until
an attached storage device's media has been accessed.
Pierre Tomon reported that this change caused problems on a large
capacity external drive connected via a bridge device. The bridge in
question does not appear to implement the READ(10) command.
Issue a READ(16) instead of READ(10) when a device has been identified
as preferring 16-byte commands (use_16_for_rw heuristic).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218890
Link: https://lore.kernel.org/r/70dd7ae0-b6b1-48e1-bb59-53b7c7f18274@rowland.harvard.edu
Link: https://lore.kernel.org/r/20240605022521.3960956-1-martin.petersen@oracle.com
Fixes: 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties")
Cc: stable@vger.kernel.org
Reported-by: Pierre Tomon <pierretom+12@ik.me>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Pierre Tomon <pierretom+12@ik.me>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/scsi/sd.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3406,16 +3406,23 @@ static bool sd_validate_opt_xfer_size(st
static void sd_read_block_zero(struct scsi_disk *sdkp)
{
- unsigned int buf_len = sdkp->device->sector_size;
- char *buffer, cmd[10] = { };
+ struct scsi_device *sdev = sdkp->device;
+ unsigned int buf_len = sdev->sector_size;
+ u8 *buffer, cmd[16] = { };
buffer = kmalloc(buf_len, GFP_KERNEL);
if (!buffer)
return;
- cmd[0] = READ_10;
- put_unaligned_be32(0, &cmd[2]); /* Logical block address 0 */
- put_unaligned_be16(1, &cmd[7]); /* Transfer 1 logical block */
+ if (sdev->use_16_for_rw) {
+ cmd[0] = READ_16;
+ put_unaligned_be64(0, &cmd[2]); /* Logical block address 0 */
+ put_unaligned_be32(1, &cmd[10]);/* Transfer 1 logical block */
+ } else {
+ cmd[0] = READ_10;
+ put_unaligned_be32(0, &cmd[2]); /* Logical block address 0 */
+ put_unaligned_be16(1, &cmd[7]); /* Transfer 1 logical block */
+ }
scsi_execute_cmd(sdkp->device, cmd, REQ_OP_DRV_IN, buffer, buf_len,
SD_TIMEOUT, sdkp->max_retries, NULL);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 098/267] gve: Clear napi->skb before dev_kfree_skb_any()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (96 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 097/267] scsi: sd: Use READ(16) when reading block zero on large capacity disks Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 099/267] powerpc/uaccess: Fix build errors seen with GCC 13/14 Greg Kroah-Hartman
` (180 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Shailend Chand, Ziwei Xiao,
Harshitha Ramamurthy, Praveen Kaligineedi, Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ziwei Xiao <ziweixiao@google.com>
commit 6f4d93b78ade0a4c2cafd587f7b429ce95abb02e upstream.
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
is freed with dev_kfree_skb_any(). This can result in a subsequent call
to napi_get_frags returning a dangling pointer.
Fix this by clearing napi->skb before the skb is freed.
Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path")
Cc: stable@vger.kernel.org
Reported-by: Shailend Chand <shailend@google.com>
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Shailend Chand <shailend@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/google/gve/gve_rx_dqo.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -506,11 +506,13 @@ static void gve_rx_skb_hash(struct sk_bu
skb_set_hash(skb, le32_to_cpu(compl_desc->hash), hash_type);
}
-static void gve_rx_free_skb(struct gve_rx_ring *rx)
+static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx)
{
if (!rx->ctx.skb_head)
return;
+ if (rx->ctx.skb_head == napi->skb)
+ napi->skb = NULL;
dev_kfree_skb_any(rx->ctx.skb_head);
rx->ctx.skb_head = NULL;
rx->ctx.skb_tail = NULL;
@@ -783,7 +785,7 @@ int gve_rx_poll_dqo(struct gve_notify_bl
err = gve_rx_dqo(napi, rx, compl_desc, rx->q_num);
if (err < 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
if (err == -ENOMEM)
rx->rx_skb_alloc_fail++;
@@ -826,7 +828,7 @@ int gve_rx_poll_dqo(struct gve_notify_bl
/* gve_rx_complete_skb() will consume skb if successful */
if (gve_rx_complete_skb(rx, napi, compl_desc, feat) != 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
rx->rx_desc_err_dropped_pkt++;
u64_stats_update_end(&rx->statss);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 099/267] powerpc/uaccess: Fix build errors seen with GCC 13/14
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (97 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 098/267] gve: Clear napi->skb before dev_kfree_skb_any() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 100/267] HID: nvidia-shield: Add missing check for input_ff_create_memless Greg Kroah-Hartman
` (179 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Kewen Lin, Michael Ellerman
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman <mpe@ellerman.id.au>
commit 2d43cc701b96f910f50915ac4c2a0cae5deb734c upstream.
Building ppc64le_defconfig with GCC 14 fails with assembler errors:
CC fs/readdir.o
/tmp/ccdQn0mD.s: Assembler messages:
/tmp/ccdQn0mD.s:212: Error: operand out of domain (18 is not a multiple of 4)
/tmp/ccdQn0mD.s:226: Error: operand out of domain (18 is not a multiple of 4)
... [6 lines]
/tmp/ccdQn0mD.s:1699: Error: operand out of domain (18 is not a multiple of 4)
A snippet of the asm shows:
# ../fs/readdir.c:210: unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
ld 9,0(29) # MEM[(u64 *)name_38(D) + _88 * 1], MEM[(u64 *)name_38(D) + _88 * 1]
# 210 "../fs/readdir.c" 1
1: std 9,18(8) # put_user # *__pus_addr_52, MEM[(u64 *)name_38(D) + _88 * 1]
The 'std' instruction requires a 4-byte aligned displacement because
it is a DS-form instruction, and as the assembler says, 18 is not a
multiple of 4.
A similar error is seen with GCC 13 and CONFIG_UBSAN_SIGNED_WRAP=y.
The fix is to change the constraint on the memory operand to put_user(),
from "m" which is a general memory reference to "YZ".
The "Z" constraint is documented in the GCC manual PowerPC machine
constraints, and specifies a "memory operand accessed with indexed or
indirect addressing". "Y" is not documented in the manual but specifies
a "memory operand for a DS-form instruction". Using both allows the
compiler to generate a DS-form "std" or X-form "stdx" as appropriate.
The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because
the "Y" constraint does not guarantee 4-byte alignment when prefixed
instructions are enabled.
Unfortunately clang doesn't support the "Y" constraint so that has to be
behind an ifdef.
Although the build error is only seen with GCC 13/14, that appears
to just be luck. The constraint has been incorrect since it was first
added.
Fixes: c20beffeec3c ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()")
Cc: stable@vger.kernel.org # v5.10+
Suggested-by: Kewen Lin <linkw@gcc.gnu.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240529123029.146953-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/powerpc/include/asm/uaccess.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -92,9 +92,25 @@ __pu_failed: \
: label)
#endif
+#ifdef CONFIG_CC_IS_CLANG
+#define DS_FORM_CONSTRAINT "Z<>"
+#else
+#define DS_FORM_CONSTRAINT "YZ<>"
+#endif
+
#ifdef __powerpc64__
+#ifdef CONFIG_PPC_KERNEL_PREFIXED
#define __put_user_asm2_goto(x, ptr, label) \
__put_user_asm_goto(x, ptr, label, "std")
+#else
+#define __put_user_asm2_goto(x, addr, label) \
+ asm goto ("1: std%U1%X1 %0,%1 # put_user\n" \
+ EX_TABLE(1b, %l2) \
+ : \
+ : "r" (x), DS_FORM_CONSTRAINT (*addr) \
+ : \
+ : label)
+#endif // CONFIG_PPC_KERNEL_PREFIXED
#else /* __powerpc64__ */
#define __put_user_asm2_goto(x, addr, label) \
asm goto( \
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 100/267] HID: nvidia-shield: Add missing check for input_ff_create_memless
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (98 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 099/267] powerpc/uaccess: Fix build errors seen with GCC 13/14 Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 101/267] cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c Greg Kroah-Hartman
` (178 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Chen Ni, Rahul Rameshbabu,
Jiri Kosina, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Ni <nichen@iscas.ac.cn>
[ Upstream commit 0a3f9f7fc59feb8a91a2793b8b60977895c72365 ]
Add check for the return value of input_ff_create_memless() and return
the error if it fails in order to catch the error.
Fixes: 09308562d4af ("HID: nvidia-shield: Initial driver implementation with Thunderstrike support")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hid/hid-nvidia-shield.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-nvidia-shield.c b/drivers/hid/hid-nvidia-shield.c
index edd0b0f1193bd..97dfa3694ff04 100644
--- a/drivers/hid/hid-nvidia-shield.c
+++ b/drivers/hid/hid-nvidia-shield.c
@@ -283,7 +283,9 @@ static struct input_dev *shield_haptics_create(
return haptics;
input_set_capability(haptics, EV_FF, FF_RUMBLE);
- input_ff_create_memless(haptics, NULL, play_effect);
+ ret = input_ff_create_memless(haptics, NULL, play_effect);
+ if (ret)
+ goto err;
ret = input_register_device(haptics);
if (ret)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 101/267] cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (99 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 100/267] HID: nvidia-shield: Add missing check for input_ff_create_memless Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 102/267] cxl/region: Fix memregion leaks in devm_cxl_add_region() Greg Kroah-Hartman
` (177 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Dan Williams, Alison Schofield,
Dave Jiang, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Jiang <dave.jiang@intel.com>
[ Upstream commit d55510527153d17a3af8cc2df69c04f95ae1350d ]
tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not
include linux/vmalloc.h. Kernel v6.10 made changes that causes the
currently included headers not depend on vmalloc.h and therefore
mem.c can no longer compile. Add linux/vmalloc.h to fix compile
issue.
CC [M] tools/testing/cxl/test/mem.o
tools/testing/cxl/test/mem.c: In function ‘label_area_release’:
tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration]
1428 | vfree(lsa);
| ^~~~~
| kvfree
tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’:
tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration]
1466 | mdata->lsa = vmalloc(LSA_SIZE);
| ^~~~~~~
| kmalloc
Fixes: 7d3eb23c4ccf ("tools/testing/cxl: Introduce a mock memory device + driver")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
tools/testing/cxl/test/mem.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 68118c37f0b56..0ed100617d993 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -3,6 +3,7 @@
#include <linux/platform_device.h>
#include <linux/mod_devicetable.h>
+#include <linux/vmalloc.h>
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/sizes.h>
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 102/267] cxl/region: Fix memregion leaks in devm_cxl_add_region()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (100 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 101/267] cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 103/267] cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd Greg Kroah-Hartman
` (176 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Li Zhijian, Dan Williams, Dave Jiang,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Zhijian <lizhijian@fujitsu.com>
[ Upstream commit 49ba7b515c4c0719b866d16f068e62d16a8a3dd1 ]
Move the mode verification to __create_region() before allocating the
memregion to avoid the memregion leaks.
Fixes: 6e099264185d ("cxl/region: Add volatile region creation support")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240507053421.456439-1-lizhijian@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/cxl/core/region.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index c65ab42546238..7a646fed17211 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2186,15 +2186,6 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
struct device *dev;
int rc;
- switch (mode) {
- case CXL_DECODER_RAM:
- case CXL_DECODER_PMEM:
- break;
- default:
- dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode);
- return ERR_PTR(-EINVAL);
- }
-
cxlr = cxl_region_alloc(cxlrd, id);
if (IS_ERR(cxlr))
return cxlr;
@@ -2245,6 +2236,15 @@ static struct cxl_region *__create_region(struct cxl_root_decoder *cxlrd,
{
int rc;
+ switch (mode) {
+ case CXL_DECODER_RAM:
+ case CXL_DECODER_PMEM:
+ break;
+ default:
+ dev_err(&cxlrd->cxlsd.cxld.dev, "unsupported mode %d\n", mode);
+ return ERR_PTR(-EINVAL);
+ }
+
rc = memregion_alloc(GFP_KERNEL);
if (rc < 0)
return ERR_PTR(rc);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 103/267] cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (101 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 102/267] cxl/region: Fix memregion leaks in devm_cxl_add_region() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 104/267] cachefiles: remove requests from xarray during flushing requests Greg Kroah-Hartman
` (175 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton, Jingbo Xu,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit cc5ac966f26193ab185cc43d64d9f1ae998ccb6e ]
This lets us see the correct trace output.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-2-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/trace/events/cachefiles.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index cf4b98b9a9edc..e3213af847cdf 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -127,7 +127,9 @@ enum cachefiles_error_trace {
EM(cachefiles_obj_see_lookup_cookie, "SEE lookup_cookie") \
EM(cachefiles_obj_see_lookup_failed, "SEE lookup_failed") \
EM(cachefiles_obj_see_withdraw_cookie, "SEE withdraw_cookie") \
- E_(cachefiles_obj_see_withdrawal, "SEE withdrawal")
+ EM(cachefiles_obj_see_withdrawal, "SEE withdrawal") \
+ EM(cachefiles_obj_get_ondemand_fd, "GET ondemand_fd") \
+ E_(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd")
#define cachefiles_coherency_traces \
EM(cachefiles_coherency_check_aux, "BAD aux ") \
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 104/267] cachefiles: remove requests from xarray during flushing requests
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (102 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 103/267] cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 105/267] cachefiles: introduce object ondemand state Greg Kroah-Hartman
` (174 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton, Jia Zhu,
Gao Xiang, Jingbo Xu, Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 0fc75c5940fa634d84e64c93bfc388e1274ed013 ]
Even with CACHEFILES_DEAD set, we can still read the requests, so in the
following concurrency the request may be used after it has been freed:
mount | daemon_thread1 | daemon_thread2
------------------------------------------------------------
cachefiles_ondemand_init_object
cachefiles_ondemand_send_req
REQ_A = kzalloc(sizeof(*req) + data_len)
wait_for_completion(&REQ_A->done)
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
// close dev fd
cachefiles_flush_reqs
complete(&REQ_A->done)
kfree(REQ_A)
xa_lock(&cache->reqs);
cachefiles_ondemand_select_req
req->msg.opcode != CACHEFILES_OP_READ
// req use-after-free !!!
xa_unlock(&cache->reqs);
xa_destroy(&cache->reqs)
Hence remove requests from cache->reqs when flushing them to avoid
accessing freed requests.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-3-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/daemon.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 5f4df9588620f..7d1f456e376dd 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -158,6 +158,7 @@ static void cachefiles_flush_reqs(struct cachefiles_cache *cache)
xa_for_each(xa, index, req) {
req->error = -EIO;
complete(&req->done);
+ __xa_erase(xa, index);
}
xa_unlock(xa);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 105/267] cachefiles: introduce object ondemand state
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (103 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 104/267] cachefiles: remove requests from xarray during flushing requests Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 106/267] cachefiles: extract ondemand info field from cachefiles_object Greg Kroah-Hartman
` (173 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jia Zhu, Jingbo Xu, David Howells,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu <zhujia.zj@bytedance.com>
[ Upstream commit 357a18d033143617e9c7d420c8f0dd4cbab5f34d ]
Previously, @ondemand_id field was used not only to identify ondemand
state of the object, but also to represent the index of the xarray.
This commit introduces @state field to decouple the role of @ondemand_id
and adds helpers to access it.
Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com>
Link: https://lore.kernel.org/r/20231120041422.75170-2-zhujia.zj@bytedance.com
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/internal.h | 21 +++++++++++++++++++++
fs/cachefiles/ondemand.c | 21 +++++++++------------
2 files changed, 30 insertions(+), 12 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 2ad58c4652084..00beedeaec183 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -44,6 +44,11 @@ struct cachefiles_volume {
struct dentry *fanout[256]; /* Fanout subdirs */
};
+enum cachefiles_object_state {
+ CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */
+ CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */
+};
+
/*
* Backing file state.
*/
@@ -62,6 +67,7 @@ struct cachefiles_object {
#define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */
#ifdef CONFIG_CACHEFILES_ONDEMAND
int ondemand_id;
+ enum cachefiles_object_state state;
#endif
};
@@ -296,6 +302,21 @@ extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object);
extern int cachefiles_ondemand_read(struct cachefiles_object *object,
loff_t pos, size_t len);
+#define CACHEFILES_OBJECT_STATE_FUNCS(_state, _STATE) \
+static inline bool \
+cachefiles_ondemand_object_is_##_state(const struct cachefiles_object *object) \
+{ \
+ return object->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
+} \
+ \
+static inline void \
+cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
+{ \
+ object->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
+}
+
+CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN);
+CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE);
#else
static inline ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
char __user *_buffer, size_t buflen)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 0254ed39f68ce..90456b8a4b3e0 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -15,6 +15,7 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
xa_lock(&cache->reqs);
object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
+ cachefiles_ondemand_set_object_close(object);
/*
* Flush all pending READ requests since their completion depends on
@@ -176,6 +177,8 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
set_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
trace_cachefiles_ondemand_copen(req->object, id, size);
+ cachefiles_ondemand_set_object_open(req->object);
+
out:
complete(&req->done);
return ret;
@@ -363,7 +366,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
/* coupled with the barrier in cachefiles_flush_reqs() */
smp_mb();
- if (opcode != CACHEFILES_OP_OPEN && object->ondemand_id <= 0) {
+ if (opcode != CACHEFILES_OP_OPEN &&
+ !cachefiles_ondemand_object_is_open(object)) {
WARN_ON_ONCE(object->ondemand_id == 0);
xas_unlock(&xas);
ret = -EIO;
@@ -430,18 +434,11 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req,
void *private)
{
struct cachefiles_object *object = req->object;
- int object_id = object->ondemand_id;
- /*
- * It's possible that object id is still 0 if the cookie looking up
- * phase failed before OPEN request has ever been sent. Also avoid
- * sending CLOSE request for CACHEFILES_ONDEMAND_ID_CLOSED, which means
- * anon_fd has already been closed.
- */
- if (object_id <= 0)
+ if (!cachefiles_ondemand_object_is_open(object))
return -ENOENT;
- req->msg.object_id = object_id;
+ req->msg.object_id = object->ondemand_id;
trace_cachefiles_ondemand_close(object, &req->msg);
return 0;
}
@@ -460,7 +457,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req,
int object_id = object->ondemand_id;
/* Stop enqueuing requests when daemon has closed anon_fd. */
- if (object_id <= 0) {
+ if (!cachefiles_ondemand_object_is_open(object)) {
WARN_ON_ONCE(object_id == 0);
pr_info_once("READ: anonymous fd closed prematurely.\n");
return -EIO;
@@ -485,7 +482,7 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object)
* creating a new tmpfile as the cache file. Reuse the previously
* allocated object ID if any.
*/
- if (object->ondemand_id > 0)
+ if (cachefiles_ondemand_object_is_open(object))
return 0;
volume_key_size = volume->key[0] + 1;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 106/267] cachefiles: extract ondemand info field from cachefiles_object
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (104 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 105/267] cachefiles: introduce object ondemand state Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 107/267] cachefiles: resend an open request if the read requests object is closed Greg Kroah-Hartman
` (172 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jia Zhu, Jingbo Xu, David Howells,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu <zhujia.zj@bytedance.com>
[ Upstream commit 3c5ecfe16e7699011c12c2d44e55437415331fa3 ]
We'll introduce a @work_struct field for @object in subsequent patches,
it will enlarge the size of @object.
As the result of that, this commit extracts ondemand info field from
@object.
Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com>
Link: https://lore.kernel.org/r/20231120041422.75170-3-zhujia.zj@bytedance.com
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/interface.c | 7 ++++++-
fs/cachefiles/internal.h | 26 ++++++++++++++++++++++----
fs/cachefiles/ondemand.c | 34 ++++++++++++++++++++++++++++------
3 files changed, 56 insertions(+), 11 deletions(-)
diff --git a/fs/cachefiles/interface.c b/fs/cachefiles/interface.c
index 40052bdb33655..35ba2117a6f65 100644
--- a/fs/cachefiles/interface.c
+++ b/fs/cachefiles/interface.c
@@ -31,6 +31,11 @@ struct cachefiles_object *cachefiles_alloc_object(struct fscache_cookie *cookie)
if (!object)
return NULL;
+ if (cachefiles_ondemand_init_obj_info(object, volume)) {
+ kmem_cache_free(cachefiles_object_jar, object);
+ return NULL;
+ }
+
refcount_set(&object->ref, 1);
spin_lock_init(&object->lock);
@@ -88,7 +93,7 @@ void cachefiles_put_object(struct cachefiles_object *object,
ASSERTCMP(object->file, ==, NULL);
kfree(object->d_name);
-
+ cachefiles_ondemand_deinit_obj_info(object);
cache = object->volume->cache->cache;
fscache_put_cookie(object->cookie, fscache_cookie_put_object);
object->cookie = NULL;
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 00beedeaec183..b0fe76964bc0d 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -49,6 +49,12 @@ enum cachefiles_object_state {
CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */
};
+struct cachefiles_ondemand_info {
+ int ondemand_id;
+ enum cachefiles_object_state state;
+ struct cachefiles_object *object;
+};
+
/*
* Backing file state.
*/
@@ -66,8 +72,7 @@ struct cachefiles_object {
unsigned long flags;
#define CACHEFILES_OBJECT_USING_TMPFILE 0 /* Have an unlinked tmpfile */
#ifdef CONFIG_CACHEFILES_ONDEMAND
- int ondemand_id;
- enum cachefiles_object_state state;
+ struct cachefiles_ondemand_info *ondemand;
#endif
};
@@ -302,17 +307,21 @@ extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object);
extern int cachefiles_ondemand_read(struct cachefiles_object *object,
loff_t pos, size_t len);
+extern int cachefiles_ondemand_init_obj_info(struct cachefiles_object *obj,
+ struct cachefiles_volume *volume);
+extern void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *obj);
+
#define CACHEFILES_OBJECT_STATE_FUNCS(_state, _STATE) \
static inline bool \
cachefiles_ondemand_object_is_##_state(const struct cachefiles_object *object) \
{ \
- return object->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
+ return object->ondemand->state == CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
} \
\
static inline void \
cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
{ \
- object->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
+ object->ondemand->state = CACHEFILES_ONDEMAND_OBJSTATE_##_STATE; \
}
CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN);
@@ -338,6 +347,15 @@ static inline int cachefiles_ondemand_read(struct cachefiles_object *object,
{
return -EOPNOTSUPP;
}
+
+static inline int cachefiles_ondemand_init_obj_info(struct cachefiles_object *obj,
+ struct cachefiles_volume *volume)
+{
+ return 0;
+}
+static inline void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *obj)
+{
+}
#endif
/*
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 90456b8a4b3e0..deb7e3007aa1d 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -9,12 +9,13 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
{
struct cachefiles_object *object = file->private_data;
struct cachefiles_cache *cache = object->volume->cache;
- int object_id = object->ondemand_id;
+ struct cachefiles_ondemand_info *info = object->ondemand;
+ int object_id = info->ondemand_id;
struct cachefiles_req *req;
XA_STATE(xas, &cache->reqs, 0);
xa_lock(&cache->reqs);
- object->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
+ info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
cachefiles_ondemand_set_object_close(object);
/*
@@ -222,7 +223,7 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
load = (void *)req->msg.data;
load->fd = fd;
req->msg.object_id = object_id;
- object->ondemand_id = object_id;
+ object->ondemand->ondemand_id = object_id;
cachefiles_get_unbind_pincount(cache);
trace_cachefiles_ondemand_open(object, &req->msg, load);
@@ -368,7 +369,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
if (opcode != CACHEFILES_OP_OPEN &&
!cachefiles_ondemand_object_is_open(object)) {
- WARN_ON_ONCE(object->ondemand_id == 0);
+ WARN_ON_ONCE(object->ondemand->ondemand_id == 0);
xas_unlock(&xas);
ret = -EIO;
goto out;
@@ -438,7 +439,7 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req,
if (!cachefiles_ondemand_object_is_open(object))
return -ENOENT;
- req->msg.object_id = object->ondemand_id;
+ req->msg.object_id = object->ondemand->ondemand_id;
trace_cachefiles_ondemand_close(object, &req->msg);
return 0;
}
@@ -454,7 +455,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req,
struct cachefiles_object *object = req->object;
struct cachefiles_read *load = (void *)req->msg.data;
struct cachefiles_read_ctx *read_ctx = private;
- int object_id = object->ondemand_id;
+ int object_id = object->ondemand->ondemand_id;
/* Stop enqueuing requests when daemon has closed anon_fd. */
if (!cachefiles_ondemand_object_is_open(object)) {
@@ -500,6 +501,27 @@ void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
cachefiles_ondemand_init_close_req, NULL);
}
+int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
+ struct cachefiles_volume *volume)
+{
+ if (!cachefiles_in_ondemand_mode(volume->cache))
+ return 0;
+
+ object->ondemand = kzalloc(sizeof(struct cachefiles_ondemand_info),
+ GFP_KERNEL);
+ if (!object->ondemand)
+ return -ENOMEM;
+
+ object->ondemand->object = object;
+ return 0;
+}
+
+void cachefiles_ondemand_deinit_obj_info(struct cachefiles_object *object)
+{
+ kfree(object->ondemand);
+ object->ondemand = NULL;
+}
+
int cachefiles_ondemand_read(struct cachefiles_object *object,
loff_t pos, size_t len)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 107/267] cachefiles: resend an open request if the read requests object is closed
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (105 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 106/267] cachefiles: extract ondemand info field from cachefiles_object Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 108/267] cachefiles: add spin_lock for cachefiles_ondemand_info Greg Kroah-Hartman
` (171 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jia Zhu, Jingbo Xu, David Howells,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu <zhujia.zj@bytedance.com>
[ Upstream commit 0a7e54c1959c0feb2de23397ec09c7692364313e ]
When an anonymous fd is closed by user daemon, if there is a new read
request for this file comes up, the anonymous fd should be re-opened
to handle that read request rather than fail it directly.
1. Introduce reopening state for objects that are closed but have
inflight/subsequent read requests.
2. No longer flush READ requests but only CLOSE requests when anonymous
fd is closed.
3. Enqueue the reopen work to workqueue, thus user daemon could get rid
of daemon_read context and handle that request smoothly. Otherwise,
the user daemon will send a reopen request and wait for itself to
process the request.
Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com>
Link: https://lore.kernel.org/r/20231120041422.75170-4-zhujia.zj@bytedance.com
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 0a790040838c ("cachefiles: add spin_lock for cachefiles_ondemand_info")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/internal.h | 3 ++
fs/cachefiles/ondemand.c | 98 ++++++++++++++++++++++++++++------------
2 files changed, 72 insertions(+), 29 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index b0fe76964bc0d..b9a90f1a0c015 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -47,9 +47,11 @@ struct cachefiles_volume {
enum cachefiles_object_state {
CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */
CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */
+ CACHEFILES_ONDEMAND_OBJSTATE_REOPENING, /* Object that was closed and is being reopened. */
};
struct cachefiles_ondemand_info {
+ struct work_struct ondemand_work;
int ondemand_id;
enum cachefiles_object_state state;
struct cachefiles_object *object;
@@ -326,6 +328,7 @@ cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN);
CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE);
+CACHEFILES_OBJECT_STATE_FUNCS(reopening, REOPENING);
#else
static inline ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
char __user *_buffer, size_t buflen)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index deb7e3007aa1d..8e130de952f7d 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -18,14 +18,10 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
cachefiles_ondemand_set_object_close(object);
- /*
- * Flush all pending READ requests since their completion depends on
- * anon_fd.
- */
- xas_for_each(&xas, req, ULONG_MAX) {
+ /* Only flush CACHEFILES_REQ_NEW marked req to avoid race with daemon_read */
+ xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
if (req->msg.object_id == object_id &&
- req->msg.opcode == CACHEFILES_OP_READ) {
- req->error = -EIO;
+ req->msg.opcode == CACHEFILES_OP_CLOSE) {
complete(&req->done);
xas_store(&xas, NULL);
}
@@ -179,6 +175,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
trace_cachefiles_ondemand_copen(req->object, id, size);
cachefiles_ondemand_set_object_open(req->object);
+ wake_up_all(&cache->daemon_pollwq);
out:
complete(&req->done);
@@ -222,7 +219,6 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
load = (void *)req->msg.data;
load->fd = fd;
- req->msg.object_id = object_id;
object->ondemand->ondemand_id = object_id;
cachefiles_get_unbind_pincount(cache);
@@ -238,6 +234,43 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
return ret;
}
+static void ondemand_object_worker(struct work_struct *work)
+{
+ struct cachefiles_ondemand_info *info =
+ container_of(work, struct cachefiles_ondemand_info, ondemand_work);
+
+ cachefiles_ondemand_init_object(info->object);
+}
+
+/*
+ * If there are any inflight or subsequent READ requests on the
+ * closed object, reopen it.
+ * Skip read requests whose related object is reopening.
+ */
+static struct cachefiles_req *cachefiles_ondemand_select_req(struct xa_state *xas,
+ unsigned long xa_max)
+{
+ struct cachefiles_req *req;
+ struct cachefiles_object *object;
+ struct cachefiles_ondemand_info *info;
+
+ xas_for_each_marked(xas, req, xa_max, CACHEFILES_REQ_NEW) {
+ if (req->msg.opcode != CACHEFILES_OP_READ)
+ return req;
+ object = req->object;
+ info = object->ondemand;
+ if (cachefiles_ondemand_object_is_close(object)) {
+ cachefiles_ondemand_set_object_reopening(object);
+ queue_work(fscache_wq, &info->ondemand_work);
+ continue;
+ }
+ if (cachefiles_ondemand_object_is_reopening(object))
+ continue;
+ return req;
+ }
+ return NULL;
+}
+
ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
char __user *_buffer, size_t buflen)
{
@@ -248,16 +281,16 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
int ret = 0;
XA_STATE(xas, &cache->reqs, cache->req_id_next);
+ xa_lock(&cache->reqs);
/*
* Cyclically search for a request that has not ever been processed,
* to prevent requests from being processed repeatedly, and make
* request distribution fair.
*/
- xa_lock(&cache->reqs);
- req = xas_find_marked(&xas, UINT_MAX, CACHEFILES_REQ_NEW);
+ req = cachefiles_ondemand_select_req(&xas, ULONG_MAX);
if (!req && cache->req_id_next > 0) {
xas_set(&xas, 0);
- req = xas_find_marked(&xas, cache->req_id_next - 1, CACHEFILES_REQ_NEW);
+ req = cachefiles_ondemand_select_req(&xas, cache->req_id_next - 1);
}
if (!req) {
xa_unlock(&cache->reqs);
@@ -277,14 +310,18 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
xa_unlock(&cache->reqs);
id = xas.xa_index;
- msg->msg_id = id;
if (msg->opcode == CACHEFILES_OP_OPEN) {
ret = cachefiles_ondemand_get_fd(req);
- if (ret)
+ if (ret) {
+ cachefiles_ondemand_set_object_close(req->object);
goto error;
+ }
}
+ msg->msg_id = id;
+ msg->object_id = req->object->ondemand->ondemand_id;
+
if (copy_to_user(_buffer, msg, n) != 0) {
ret = -EFAULT;
goto err_put_fd;
@@ -317,19 +354,23 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
void *private)
{
struct cachefiles_cache *cache = object->volume->cache;
- struct cachefiles_req *req;
+ struct cachefiles_req *req = NULL;
XA_STATE(xas, &cache->reqs, 0);
int ret;
if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
return 0;
- if (test_bit(CACHEFILES_DEAD, &cache->flags))
- return -EIO;
+ if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
+ ret = -EIO;
+ goto out;
+ }
req = kzalloc(sizeof(*req) + data_len, GFP_KERNEL);
- if (!req)
- return -ENOMEM;
+ if (!req) {
+ ret = -ENOMEM;
+ goto out;
+ }
req->object = object;
init_completion(&req->done);
@@ -367,7 +408,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
/* coupled with the barrier in cachefiles_flush_reqs() */
smp_mb();
- if (opcode != CACHEFILES_OP_OPEN &&
+ if (opcode == CACHEFILES_OP_CLOSE &&
!cachefiles_ondemand_object_is_open(object)) {
WARN_ON_ONCE(object->ondemand->ondemand_id == 0);
xas_unlock(&xas);
@@ -392,7 +433,15 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
wake_up_all(&cache->daemon_pollwq);
wait_for_completion(&req->done);
ret = req->error;
+ kfree(req);
+ return ret;
out:
+ /* Reset the object to close state in error handling path.
+ * If error occurs after creating the anonymous fd,
+ * cachefiles_ondemand_fd_release() will set object to close.
+ */
+ if (opcode == CACHEFILES_OP_OPEN)
+ cachefiles_ondemand_set_object_close(object);
kfree(req);
return ret;
}
@@ -439,7 +488,6 @@ static int cachefiles_ondemand_init_close_req(struct cachefiles_req *req,
if (!cachefiles_ondemand_object_is_open(object))
return -ENOENT;
- req->msg.object_id = object->ondemand->ondemand_id;
trace_cachefiles_ondemand_close(object, &req->msg);
return 0;
}
@@ -455,16 +503,7 @@ static int cachefiles_ondemand_init_read_req(struct cachefiles_req *req,
struct cachefiles_object *object = req->object;
struct cachefiles_read *load = (void *)req->msg.data;
struct cachefiles_read_ctx *read_ctx = private;
- int object_id = object->ondemand->ondemand_id;
-
- /* Stop enqueuing requests when daemon has closed anon_fd. */
- if (!cachefiles_ondemand_object_is_open(object)) {
- WARN_ON_ONCE(object_id == 0);
- pr_info_once("READ: anonymous fd closed prematurely.\n");
- return -EIO;
- }
- req->msg.object_id = object_id;
load->off = read_ctx->off;
load->len = read_ctx->len;
trace_cachefiles_ondemand_read(object, &req->msg, load);
@@ -513,6 +552,7 @@ int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
return -ENOMEM;
object->ondemand->object = object;
+ INIT_WORK(&object->ondemand->ondemand_work, ondemand_object_worker);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 108/267] cachefiles: add spin_lock for cachefiles_ondemand_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (106 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 107/267] cachefiles: resend an open request if the read requests object is closed Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 109/267] cachefiles: add restore command to recover inflight ondemand read requests Greg Kroah-Hartman
` (170 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 0a790040838c736495d5afd6b2d636f159f817f1 ]
The following concurrency may cause a read request to fail to be completed
and result in a hung:
t1 | t2
---------------------------------------------------------
cachefiles_ondemand_copen
req = xa_erase(&cache->reqs, id)
// Anon fd is maliciously closed.
cachefiles_ondemand_fd_release
xa_lock(&cache->reqs)
cachefiles_ondemand_set_object_close(object)
xa_unlock(&cache->reqs)
cachefiles_ondemand_set_object_open
// No one will ever close it again.
cachefiles_ondemand_daemon_read
cachefiles_ondemand_select_req
// Get a read req but its fd is already closed.
// The daemon can't issue a cread ioctl with an closed fd, then hung.
So add spin_lock for cachefiles_ondemand_info to protect ondemand_id and
state, thus we can avoid the above problem in cachefiles_ondemand_copen()
by using ondemand_id to determine if fd has been closed.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-8-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/internal.h | 1 +
fs/cachefiles/ondemand.c | 35 ++++++++++++++++++++++++++++++++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index b9a90f1a0c015..33fe418aca770 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -55,6 +55,7 @@ struct cachefiles_ondemand_info {
int ondemand_id;
enum cachefiles_object_state state;
struct cachefiles_object *object;
+ spinlock_t lock;
};
/*
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 8e130de952f7d..8118649d30727 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -10,13 +10,16 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
struct cachefiles_object *object = file->private_data;
struct cachefiles_cache *cache = object->volume->cache;
struct cachefiles_ondemand_info *info = object->ondemand;
- int object_id = info->ondemand_id;
+ int object_id;
struct cachefiles_req *req;
XA_STATE(xas, &cache->reqs, 0);
xa_lock(&cache->reqs);
+ spin_lock(&info->lock);
+ object_id = info->ondemand_id;
info->ondemand_id = CACHEFILES_ONDEMAND_ID_CLOSED;
cachefiles_ondemand_set_object_close(object);
+ spin_unlock(&info->lock);
/* Only flush CACHEFILES_REQ_NEW marked req to avoid race with daemon_read */
xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
@@ -116,6 +119,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
{
struct cachefiles_req *req;
struct fscache_cookie *cookie;
+ struct cachefiles_ondemand_info *info;
char *pid, *psize;
unsigned long id;
long size;
@@ -166,6 +170,33 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
goto out;
}
+ info = req->object->ondemand;
+ spin_lock(&info->lock);
+ /*
+ * The anonymous fd was closed before copen ? Fail the request.
+ *
+ * t1 | t2
+ * ---------------------------------------------------------
+ * cachefiles_ondemand_copen
+ * req = xa_erase(&cache->reqs, id)
+ * // Anon fd is maliciously closed.
+ * cachefiles_ondemand_fd_release
+ * xa_lock(&cache->reqs)
+ * cachefiles_ondemand_set_object_close(object)
+ * xa_unlock(&cache->reqs)
+ * cachefiles_ondemand_set_object_open
+ * // No one will ever close it again.
+ * cachefiles_ondemand_daemon_read
+ * cachefiles_ondemand_select_req
+ *
+ * Get a read req but its fd is already closed. The daemon can't
+ * issue a cread ioctl with an closed fd, then hung.
+ */
+ if (info->ondemand_id == CACHEFILES_ONDEMAND_ID_CLOSED) {
+ spin_unlock(&info->lock);
+ req->error = -EBADFD;
+ goto out;
+ }
cookie = req->object->cookie;
cookie->object_size = size;
if (size)
@@ -175,6 +206,7 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
trace_cachefiles_ondemand_copen(req->object, id, size);
cachefiles_ondemand_set_object_open(req->object);
+ spin_unlock(&info->lock);
wake_up_all(&cache->daemon_pollwq);
out:
@@ -552,6 +584,7 @@ int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
return -ENOMEM;
object->ondemand->object = object;
+ spin_lock_init(&object->ondemand->lock);
INIT_WORK(&object->ondemand->ondemand_work, ondemand_object_worker);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 109/267] cachefiles: add restore command to recover inflight ondemand read requests
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (107 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 108/267] cachefiles: add spin_lock for cachefiles_ondemand_info Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 110/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() Greg Kroah-Hartman
` (169 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gao Xiang, Jia Zhu, Xin Yin,
Jingbo Xu, David Howells, Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jia Zhu <zhujia.zj@bytedance.com>
[ Upstream commit e73fa11a356ca0905c3cc648eaacc6f0f2d2c8b3 ]
Previously, in ondemand read scenario, if the anonymous fd was closed by
user daemon, inflight and subsequent read requests would return EIO.
As long as the device connection is not released, user daemon can hold
and restore inflight requests by setting the request flag to
CACHEFILES_REQ_NEW.
Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com>
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Link: https://lore.kernel.org/r/20231120041422.75170-6-zhujia.zj@bytedance.com
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/daemon.c | 1 +
fs/cachefiles/internal.h | 3 +++
fs/cachefiles/ondemand.c | 23 +++++++++++++++++++++++
3 files changed, 27 insertions(+)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 7d1f456e376dd..26b487e112596 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -77,6 +77,7 @@ static const struct cachefiles_daemon_cmd cachefiles_daemon_cmds[] = {
{ "tag", cachefiles_daemon_tag },
#ifdef CONFIG_CACHEFILES_ONDEMAND
{ "copen", cachefiles_ondemand_copen },
+ { "restore", cachefiles_ondemand_restore },
#endif
{ "", NULL }
};
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 33fe418aca770..361356d0e866a 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -304,6 +304,9 @@ extern ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
extern int cachefiles_ondemand_copen(struct cachefiles_cache *cache,
char *args);
+extern int cachefiles_ondemand_restore(struct cachefiles_cache *cache,
+ char *args);
+
extern int cachefiles_ondemand_init_object(struct cachefiles_object *object);
extern void cachefiles_ondemand_clean_object(struct cachefiles_object *object);
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 8118649d30727..6d8f7f01a73ac 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -214,6 +214,29 @@ int cachefiles_ondemand_copen(struct cachefiles_cache *cache, char *args)
return ret;
}
+int cachefiles_ondemand_restore(struct cachefiles_cache *cache, char *args)
+{
+ struct cachefiles_req *req;
+
+ XA_STATE(xas, &cache->reqs, 0);
+
+ if (!test_bit(CACHEFILES_ONDEMAND_MODE, &cache->flags))
+ return -EOPNOTSUPP;
+
+ /*
+ * Reset the requests to CACHEFILES_REQ_NEW state, so that the
+ * requests have been processed halfway before the crash of the
+ * user daemon could be reprocessed after the recovery.
+ */
+ xas_lock(&xas);
+ xas_for_each(&xas, req, ULONG_MAX)
+ xas_set_mark(&xas, CACHEFILES_REQ_NEW);
+ xas_unlock(&xas);
+
+ wake_up_all(&cache->daemon_pollwq);
+ return 0;
+}
+
static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
{
struct cachefiles_object *object;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 110/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (108 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 109/267] cachefiles: add restore command to recover inflight ondemand read requests Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 111/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() Greg Kroah-Hartman
` (168 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hou Tao, Baokun Li, Jeff Layton,
Jia Zhu, Jingbo Xu, Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit de3e26f9e5b76fc628077578c001c4a51bf54d06 ]
We got the following issue in a fuzz test of randomly issuing the restore
command:
==================================================================
BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0
Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962
CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542
Call Trace:
kasan_report+0x94/0xc0
cachefiles_ondemand_daemon_read+0x609/0xab0
vfs_read+0x169/0xb50
ksys_read+0xf5/0x1e0
Allocated by task 626:
__kmalloc+0x1df/0x4b0
cachefiles_ondemand_send_req+0x24d/0x690
cachefiles_create_tmpfile+0x249/0xb30
cachefiles_create_file+0x6f/0x140
cachefiles_look_up_object+0x29c/0xa60
cachefiles_lookup_cookie+0x37d/0xca0
fscache_cookie_state_machine+0x43c/0x1230
[...]
Freed by task 626:
kfree+0xf1/0x2c0
cachefiles_ondemand_send_req+0x568/0x690
cachefiles_create_tmpfile+0x249/0xb30
cachefiles_create_file+0x6f/0x140
cachefiles_look_up_object+0x29c/0xa60
cachefiles_lookup_cookie+0x37d/0xca0
fscache_cookie_state_machine+0x43c/0x1230
[...]
==================================================================
Following is the process that triggers the issue:
mount | daemon_thread1 | daemon_thread2
------------------------------------------------------------
cachefiles_ondemand_init_object
cachefiles_ondemand_send_req
REQ_A = kzalloc(sizeof(*req) + data_len)
wait_for_completion(&REQ_A->done)
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
cachefiles_ondemand_get_fd
copy_to_user(_buffer, msg, n)
process_open_req(REQ_A)
------ restore ------
cachefiles_ondemand_restore
xas_for_each(&xas, req, ULONG_MAX)
xas_set_mark(&xas, CACHEFILES_REQ_NEW);
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
write(devfd, ("copen %u,%llu", msg->msg_id, size));
cachefiles_ondemand_copen
xa_erase(&cache->reqs, id)
complete(&REQ_A->done)
kfree(REQ_A)
cachefiles_ondemand_get_fd(REQ_A)
fd = get_unused_fd_flags
file = anon_inode_getfile
fd_install(fd, file)
load = (void *)REQ_A->msg.data;
load->fd = fd;
// load UAF !!!
This issue is caused by issuing a restore command when the daemon is still
alive, which results in a request being processed multiple times thus
triggering a UAF. So to avoid this problem, add an additional reference
count to cachefiles_req, which is held while waiting and reading, and then
released when the waiting and reading is over.
Note that since there is only one reference count for waiting, we need to
avoid the same request being completed multiple times, so we can only
complete the request if it is successfully removed from the xarray.
Fixes: e73fa11a356c ("cachefiles: add restore command to recover inflight ondemand read requests")
Suggested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-4-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/internal.h | 1 +
fs/cachefiles/ondemand.c | 23 +++++++++++++++++++----
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 361356d0e866a..28799c8e2c6f6 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -139,6 +139,7 @@ static inline bool cachefiles_in_ondemand_mode(struct cachefiles_cache *cache)
struct cachefiles_req {
struct cachefiles_object *object;
struct completion done;
+ refcount_t ref;
int error;
struct cachefiles_msg msg;
};
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 6d8f7f01a73ac..f8d0a01795702 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -4,6 +4,12 @@
#include <linux/uio.h>
#include "internal.h"
+static inline void cachefiles_req_put(struct cachefiles_req *req)
+{
+ if (refcount_dec_and_test(&req->ref))
+ kfree(req);
+}
+
static int cachefiles_ondemand_fd_release(struct inode *inode,
struct file *file)
{
@@ -362,6 +368,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
xas_clear_mark(&xas, CACHEFILES_REQ_NEW);
cache->req_id_next = xas.xa_index + 1;
+ refcount_inc(&req->ref);
xa_unlock(&cache->reqs);
id = xas.xa_index;
@@ -388,15 +395,22 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
complete(&req->done);
}
+ cachefiles_req_put(req);
return n;
err_put_fd:
if (msg->opcode == CACHEFILES_OP_OPEN)
close_fd(((struct cachefiles_open *)msg->data)->fd);
error:
- xa_erase(&cache->reqs, id);
- req->error = ret;
- complete(&req->done);
+ xas_reset(&xas);
+ xas_lock(&xas);
+ if (xas_load(&xas) == req) {
+ req->error = ret;
+ complete(&req->done);
+ xas_store(&xas, NULL);
+ }
+ xas_unlock(&xas);
+ cachefiles_req_put(req);
return ret;
}
@@ -427,6 +441,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
goto out;
}
+ refcount_set(&req->ref, 1);
req->object = object;
init_completion(&req->done);
req->msg.opcode = opcode;
@@ -488,7 +503,7 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
wake_up_all(&cache->daemon_pollwq);
wait_for_completion(&req->done);
ret = req->error;
- kfree(req);
+ cachefiles_req_put(req);
return ret;
out:
/* Reset the object to close state in error handling path.
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 111/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (109 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 110/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 112/267] cachefiles: remove err_put_fd label " Greg Kroah-Hartman
` (167 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton, Jia Zhu,
Jingbo Xu, Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit da4a827416066191aafeeccee50a8836a826ba10 ]
We got the following issue in a fuzz test of randomly issuing the restore
command:
==================================================================
BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0xb41/0xb60
Read of size 8 at addr ffff888122e84088 by task ondemand-04-dae/963
CPU: 13 PID: 963 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #564
Call Trace:
kasan_report+0x93/0xc0
cachefiles_ondemand_daemon_read+0xb41/0xb60
vfs_read+0x169/0xb50
ksys_read+0xf5/0x1e0
Allocated by task 116:
kmem_cache_alloc+0x140/0x3a0
cachefiles_lookup_cookie+0x140/0xcd0
fscache_cookie_state_machine+0x43c/0x1230
[...]
Freed by task 792:
kmem_cache_free+0xfe/0x390
cachefiles_put_object+0x241/0x480
fscache_cookie_state_machine+0x5c8/0x1230
[...]
==================================================================
Following is the process that triggers the issue:
mount | daemon_thread1 | daemon_thread2
------------------------------------------------------------
cachefiles_withdraw_cookie
cachefiles_ondemand_clean_object(object)
cachefiles_ondemand_send_req
REQ_A = kzalloc(sizeof(*req) + data_len)
wait_for_completion(&REQ_A->done)
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
msg->object_id = req->object->ondemand->ondemand_id
------ restore ------
cachefiles_ondemand_restore
xas_for_each(&xas, req, ULONG_MAX)
xas_set_mark(&xas, CACHEFILES_REQ_NEW)
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
copy_to_user(_buffer, msg, n)
xa_erase(&cache->reqs, id)
complete(&REQ_A->done)
------ close(fd) ------
cachefiles_ondemand_fd_release
cachefiles_put_object
cachefiles_put_object
kmem_cache_free(cachefiles_object_jar, object)
REQ_A->object->ondemand->ondemand_id
// object UAF !!!
When we see the request within xa_lock, req->object must not have been
freed yet, so grab the reference count of object before xa_unlock to
avoid the above issue.
Fixes: 0a7e54c1959c ("cachefiles: resend an open request if the read request's object is closed")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-5-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/ondemand.c | 3 +++
include/trace/events/cachefiles.h | 6 +++++-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index f8d0a01795702..fd73811c7ce4f 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -369,6 +369,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
xas_clear_mark(&xas, CACHEFILES_REQ_NEW);
cache->req_id_next = xas.xa_index + 1;
refcount_inc(&req->ref);
+ cachefiles_grab_object(req->object, cachefiles_obj_get_read_req);
xa_unlock(&cache->reqs);
id = xas.xa_index;
@@ -389,6 +390,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
goto err_put_fd;
}
+ cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
/* CLOSE request has no reply */
if (msg->opcode == CACHEFILES_OP_CLOSE) {
xa_erase(&cache->reqs, id);
@@ -402,6 +404,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
if (msg->opcode == CACHEFILES_OP_OPEN)
close_fd(((struct cachefiles_open *)msg->data)->fd);
error:
+ cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
xas_reset(&xas);
xas_lock(&xas);
if (xas_load(&xas) == req) {
diff --git a/include/trace/events/cachefiles.h b/include/trace/events/cachefiles.h
index e3213af847cdf..7d931db02b934 100644
--- a/include/trace/events/cachefiles.h
+++ b/include/trace/events/cachefiles.h
@@ -33,6 +33,8 @@ enum cachefiles_obj_ref_trace {
cachefiles_obj_see_withdrawal,
cachefiles_obj_get_ondemand_fd,
cachefiles_obj_put_ondemand_fd,
+ cachefiles_obj_get_read_req,
+ cachefiles_obj_put_read_req,
};
enum fscache_why_object_killed {
@@ -129,7 +131,9 @@ enum cachefiles_error_trace {
EM(cachefiles_obj_see_withdraw_cookie, "SEE withdraw_cookie") \
EM(cachefiles_obj_see_withdrawal, "SEE withdrawal") \
EM(cachefiles_obj_get_ondemand_fd, "GET ondemand_fd") \
- E_(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd")
+ EM(cachefiles_obj_put_ondemand_fd, "PUT ondemand_fd") \
+ EM(cachefiles_obj_get_read_req, "GET read_req") \
+ E_(cachefiles_obj_put_read_req, "PUT read_req")
#define cachefiles_coherency_traces \
EM(cachefiles_coherency_check_aux, "BAD aux ") \
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 112/267] cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (110 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 111/267] cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 113/267] cachefiles: never get a new anonymous fd if ondemand_id is valid Greg Kroah-Hartman
` (166 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton, Jia Zhu,
Gao Xiang, Jingbo Xu, Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 3e6d704f02aa4c50c7bc5fe91a4401df249a137b ]
The err_put_fd label is only used once, so remove it to make the code
more readable. In addition, the logic for deleting error request and
CLOSE request is merged to simplify the code.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-6-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/ondemand.c | 45 ++++++++++++++--------------------------
1 file changed, 16 insertions(+), 29 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index fd73811c7ce4f..99b4bffad4a4f 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -337,7 +337,6 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
{
struct cachefiles_req *req;
struct cachefiles_msg *msg;
- unsigned long id = 0;
size_t n;
int ret = 0;
XA_STATE(xas, &cache->reqs, cache->req_id_next);
@@ -372,49 +371,37 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
cachefiles_grab_object(req->object, cachefiles_obj_get_read_req);
xa_unlock(&cache->reqs);
- id = xas.xa_index;
-
if (msg->opcode == CACHEFILES_OP_OPEN) {
ret = cachefiles_ondemand_get_fd(req);
if (ret) {
cachefiles_ondemand_set_object_close(req->object);
- goto error;
+ goto out;
}
}
- msg->msg_id = id;
+ msg->msg_id = xas.xa_index;
msg->object_id = req->object->ondemand->ondemand_id;
if (copy_to_user(_buffer, msg, n) != 0) {
ret = -EFAULT;
- goto err_put_fd;
- }
-
- cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
- /* CLOSE request has no reply */
- if (msg->opcode == CACHEFILES_OP_CLOSE) {
- xa_erase(&cache->reqs, id);
- complete(&req->done);
+ if (msg->opcode == CACHEFILES_OP_OPEN)
+ close_fd(((struct cachefiles_open *)msg->data)->fd);
}
-
- cachefiles_req_put(req);
- return n;
-
-err_put_fd:
- if (msg->opcode == CACHEFILES_OP_OPEN)
- close_fd(((struct cachefiles_open *)msg->data)->fd);
-error:
+out:
cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
- xas_reset(&xas);
- xas_lock(&xas);
- if (xas_load(&xas) == req) {
- req->error = ret;
- complete(&req->done);
- xas_store(&xas, NULL);
+ /* Remove error request and CLOSE request has no reply */
+ if (ret || msg->opcode == CACHEFILES_OP_CLOSE) {
+ xas_reset(&xas);
+ xas_lock(&xas);
+ if (xas_load(&xas) == req) {
+ req->error = ret;
+ complete(&req->done);
+ xas_store(&xas, NULL);
+ }
+ xas_unlock(&xas);
}
- xas_unlock(&xas);
cachefiles_req_put(req);
- return ret;
+ return ret ? ret : n;
}
typedef int (*init_req_fn)(struct cachefiles_req *req, void *private);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 113/267] cachefiles: never get a new anonymous fd if ondemand_id is valid
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (111 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 112/267] cachefiles: remove err_put_fd label " Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 114/267] cachefiles: defer exposing anon_fd until after copy_to_user() succeeds Greg Kroah-Hartman
` (165 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 4988e35e95fc938bdde0e15880fe72042fc86acf ]
Now every time the daemon reads an open request, it gets a new anonymous fd
and ondemand_id. With the introduction of "restore", it is possible to read
the same open request more than once, and therefore an object can have more
than one anonymous fd.
If the anonymous fd is not unique, the following concurrencies will result
in an fd leak:
t1 | t2 | t3
------------------------------------------------------------
cachefiles_ondemand_init_object
cachefiles_ondemand_send_req
REQ_A = kzalloc(sizeof(*req) + data_len)
wait_for_completion(&REQ_A->done)
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
cachefiles_ondemand_get_fd
load->fd = fd0
ondemand_id = object_id0
------ restore ------
cachefiles_ondemand_restore
// restore REQ_A
cachefiles_daemon_read
cachefiles_ondemand_daemon_read
REQ_A = cachefiles_ondemand_select_req
cachefiles_ondemand_get_fd
load->fd = fd1
ondemand_id = object_id1
process_open_req(REQ_A)
write(devfd, ("copen %u,%llu", msg->msg_id, size))
cachefiles_ondemand_copen
xa_erase(&cache->reqs, id)
complete(&REQ_A->done)
kfree(REQ_A)
process_open_req(REQ_A)
// copen fails due to no req
// daemon close(fd1)
cachefiles_ondemand_fd_release
// set object closed
-- umount --
cachefiles_withdraw_cookie
cachefiles_ondemand_clean_object
cachefiles_ondemand_init_close_req
if (!cachefiles_ondemand_object_is_open(object))
return -ENOENT;
// The fd0 is not closed until the daemon exits.
However, the anonymous fd holds the reference count of the object and the
object holds the reference count of the cookie. So even though the cookie
has been relinquished, it will not be unhashed and freed until the daemon
exits.
In fscache_hash_cookie(), when the same cookie is found in the hash list,
if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then the new
cookie waits for the old cookie to be unhashed, while the old cookie is
waiting for the leaked fd to be closed, if the daemon does not exit in time
it will trigger a hung task.
To avoid this, allocate a new anonymous fd only if no anonymous fd has
been allocated (ondemand_id == 0) or if the previously allocated anonymous
fd has been closed (ondemand_id == -1). Moreover, returns an error if
ondemand_id is valid, letting the daemon know that the current userland
restore logic is abnormal and needs to be checked.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-9-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Stable-dep-of: 4b4391e77a6b ("cachefiles: defer exposing anon_fd until after copy_to_user() succeeds")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/ondemand.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 99b4bffad4a4f..773c3b407a33b 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -14,11 +14,18 @@ static int cachefiles_ondemand_fd_release(struct inode *inode,
struct file *file)
{
struct cachefiles_object *object = file->private_data;
- struct cachefiles_cache *cache = object->volume->cache;
- struct cachefiles_ondemand_info *info = object->ondemand;
+ struct cachefiles_cache *cache;
+ struct cachefiles_ondemand_info *info;
int object_id;
struct cachefiles_req *req;
- XA_STATE(xas, &cache->reqs, 0);
+ XA_STATE(xas, NULL, 0);
+
+ if (!object)
+ return 0;
+
+ info = object->ondemand;
+ cache = object->volume->cache;
+ xas.xa = &cache->reqs;
xa_lock(&cache->reqs);
spin_lock(&info->lock);
@@ -275,22 +282,39 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
goto err_put_fd;
}
+ spin_lock(&object->ondemand->lock);
+ if (object->ondemand->ondemand_id > 0) {
+ spin_unlock(&object->ondemand->lock);
+ /* Pair with check in cachefiles_ondemand_fd_release(). */
+ file->private_data = NULL;
+ ret = -EEXIST;
+ goto err_put_file;
+ }
+
file->f_mode |= FMODE_PWRITE | FMODE_LSEEK;
fd_install(fd, file);
load = (void *)req->msg.data;
load->fd = fd;
object->ondemand->ondemand_id = object_id;
+ spin_unlock(&object->ondemand->lock);
cachefiles_get_unbind_pincount(cache);
trace_cachefiles_ondemand_open(object, &req->msg, load);
return 0;
+err_put_file:
+ fput(file);
err_put_fd:
put_unused_fd(fd);
err_free_id:
xa_erase(&cache->ondemand_ids, object_id);
err:
+ spin_lock(&object->ondemand->lock);
+ /* Avoid marking an opened object as closed. */
+ if (object->ondemand->ondemand_id <= 0)
+ cachefiles_ondemand_set_object_close(object);
+ spin_unlock(&object->ondemand->lock);
cachefiles_put_object(object, cachefiles_obj_put_ondemand_fd);
return ret;
}
@@ -373,10 +397,8 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
if (msg->opcode == CACHEFILES_OP_OPEN) {
ret = cachefiles_ondemand_get_fd(req);
- if (ret) {
- cachefiles_ondemand_set_object_close(req->object);
+ if (ret)
goto out;
- }
}
msg->msg_id = xas.xa_index;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 114/267] cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (112 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 113/267] cachefiles: never get a new anonymous fd if ondemand_id is valid Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 115/267] cachefiles: flush all requests after setting CACHEFILES_DEAD Greg Kroah-Hartman
` (164 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hou Tao, Baokun Li, Jeff Layton,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 4b4391e77a6bf24cba2ef1590e113d9b73b11039 ]
After installing the anonymous fd, we can now see it in userland and close
it. However, at this point we may not have gotten the reference count of
the cache, but we will put it during colse fd, so this may cause a cache
UAF.
So grab the cache reference count before fd_install(). In addition, by
kernel convention, fd is taken over by the user land after fd_install(),
and the kernel should not call close_fd() after that, i.e., it should call
fd_install() after everything is ready, thus fd_install() is called after
copy_to_user() succeeds.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Suggested-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-10-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/ondemand.c | 53 +++++++++++++++++++++++++---------------
1 file changed, 33 insertions(+), 20 deletions(-)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 773c3b407a33b..a8cfa5047aaf8 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -4,6 +4,11 @@
#include <linux/uio.h>
#include "internal.h"
+struct ondemand_anon_file {
+ struct file *file;
+ int fd;
+};
+
static inline void cachefiles_req_put(struct cachefiles_req *req)
{
if (refcount_dec_and_test(&req->ref))
@@ -250,14 +255,14 @@ int cachefiles_ondemand_restore(struct cachefiles_cache *cache, char *args)
return 0;
}
-static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
+static int cachefiles_ondemand_get_fd(struct cachefiles_req *req,
+ struct ondemand_anon_file *anon_file)
{
struct cachefiles_object *object;
struct cachefiles_cache *cache;
struct cachefiles_open *load;
- struct file *file;
u32 object_id;
- int ret, fd;
+ int ret;
object = cachefiles_grab_object(req->object,
cachefiles_obj_get_ondemand_fd);
@@ -269,16 +274,16 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
if (ret < 0)
goto err;
- fd = get_unused_fd_flags(O_WRONLY);
- if (fd < 0) {
- ret = fd;
+ anon_file->fd = get_unused_fd_flags(O_WRONLY);
+ if (anon_file->fd < 0) {
+ ret = anon_file->fd;
goto err_free_id;
}
- file = anon_inode_getfile("[cachefiles]", &cachefiles_ondemand_fd_fops,
- object, O_WRONLY);
- if (IS_ERR(file)) {
- ret = PTR_ERR(file);
+ anon_file->file = anon_inode_getfile("[cachefiles]",
+ &cachefiles_ondemand_fd_fops, object, O_WRONLY);
+ if (IS_ERR(anon_file->file)) {
+ ret = PTR_ERR(anon_file->file);
goto err_put_fd;
}
@@ -286,16 +291,15 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
if (object->ondemand->ondemand_id > 0) {
spin_unlock(&object->ondemand->lock);
/* Pair with check in cachefiles_ondemand_fd_release(). */
- file->private_data = NULL;
+ anon_file->file->private_data = NULL;
ret = -EEXIST;
goto err_put_file;
}
- file->f_mode |= FMODE_PWRITE | FMODE_LSEEK;
- fd_install(fd, file);
+ anon_file->file->f_mode |= FMODE_PWRITE | FMODE_LSEEK;
load = (void *)req->msg.data;
- load->fd = fd;
+ load->fd = anon_file->fd;
object->ondemand->ondemand_id = object_id;
spin_unlock(&object->ondemand->lock);
@@ -304,9 +308,11 @@ static int cachefiles_ondemand_get_fd(struct cachefiles_req *req)
return 0;
err_put_file:
- fput(file);
+ fput(anon_file->file);
+ anon_file->file = NULL;
err_put_fd:
- put_unused_fd(fd);
+ put_unused_fd(anon_file->fd);
+ anon_file->fd = ret;
err_free_id:
xa_erase(&cache->ondemand_ids, object_id);
err:
@@ -363,6 +369,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
struct cachefiles_msg *msg;
size_t n;
int ret = 0;
+ struct ondemand_anon_file anon_file;
XA_STATE(xas, &cache->reqs, cache->req_id_next);
xa_lock(&cache->reqs);
@@ -396,7 +403,7 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
xa_unlock(&cache->reqs);
if (msg->opcode == CACHEFILES_OP_OPEN) {
- ret = cachefiles_ondemand_get_fd(req);
+ ret = cachefiles_ondemand_get_fd(req, &anon_file);
if (ret)
goto out;
}
@@ -404,10 +411,16 @@ ssize_t cachefiles_ondemand_daemon_read(struct cachefiles_cache *cache,
msg->msg_id = xas.xa_index;
msg->object_id = req->object->ondemand->ondemand_id;
- if (copy_to_user(_buffer, msg, n) != 0) {
+ if (copy_to_user(_buffer, msg, n) != 0)
ret = -EFAULT;
- if (msg->opcode == CACHEFILES_OP_OPEN)
- close_fd(((struct cachefiles_open *)msg->data)->fd);
+
+ if (msg->opcode == CACHEFILES_OP_OPEN) {
+ if (ret < 0) {
+ fput(anon_file.file);
+ put_unused_fd(anon_file.fd);
+ goto out;
+ }
+ fd_install(anon_file.fd, anon_file.file);
}
out:
cachefiles_put_object(req->object, cachefiles_obj_put_read_req);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 115/267] cachefiles: flush all requests after setting CACHEFILES_DEAD
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (113 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 114/267] cachefiles: defer exposing anon_fd until after copy_to_user() succeeds Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 116/267] selftests/ftrace: Fix to check required event file Greg Kroah-Hartman
` (163 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Baokun Li, Jeff Layton,
Christian Brauner, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li <libaokun1@huawei.com>
[ Upstream commit 85e833cd7243bda7285492b0653c3abb1e2e757b ]
In ondemand mode, when the daemon is processing an open request, if the
kernel flags the cache as CACHEFILES_DEAD, the cachefiles_daemon_write()
will always return -EIO, so the daemon can't pass the copen to the kernel.
Then the kernel process that is waiting for the copen triggers a hung_task.
Since the DEAD state is irreversible, it can only be exited by closing
/dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark
the cache as CACHEFILES_DEAD, if in ondemand mode, flush all requests to
avoid the above hungtask. We may still be able to read some of the cached
data before closing the fd of /dev/cachefiles.
Note that this relies on the patch that adds reference counting to the req,
otherwise it may UAF.
Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20240522114308.2402121-12-libaokun@huaweicloud.com
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/cachefiles/daemon.c | 2 +-
fs/cachefiles/internal.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 26b487e112596..b9945e4f697be 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -133,7 +133,7 @@ static int cachefiles_daemon_open(struct inode *inode, struct file *file)
return 0;
}
-static void cachefiles_flush_reqs(struct cachefiles_cache *cache)
+void cachefiles_flush_reqs(struct cachefiles_cache *cache)
{
struct xarray *xa = &cache->reqs;
struct cachefiles_req *req;
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index 28799c8e2c6f6..3eea52462fc87 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -188,6 +188,7 @@ extern int cachefiles_has_space(struct cachefiles_cache *cache,
* daemon.c
*/
extern const struct file_operations cachefiles_daemon_fops;
+extern void cachefiles_flush_reqs(struct cachefiles_cache *cache);
extern void cachefiles_get_unbind_pincount(struct cachefiles_cache *cache);
extern void cachefiles_put_unbind_pincount(struct cachefiles_cache *cache);
@@ -414,6 +415,8 @@ do { \
pr_err("I/O Error: " FMT"\n", ##__VA_ARGS__); \
fscache_io_error((___cache)->cache); \
set_bit(CACHEFILES_DEAD, &(___cache)->flags); \
+ if (cachefiles_in_ondemand_mode(___cache)) \
+ cachefiles_flush_reqs(___cache); \
} while (0)
#define cachefiles_io_error_obj(object, FMT, ...) \
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 116/267] selftests/ftrace: Fix to check required event file
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (114 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 115/267] cachefiles: flush all requests after setting CACHEFILES_DEAD Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 117/267] clk: sifive: Do not register clkdevs for PRCI clocks Greg Kroah-Hartman
` (162 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Masami Hiramatsu (Google),
Shuah Khan, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
[ Upstream commit f6c3c83db1d939ebdb8c8922748ae647d8126d91 ]
The dynevent/test_duplicates.tc test case uses `syscalls/sys_enter_openat`
event for defining eprobe on it. Since this `syscalls` events depend on
CONFIG_FTRACE_SYSCALLS=y, if it is not set, the test will fail.
Add the event file to `required` line so that the test will return
`unsupported` result.
Fixes: 297e1dcdca3d ("selftests/ftrace: Add selftest for testing duplicate eprobes and kprobes")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc
index d3a79da215c8b..5f72abe6fa79b 100644
--- a/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc
+++ b/tools/testing/selftests/ftrace/test.d/dynevent/test_duplicates.tc
@@ -1,7 +1,7 @@
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: Generic dynamic event - check if duplicate events are caught
-# requires: dynamic_events "e[:[<group>/][<event>]] <attached-group>.<attached-event> [<args>]":README
+# requires: dynamic_events "e[:[<group>/][<event>]] <attached-group>.<attached-event> [<args>]":README events/syscalls/sys_enter_openat
echo 0 > events/enable
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 117/267] clk: sifive: Do not register clkdevs for PRCI clocks
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (115 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 116/267] selftests/ftrace: Fix to check required event file Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 118/267] NFSv4.1 enforce rootpath check in fs_location query Greg Kroah-Hartman
` (161 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Guenter Roeck, Russell King,
Samuel Holland, Stephen Boyd, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Samuel Holland <samuel.holland@sifive.com>
[ Upstream commit 2607133196c35f31892ee199ce7ffa717bea4ad1 ]
These clkdevs were unnecessary, because systems using this driver always
look up clocks using the devicetree. And as Russell King points out[1],
since the provided device name was truncated, lookups via clkdev would
never match.
Recently, commit 8d532528ff6a ("clkdev: report over-sized strings when
creating clkdev entries") caused clkdev registration to fail due to the
truncation, and this now prevents the driver from probing. Fix the
driver by removing the clkdev registration.
Link: https://lore.kernel.org/linux-clk/ZkfYqj+OcAxd9O2t@shell.armlinux.org.uk/ [1]
Fixes: 30b8e27e3b58 ("clk: sifive: add a driver for the SiFive FU540 PRCI IP block")
Fixes: 8d532528ff6a ("clkdev: report over-sized strings when creating clkdev entries")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/linux-clk/7eda7621-0dde-4153-89e4-172e4c095d01@roeck-us.net/
Suggested-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240528001432.1200403-1-samuel.holland@sifive.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/clk/sifive/sifive-prci.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/clk/sifive/sifive-prci.c b/drivers/clk/sifive/sifive-prci.c
index af81eb835bc23..b1be6a2d24aa9 100644
--- a/drivers/clk/sifive/sifive-prci.c
+++ b/drivers/clk/sifive/sifive-prci.c
@@ -4,7 +4,6 @@
* Copyright (C) 2020 Zong Li
*/
-#include <linux/clkdev.h>
#include <linux/delay.h>
#include <linux/io.h>
#include <linux/of.h>
@@ -536,13 +535,6 @@ static int __prci_register_clocks(struct device *dev, struct __prci_data *pd,
return r;
}
- r = clk_hw_register_clkdev(&pic->hw, pic->name, dev_name(dev));
- if (r) {
- dev_warn(dev, "Failed to register clkdev for %s: %d\n",
- init.name, r);
- return r;
- }
-
pd->hw_clks.hws[i] = &pic->hw;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 118/267] NFSv4.1 enforce rootpath check in fs_location query
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (116 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 117/267] clk: sifive: Do not register clkdevs for PRCI clocks Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 119/267] SUNRPC: return proper error from gss_wrap_req_priv Greg Kroah-Hartman
` (160 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Olga Kornievskaia, Trond Myklebust,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia <kolga@netapp.com>
[ Upstream commit 28568c906c1bb5f7560e18082ed7d6295860f1c2 ]
In commit 4ca9f31a2be66 ("NFSv4.1 test and add 4.1 trunking transport"),
we introduce the ability to query the NFS server for possible trunking
locations of the existing filesystem. However, we never checked the
returned file system path for these alternative locations. According
to the RFC, the server can say that the filesystem currently known
under "fs_root" of fs_location also resides under these server
locations under the following "rootpath" pathname. The client cannot
handle trunking a filesystem that reside under different location
under different paths other than what the main path is. This patch
enforces the check that fs_root path and rootpath path in fs_location
reply is the same.
Fixes: 4ca9f31a2be6 ("NFSv4.1 test and add 4.1 trunking transport")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/nfs/nfs4proc.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 41b7eafbd9287..f0953200acd08 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -4003,6 +4003,23 @@ static void test_fs_location_for_trunking(struct nfs4_fs_location *location,
}
}
+static bool _is_same_nfs4_pathname(struct nfs4_pathname *path1,
+ struct nfs4_pathname *path2)
+{
+ int i;
+
+ if (path1->ncomponents != path2->ncomponents)
+ return false;
+ for (i = 0; i < path1->ncomponents; i++) {
+ if (path1->components[i].len != path2->components[i].len)
+ return false;
+ if (memcmp(path1->components[i].data, path2->components[i].data,
+ path1->components[i].len))
+ return false;
+ }
+ return true;
+}
+
static int _nfs4_discover_trunking(struct nfs_server *server,
struct nfs_fh *fhandle)
{
@@ -4036,9 +4053,13 @@ static int _nfs4_discover_trunking(struct nfs_server *server,
if (status)
goto out_free_3;
- for (i = 0; i < locations->nlocations; i++)
+ for (i = 0; i < locations->nlocations; i++) {
+ if (!_is_same_nfs4_pathname(&locations->fs_path,
+ &locations->locations[i].rootpath))
+ continue;
test_fs_location_for_trunking(&locations->locations[i], clp,
server);
+ }
out_free_3:
kfree(locations->fattr);
out_free_2:
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 119/267] SUNRPC: return proper error from gss_wrap_req_priv
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (117 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 118/267] NFSv4.1 enforce rootpath check in fs_location query Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 120/267] NFS: add barriers when testing for NFS_FSDATA_BLOCKED Greg Kroah-Hartman
` (159 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Chen Hanxiao, Benjamin Coddington,
Chuck Lever, Trond Myklebust, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Hanxiao <chenhx.fnst@fujitsu.com>
[ Upstream commit 33c94d7e3cb84f6d130678d6d59ba475a6c489cf ]
don't return 0 if snd_buf->len really greater than snd_buf->buflen
Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/sunrpc/auth_gss/auth_gss.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 1af71fbb0d805..00753bc5f1b14 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -1875,8 +1875,10 @@ gss_wrap_req_priv(struct rpc_cred *cred, struct gss_cl_ctx *ctx,
offset = (u8 *)p - (u8 *)snd_buf->head[0].iov_base;
maj_stat = gss_wrap(ctx->gc_gss_ctx, offset, snd_buf, inpages);
/* slack space should prevent this ever happening: */
- if (unlikely(snd_buf->len > snd_buf->buflen))
+ if (unlikely(snd_buf->len > snd_buf->buflen)) {
+ status = -EIO;
goto wrap_failed;
+ }
/* We're assuming that when GSS_S_CONTEXT_EXPIRED, the encryption was
* done anyway, so it's safe to put the request on the wire: */
if (maj_stat == GSS_S_CONTEXT_EXPIRED)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 120/267] NFS: add barriers when testing for NFS_FSDATA_BLOCKED
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (118 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 119/267] SUNRPC: return proper error from gss_wrap_req_priv Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 121/267] selftests/tracing: Fix event filter test to retry up to 10 times Greg Kroah-Hartman
` (158 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, NeilBrown, Trond Myklebust,
Sasha Levin, Richard Kojedzinszky
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown <neilb@suse.de>
[ Upstream commit 99bc9f2eb3f79a2b4296d9bf43153e1d10ca50d3 ]
dentry->d_fsdata is set to NFS_FSDATA_BLOCKED while unlinking or
renaming-over a file to ensure that no open succeeds while the NFS
operation progressed on the server.
Setting dentry->d_fsdata to NFS_FSDATA_BLOCKED is done under ->d_lock
after checking the refcount is not elevated. Any attempt to open the
file (through that name) will go through lookp_open() which will take
->d_lock while incrementing the refcount, we can be sure that once the
new value is set, __nfs_lookup_revalidate() *will* see the new value and
will block.
We don't have any locking guarantee that when we set ->d_fsdata to NULL,
the wait_var_event() in __nfs_lookup_revalidate() will notice.
wait/wake primitives do NOT provide barriers to guarantee order. We
must use smp_load_acquire() in wait_var_event() to ensure we look at an
up-to-date value, and must use smp_store_release() before wake_up_var().
This patch adds those barrier functions and factors out
block_revalidate() and unblock_revalidate() far clarity.
There is also a hypothetical bug in that if memory allocation fails
(which never happens in practice) we might leave ->d_fsdata locked.
This patch adds the missing call to unblock_revalidate().
Reported-and-tested-by: Richard Kojedzinszky <richard+debian+bugreport@kojedz.in>
Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071501
Fixes: 3c59366c207e ("NFS: don't unhash dentry during unlink/rename")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/nfs/dir.c | 47 ++++++++++++++++++++++++++++++++---------------
1 file changed, 32 insertions(+), 15 deletions(-)
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 9fc5061d51b2f..2a0f069d5a096 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1802,9 +1802,10 @@ __nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags,
if (parent != READ_ONCE(dentry->d_parent))
return -ECHILD;
} else {
- /* Wait for unlink to complete */
+ /* Wait for unlink to complete - see unblock_revalidate() */
wait_var_event(&dentry->d_fsdata,
- dentry->d_fsdata != NFS_FSDATA_BLOCKED);
+ smp_load_acquire(&dentry->d_fsdata)
+ != NFS_FSDATA_BLOCKED);
parent = dget_parent(dentry);
ret = reval(d_inode(parent), dentry, flags);
dput(parent);
@@ -1817,6 +1818,29 @@ static int nfs_lookup_revalidate(struct dentry *dentry, unsigned int flags)
return __nfs_lookup_revalidate(dentry, flags, nfs_do_lookup_revalidate);
}
+static void block_revalidate(struct dentry *dentry)
+{
+ /* old devname - just in case */
+ kfree(dentry->d_fsdata);
+
+ /* Any new reference that could lead to an open
+ * will take ->d_lock in lookup_open() -> d_lookup().
+ * Holding this lock ensures we cannot race with
+ * __nfs_lookup_revalidate() and removes and need
+ * for further barriers.
+ */
+ lockdep_assert_held(&dentry->d_lock);
+
+ dentry->d_fsdata = NFS_FSDATA_BLOCKED;
+}
+
+static void unblock_revalidate(struct dentry *dentry)
+{
+ /* store_release ensures wait_var_event() sees the update */
+ smp_store_release(&dentry->d_fsdata, NULL);
+ wake_up_var(&dentry->d_fsdata);
+}
+
/*
* A weaker form of d_revalidate for revalidating just the d_inode(dentry)
* when we don't really care about the dentry name. This is called when a
@@ -2499,15 +2523,12 @@ int nfs_unlink(struct inode *dir, struct dentry *dentry)
spin_unlock(&dentry->d_lock);
goto out;
}
- /* old devname */
- kfree(dentry->d_fsdata);
- dentry->d_fsdata = NFS_FSDATA_BLOCKED;
+ block_revalidate(dentry);
spin_unlock(&dentry->d_lock);
error = nfs_safe_remove(dentry);
nfs_dentry_remove_handle_error(dir, dentry, error);
- dentry->d_fsdata = NULL;
- wake_up_var(&dentry->d_fsdata);
+ unblock_revalidate(dentry);
out:
trace_nfs_unlink_exit(dir, dentry, error);
return error;
@@ -2619,8 +2640,7 @@ nfs_unblock_rename(struct rpc_task *task, struct nfs_renamedata *data)
{
struct dentry *new_dentry = data->new_dentry;
- new_dentry->d_fsdata = NULL;
- wake_up_var(&new_dentry->d_fsdata);
+ unblock_revalidate(new_dentry);
}
/*
@@ -2682,11 +2702,6 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
if (WARN_ON(new_dentry->d_flags & DCACHE_NFSFS_RENAMED) ||
WARN_ON(new_dentry->d_fsdata == NFS_FSDATA_BLOCKED))
goto out;
- if (new_dentry->d_fsdata) {
- /* old devname */
- kfree(new_dentry->d_fsdata);
- new_dentry->d_fsdata = NULL;
- }
spin_lock(&new_dentry->d_lock);
if (d_count(new_dentry) > 2) {
@@ -2708,7 +2723,7 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
new_dentry = dentry;
new_inode = NULL;
} else {
- new_dentry->d_fsdata = NFS_FSDATA_BLOCKED;
+ block_revalidate(new_dentry);
must_unblock = true;
spin_unlock(&new_dentry->d_lock);
}
@@ -2720,6 +2735,8 @@ int nfs_rename(struct mnt_idmap *idmap, struct inode *old_dir,
task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
must_unblock ? nfs_unblock_rename : NULL);
if (IS_ERR(task)) {
+ if (must_unblock)
+ unblock_revalidate(new_dentry);
error = PTR_ERR(task);
goto out;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 121/267] selftests/tracing: Fix event filter test to retry up to 10 times
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (119 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 120/267] NFS: add barriers when testing for NFS_FSDATA_BLOCKED Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 122/267] nvme: fix nvme_pr_* status code parsing Greg Kroah-Hartman
` (157 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Masami Hiramatsu (Google),
Shuah Khan, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
[ Upstream commit 0f42bdf59b4e428485aa922bef871bfa6cc505e0 ]
Commit eb50d0f250e9 ("selftests/ftrace: Choose target function for filter
test from samples") choose the target function from samples, but sometimes
this test failes randomly because the target function does not hit at the
next time. So retry getting samples up to 10 times.
Fixes: eb50d0f250e9 ("selftests/ftrace: Choose target function for filter test from samples")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../test.d/filter/event-filter-function.tc | 20 ++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
index 3f74c09c56b62..118247b8dd84d 100644
--- a/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
+++ b/tools/testing/selftests/ftrace/test.d/filter/event-filter-function.tc
@@ -10,7 +10,6 @@ fail() { #msg
}
sample_events() {
- echo > trace
echo 1 > events/kmem/kmem_cache_free/enable
echo 1 > tracing_on
ls > /dev/null
@@ -22,6 +21,7 @@ echo 0 > tracing_on
echo 0 > events/enable
echo "Get the most frequently calling function"
+echo > trace
sample_events
target_func=`cat trace | grep -o 'call_site=\([^+]*\)' | sed 's/call_site=//' | sort | uniq -c | sort | tail -n 1 | sed 's/^[ 0-9]*//'`
@@ -32,7 +32,16 @@ echo > trace
echo "Test event filter function name"
echo "call_site.function == $target_func" > events/kmem/kmem_cache_free/filter
+
+sample_events
+max_retry=10
+while [ `grep kmem_cache_free trace| wc -l` -eq 0 ]; do
sample_events
+max_retry=$((max_retry - 1))
+if [ $max_retry -eq 0 ]; then
+ exit_fail
+fi
+done
hitcnt=`grep kmem_cache_free trace| grep $target_func | wc -l`
misscnt=`grep kmem_cache_free trace| grep -v $target_func | wc -l`
@@ -49,7 +58,16 @@ address=`grep " ${target_func}\$" /proc/kallsyms | cut -d' ' -f1`
echo "Test event filter function address"
echo "call_site.function == 0x$address" > events/kmem/kmem_cache_free/filter
+echo > trace
+sample_events
+max_retry=10
+while [ `grep kmem_cache_free trace| wc -l` -eq 0 ]; do
sample_events
+max_retry=$((max_retry - 1))
+if [ $max_retry -eq 0 ]; then
+ exit_fail
+fi
+done
hitcnt=`grep kmem_cache_free trace| grep $target_func | wc -l`
misscnt=`grep kmem_cache_free trace| grep -v $target_func | wc -l`
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 122/267] nvme: fix nvme_pr_* status code parsing
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (120 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 121/267] selftests/tracing: Fix event filter test to retry up to 10 times Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 123/267] drm/panel: sitronix-st7789v: Add check for of_drm_get_panel_orientation Greg Kroah-Hartman
` (156 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Weiwen Hu, Keith Busch, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Weiwen Hu <huweiwen@linux.alibaba.com>
[ Upstream commit b1a1fdd7096dd2d67911b07f8118ff113d815db4 ]
Fix the parsing if extra status bits (e.g. MORE) is present.
Fixes: 7fb42780d06c ("nvme: Convert NVMe errors to PR errors")
Signed-off-by: Weiwen Hu <huweiwen@linux.alibaba.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/host/pr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pr.c b/drivers/nvme/host/pr.c
index 391b1465ebfd5..803efc97fd1ea 100644
--- a/drivers/nvme/host/pr.c
+++ b/drivers/nvme/host/pr.c
@@ -77,7 +77,7 @@ static int nvme_sc_to_pr_err(int nvme_sc)
if (nvme_is_path_error(nvme_sc))
return PR_STS_PATH_FAILED;
- switch (nvme_sc) {
+ switch (nvme_sc & 0x7ff) {
case NVME_SC_SUCCESS:
return PR_STS_SUCCESS;
case NVME_SC_RESERVATION_CONFLICT:
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 123/267] drm/panel: sitronix-st7789v: Add check for of_drm_get_panel_orientation
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (121 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 122/267] nvme: fix nvme_pr_* status code parsing Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 124/267] platform/x86: dell-smbios: Fix wrong token data in sysfs Greg Kroah-Hartman
` (155 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Chen Ni, Michael Riesch,
Jessica Zhang, Neil Armstrong, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Ni <nichen@iscas.ac.cn>
[ Upstream commit 629f2b4e05225e53125aaf7ff0b87d5d53897128 ]
Add check for the return value of of_drm_get_panel_orientation() and
return the error if it fails in order to catch the error.
Fixes: b27c0f6d208d ("drm/panel: sitronix-st7789v: add panel orientation support")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Michael Riesch <michael.riesch@wolfvision.net>
Acked-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Link: https://lore.kernel.org/r/20240528030832.2529471-1-nichen@iscas.ac.cn
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240528030832.2529471-1-nichen@iscas.ac.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/panel/panel-sitronix-st7789v.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7789v.c b/drivers/gpu/drm/panel/panel-sitronix-st7789v.c
index e8f385b9c6182..28bfc48a91272 100644
--- a/drivers/gpu/drm/panel/panel-sitronix-st7789v.c
+++ b/drivers/gpu/drm/panel/panel-sitronix-st7789v.c
@@ -643,7 +643,9 @@ static int st7789v_probe(struct spi_device *spi)
if (ret)
return dev_err_probe(dev, ret, "Failed to get backlight\n");
- of_drm_get_panel_orientation(spi->dev.of_node, &ctx->orientation);
+ ret = of_drm_get_panel_orientation(spi->dev.of_node, &ctx->orientation);
+ if (ret)
+ return dev_err_probe(&spi->dev, ret, "Failed to get orientation\n");
drm_panel_add(&ctx->panel);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 124/267] platform/x86: dell-smbios: Fix wrong token data in sysfs
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (122 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 123/267] drm/panel: sitronix-st7789v: Add check for of_drm_get_panel_orientation Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 125/267] gpio: tqmx86: fix typo in Kconfig label Greg Kroah-Hartman
` (154 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Armin Wolf, Ilpo Järvinen,
Hans de Goede, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Armin Wolf <W_Armin@gmx.de>
[ Upstream commit 1981b296f858010eae409548fd297659b2cc570e ]
When reading token data from sysfs on my Inspiron 3505, the token
locations and values are wrong. This happens because match_attribute()
blindly assumes that all entries in da_tokens have an associated
entry in token_attrs.
This however is not true as soon as da_tokens[] contains zeroed
token entries. Those entries are being skipped when initialising
token_attrs, breaking the core assumption of match_attribute().
Fix this by defining an extra struct for each pair of token attributes
and use container_of() to retrieve token information.
Tested on a Dell Inspiron 3050.
Fixes: 33b9ca1e53b4 ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens")
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20240528204903.445546-1-W_Armin@gmx.de
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/platform/x86/dell/dell-smbios-base.c | 92 ++++++++------------
1 file changed, 36 insertions(+), 56 deletions(-)
diff --git a/drivers/platform/x86/dell/dell-smbios-base.c b/drivers/platform/x86/dell/dell-smbios-base.c
index e61bfaf8b5c48..86b95206cb1bd 100644
--- a/drivers/platform/x86/dell/dell-smbios-base.c
+++ b/drivers/platform/x86/dell/dell-smbios-base.c
@@ -11,6 +11,7 @@
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/container_of.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/capability.h>
@@ -25,11 +26,16 @@ static u32 da_supported_commands;
static int da_num_tokens;
static struct platform_device *platform_device;
static struct calling_interface_token *da_tokens;
-static struct device_attribute *token_location_attrs;
-static struct device_attribute *token_value_attrs;
+static struct token_sysfs_data *token_entries;
static struct attribute **token_attrs;
static DEFINE_MUTEX(smbios_mutex);
+struct token_sysfs_data {
+ struct device_attribute location_attr;
+ struct device_attribute value_attr;
+ struct calling_interface_token *token;
+};
+
struct smbios_device {
struct list_head list;
struct device *device;
@@ -416,47 +422,26 @@ static void __init find_tokens(const struct dmi_header *dm, void *dummy)
}
}
-static int match_attribute(struct device *dev,
- struct device_attribute *attr)
-{
- int i;
-
- for (i = 0; i < da_num_tokens * 2; i++) {
- if (!token_attrs[i])
- continue;
- if (strcmp(token_attrs[i]->name, attr->attr.name) == 0)
- return i/2;
- }
- dev_dbg(dev, "couldn't match: %s\n", attr->attr.name);
- return -EINVAL;
-}
-
static ssize_t location_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- int i;
+ struct token_sysfs_data *data = container_of(attr, struct token_sysfs_data, location_attr);
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- i = match_attribute(dev, attr);
- if (i > 0)
- return sysfs_emit(buf, "%08x", da_tokens[i].location);
- return 0;
+ return sysfs_emit(buf, "%08x", data->token->location);
}
static ssize_t value_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- int i;
+ struct token_sysfs_data *data = container_of(attr, struct token_sysfs_data, value_attr);
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
- i = match_attribute(dev, attr);
- if (i > 0)
- return sysfs_emit(buf, "%08x", da_tokens[i].value);
- return 0;
+ return sysfs_emit(buf, "%08x", data->token->value);
}
static struct attribute_group smbios_attribute_group = {
@@ -473,22 +458,15 @@ static int build_tokens_sysfs(struct platform_device *dev)
{
char *location_name;
char *value_name;
- size_t size;
int ret;
int i, j;
- /* (number of tokens + 1 for null terminated */
- size = sizeof(struct device_attribute) * (da_num_tokens + 1);
- token_location_attrs = kzalloc(size, GFP_KERNEL);
- if (!token_location_attrs)
+ token_entries = kcalloc(da_num_tokens, sizeof(*token_entries), GFP_KERNEL);
+ if (!token_entries)
return -ENOMEM;
- token_value_attrs = kzalloc(size, GFP_KERNEL);
- if (!token_value_attrs)
- goto out_allocate_value;
/* need to store both location and value + terminator*/
- size = sizeof(struct attribute *) * ((2 * da_num_tokens) + 1);
- token_attrs = kzalloc(size, GFP_KERNEL);
+ token_attrs = kcalloc((2 * da_num_tokens) + 1, sizeof(*token_attrs), GFP_KERNEL);
if (!token_attrs)
goto out_allocate_attrs;
@@ -496,27 +474,32 @@ static int build_tokens_sysfs(struct platform_device *dev)
/* skip empty */
if (da_tokens[i].tokenID == 0)
continue;
+
+ token_entries[i].token = &da_tokens[i];
+
/* add location */
location_name = kasprintf(GFP_KERNEL, "%04x_location",
da_tokens[i].tokenID);
if (location_name == NULL)
goto out_unwind_strings;
- sysfs_attr_init(&token_location_attrs[i].attr);
- token_location_attrs[i].attr.name = location_name;
- token_location_attrs[i].attr.mode = 0444;
- token_location_attrs[i].show = location_show;
- token_attrs[j++] = &token_location_attrs[i].attr;
+
+ sysfs_attr_init(&token_entries[i].location_attr.attr);
+ token_entries[i].location_attr.attr.name = location_name;
+ token_entries[i].location_attr.attr.mode = 0444;
+ token_entries[i].location_attr.show = location_show;
+ token_attrs[j++] = &token_entries[i].location_attr.attr;
/* add value */
value_name = kasprintf(GFP_KERNEL, "%04x_value",
da_tokens[i].tokenID);
if (value_name == NULL)
goto loop_fail_create_value;
- sysfs_attr_init(&token_value_attrs[i].attr);
- token_value_attrs[i].attr.name = value_name;
- token_value_attrs[i].attr.mode = 0444;
- token_value_attrs[i].show = value_show;
- token_attrs[j++] = &token_value_attrs[i].attr;
+
+ sysfs_attr_init(&token_entries[i].value_attr.attr);
+ token_entries[i].value_attr.attr.name = value_name;
+ token_entries[i].value_attr.attr.mode = 0444;
+ token_entries[i].value_attr.show = value_show;
+ token_attrs[j++] = &token_entries[i].value_attr.attr;
continue;
loop_fail_create_value:
@@ -532,14 +515,12 @@ static int build_tokens_sysfs(struct platform_device *dev)
out_unwind_strings:
while (i--) {
- kfree(token_location_attrs[i].attr.name);
- kfree(token_value_attrs[i].attr.name);
+ kfree(token_entries[i].location_attr.attr.name);
+ kfree(token_entries[i].value_attr.attr.name);
}
kfree(token_attrs);
out_allocate_attrs:
- kfree(token_value_attrs);
-out_allocate_value:
- kfree(token_location_attrs);
+ kfree(token_entries);
return -ENOMEM;
}
@@ -551,12 +532,11 @@ static void free_group(struct platform_device *pdev)
sysfs_remove_group(&pdev->dev.kobj,
&smbios_attribute_group);
for (i = 0; i < da_num_tokens; i++) {
- kfree(token_location_attrs[i].attr.name);
- kfree(token_value_attrs[i].attr.name);
+ kfree(token_entries[i].location_attr.attr.name);
+ kfree(token_entries[i].value_attr.attr.name);
}
kfree(token_attrs);
- kfree(token_value_attrs);
- kfree(token_location_attrs);
+ kfree(token_entries);
}
static int __init dell_smbios_init(void)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 125/267] gpio: tqmx86: fix typo in Kconfig label
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (123 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 124/267] platform/x86: dell-smbios: Fix wrong token data in sysfs Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 126/267] gpio: tqmx86: introduce shadow register for GPIO output value Greg Kroah-Hartman
` (153 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gregor Herburger, Matthias Schiffer,
Andrew Lunn, Bartosz Golaszewski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gregor Herburger <gregor.herburger@tq-group.com>
[ Upstream commit 8c219e52ca4d9a67cd6a7074e91bf29b55edc075 ]
Fix description for GPIO_TQMX86 from QTMX86 to TQMx86.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/e0e38c9944ad6d281d9a662a45d289b88edc808e.1717063994.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpio/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index d56b835359d3b..ebd4e113dc265 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -1507,7 +1507,7 @@ config GPIO_TPS68470
are "output only" GPIOs.
config GPIO_TQMX86
- tristate "TQ-Systems QTMX86 GPIO"
+ tristate "TQ-Systems TQMx86 GPIO"
depends on MFD_TQMX86 || COMPILE_TEST
depends on HAS_IOPORT_MAP
select GPIOLIB_IRQCHIP
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 126/267] gpio: tqmx86: introduce shadow register for GPIO output value
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (124 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 125/267] gpio: tqmx86: fix typo in Kconfig label Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 127/267] gpio: tqmx86: store IRQ trigger type and unmask status separately Greg Kroah-Hartman
` (152 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthias Schiffer, Andrew Lunn,
Bartosz Golaszewski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
[ Upstream commit 9d6a811b522ba558bcb4ec01d12e72a0af8e9f6e ]
The TQMx86 GPIO controller uses the same register address for input and
output data. Reading the register will always return current inputs
rather than the previously set outputs (regardless of the current
direction setting). Therefore, using a RMW pattern does not make sense
when setting output values. Instead, the previously set output register
value needs to be stored as a shadow register.
As there is no reliable way to get the current output values from the
hardware, also initialize all channels to 0, to ensure that stored and
actual output values match. This should usually not have any effect in
practise, as the TQMx86 UEFI sets all outputs to 0 during boot.
Also prepare for extension of the driver to more than 8 GPIOs by using
DECLARE_BITMAP.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/d0555933becd45fa92a85675d26e4d59343ddc01.1717063994.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpio/gpio-tqmx86.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c
index 3a28c1f273c39..b7e2dbbdc4ebe 100644
--- a/drivers/gpio/gpio-tqmx86.c
+++ b/drivers/gpio/gpio-tqmx86.c
@@ -6,6 +6,7 @@
* Vadim V.Vlasov <vvlasov@dev.rtsoft.ru>
*/
+#include <linux/bitmap.h>
#include <linux/bitops.h>
#include <linux/errno.h>
#include <linux/gpio/driver.h>
@@ -38,6 +39,7 @@ struct tqmx86_gpio_data {
void __iomem *io_base;
int irq;
raw_spinlock_t spinlock;
+ DECLARE_BITMAP(output, TQMX86_NGPIO);
u8 irq_type[TQMX86_NGPI];
};
@@ -64,15 +66,10 @@ static void tqmx86_gpio_set(struct gpio_chip *chip, unsigned int offset,
{
struct tqmx86_gpio_data *gpio = gpiochip_get_data(chip);
unsigned long flags;
- u8 val;
raw_spin_lock_irqsave(&gpio->spinlock, flags);
- val = tqmx86_gpio_read(gpio, TQMX86_GPIOD);
- if (value)
- val |= BIT(offset);
- else
- val &= ~BIT(offset);
- tqmx86_gpio_write(gpio, val, TQMX86_GPIOD);
+ __assign_bit(offset, gpio->output, value);
+ tqmx86_gpio_write(gpio, bitmap_get_value8(gpio->output, 0), TQMX86_GPIOD);
raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
}
@@ -277,6 +274,13 @@ static int tqmx86_gpio_probe(struct platform_device *pdev)
tqmx86_gpio_write(gpio, (u8)~TQMX86_DIR_INPUT_MASK, TQMX86_GPIODD);
+ /*
+ * Reading the previous output state is not possible with TQMx86 hardware.
+ * Initialize all outputs to 0 to have a defined state that matches the
+ * shadow register.
+ */
+ tqmx86_gpio_write(gpio, 0, TQMX86_GPIOD);
+
chip = &gpio->chip;
chip->label = "gpio-tqmx86";
chip->owner = THIS_MODULE;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 127/267] gpio: tqmx86: store IRQ trigger type and unmask status separately
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (125 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 126/267] gpio: tqmx86: introduce shadow register for GPIO output value Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 128/267] gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type Greg Kroah-Hartman
` (151 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthias Schiffer,
Bartosz Golaszewski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
[ Upstream commit 08af509efdf8dad08e972b48de0e2c2a7919ea8b ]
irq_set_type() should not implicitly unmask the IRQ.
All accesses to the interrupt configuration register are moved to a new
helper tqmx86_gpio_irq_config(). We also introduce the new rule that
accessing irq_type must happen while locked, which will become
significant for fixing EDGE_BOTH handling.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Link: https://lore.kernel.org/r/6aa4f207f77cb58ef64ffb947e91949b0f753ccd.1717063994.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpio/gpio-tqmx86.c | 48 ++++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c
index b7e2dbbdc4ebe..7e428c872a257 100644
--- a/drivers/gpio/gpio-tqmx86.c
+++ b/drivers/gpio/gpio-tqmx86.c
@@ -29,15 +29,19 @@
#define TQMX86_GPIIC 3 /* GPI Interrupt Configuration Register */
#define TQMX86_GPIIS 4 /* GPI Interrupt Status Register */
+#define TQMX86_GPII_NONE 0
#define TQMX86_GPII_FALLING BIT(0)
#define TQMX86_GPII_RISING BIT(1)
#define TQMX86_GPII_MASK (BIT(0) | BIT(1))
#define TQMX86_GPII_BITS 2
+/* Stored in irq_type with GPII bits */
+#define TQMX86_INT_UNMASKED BIT(2)
struct tqmx86_gpio_data {
struct gpio_chip chip;
void __iomem *io_base;
int irq;
+ /* Lock must be held for accessing output and irq_type fields */
raw_spinlock_t spinlock;
DECLARE_BITMAP(output, TQMX86_NGPIO);
u8 irq_type[TQMX86_NGPI];
@@ -104,21 +108,32 @@ static int tqmx86_gpio_get_direction(struct gpio_chip *chip,
return GPIO_LINE_DIRECTION_OUT;
}
+static void tqmx86_gpio_irq_config(struct tqmx86_gpio_data *gpio, int offset)
+ __must_hold(&gpio->spinlock)
+{
+ u8 type = TQMX86_GPII_NONE, gpiic;
+
+ if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED)
+ type = gpio->irq_type[offset] & TQMX86_GPII_MASK;
+
+ gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC);
+ gpiic &= ~(TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS));
+ gpiic |= type << (offset * TQMX86_GPII_BITS);
+ tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC);
+}
+
static void tqmx86_gpio_irq_mask(struct irq_data *data)
{
unsigned int offset = (data->hwirq - TQMX86_NGPO);
struct tqmx86_gpio_data *gpio = gpiochip_get_data(
irq_data_get_irq_chip_data(data));
unsigned long flags;
- u8 gpiic, mask;
-
- mask = TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS);
raw_spin_lock_irqsave(&gpio->spinlock, flags);
- gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC);
- gpiic &= ~mask;
- tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC);
+ gpio->irq_type[offset] &= ~TQMX86_INT_UNMASKED;
+ tqmx86_gpio_irq_config(gpio, offset);
raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
+
gpiochip_disable_irq(&gpio->chip, irqd_to_hwirq(data));
}
@@ -128,16 +143,12 @@ static void tqmx86_gpio_irq_unmask(struct irq_data *data)
struct tqmx86_gpio_data *gpio = gpiochip_get_data(
irq_data_get_irq_chip_data(data));
unsigned long flags;
- u8 gpiic, mask;
-
- mask = TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS);
gpiochip_enable_irq(&gpio->chip, irqd_to_hwirq(data));
+
raw_spin_lock_irqsave(&gpio->spinlock, flags);
- gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC);
- gpiic &= ~mask;
- gpiic |= gpio->irq_type[offset] << (offset * TQMX86_GPII_BITS);
- tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC);
+ gpio->irq_type[offset] |= TQMX86_INT_UNMASKED;
+ tqmx86_gpio_irq_config(gpio, offset);
raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
}
@@ -148,7 +159,7 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type)
unsigned int offset = (data->hwirq - TQMX86_NGPO);
unsigned int edge_type = type & IRQF_TRIGGER_MASK;
unsigned long flags;
- u8 new_type, gpiic;
+ u8 new_type;
switch (edge_type) {
case IRQ_TYPE_EDGE_RISING:
@@ -164,13 +175,10 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type)
return -EINVAL; /* not supported */
}
- gpio->irq_type[offset] = new_type;
-
raw_spin_lock_irqsave(&gpio->spinlock, flags);
- gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC);
- gpiic &= ~((TQMX86_GPII_MASK) << (offset * TQMX86_GPII_BITS));
- gpiic |= new_type << (offset * TQMX86_GPII_BITS);
- tqmx86_gpio_write(gpio, gpiic, TQMX86_GPIIC);
+ gpio->irq_type[offset] &= ~TQMX86_GPII_MASK;
+ gpio->irq_type[offset] |= new_type;
+ tqmx86_gpio_irq_config(gpio, offset);
raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
return 0;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 128/267] gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (126 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 127/267] gpio: tqmx86: store IRQ trigger type and unmask status separately Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 129/267] HID: core: remove unnecessary WARN_ON() in implement() Greg Kroah-Hartman
` (150 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gregor Herburger, Matthias Schiffer,
Bartosz Golaszewski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
[ Upstream commit 90dd7de4ef7ba584823dfbeba834c2919a4bb55b ]
The TQMx86 GPIO controller only supports falling and rising edge
triggers, but not both. Fix this by implementing a software both-edge
mode that toggles the edge type after every interrupt.
Fixes: b868db94a6a7 ("gpio: tqmx86: Add GPIO from for this IO controller")
Co-developed-by: Gregor Herburger <gregor.herburger@tq-group.com>
Signed-off-by: Gregor Herburger <gregor.herburger@tq-group.com>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Link: https://lore.kernel.org/r/515324f0491c4d44f4ef49f170354aca002d81ef.1717063994.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpio/gpio-tqmx86.c | 46 ++++++++++++++++++++++++++++++++++----
1 file changed, 42 insertions(+), 4 deletions(-)
diff --git a/drivers/gpio/gpio-tqmx86.c b/drivers/gpio/gpio-tqmx86.c
index 7e428c872a257..f2e7e8754d95d 100644
--- a/drivers/gpio/gpio-tqmx86.c
+++ b/drivers/gpio/gpio-tqmx86.c
@@ -32,6 +32,10 @@
#define TQMX86_GPII_NONE 0
#define TQMX86_GPII_FALLING BIT(0)
#define TQMX86_GPII_RISING BIT(1)
+/* Stored in irq_type as a trigger type, but not actually valid as a register
+ * value, so the name doesn't use "GPII"
+ */
+#define TQMX86_INT_BOTH (BIT(0) | BIT(1))
#define TQMX86_GPII_MASK (BIT(0) | BIT(1))
#define TQMX86_GPII_BITS 2
/* Stored in irq_type with GPII bits */
@@ -113,9 +117,15 @@ static void tqmx86_gpio_irq_config(struct tqmx86_gpio_data *gpio, int offset)
{
u8 type = TQMX86_GPII_NONE, gpiic;
- if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED)
+ if (gpio->irq_type[offset] & TQMX86_INT_UNMASKED) {
type = gpio->irq_type[offset] & TQMX86_GPII_MASK;
+ if (type == TQMX86_INT_BOTH)
+ type = tqmx86_gpio_get(&gpio->chip, offset + TQMX86_NGPO)
+ ? TQMX86_GPII_FALLING
+ : TQMX86_GPII_RISING;
+ }
+
gpiic = tqmx86_gpio_read(gpio, TQMX86_GPIIC);
gpiic &= ~(TQMX86_GPII_MASK << (offset * TQMX86_GPII_BITS));
gpiic |= type << (offset * TQMX86_GPII_BITS);
@@ -169,7 +179,7 @@ static int tqmx86_gpio_irq_set_type(struct irq_data *data, unsigned int type)
new_type = TQMX86_GPII_FALLING;
break;
case IRQ_TYPE_EDGE_BOTH:
- new_type = TQMX86_GPII_FALLING | TQMX86_GPII_RISING;
+ new_type = TQMX86_INT_BOTH;
break;
default:
return -EINVAL; /* not supported */
@@ -189,8 +199,8 @@ static void tqmx86_gpio_irq_handler(struct irq_desc *desc)
struct gpio_chip *chip = irq_desc_get_handler_data(desc);
struct tqmx86_gpio_data *gpio = gpiochip_get_data(chip);
struct irq_chip *irq_chip = irq_desc_get_chip(desc);
- unsigned long irq_bits;
- int i = 0;
+ unsigned long irq_bits, flags;
+ int i;
u8 irq_status;
chained_irq_enter(irq_chip, desc);
@@ -199,6 +209,34 @@ static void tqmx86_gpio_irq_handler(struct irq_desc *desc)
tqmx86_gpio_write(gpio, irq_status, TQMX86_GPIIS);
irq_bits = irq_status;
+
+ raw_spin_lock_irqsave(&gpio->spinlock, flags);
+ for_each_set_bit(i, &irq_bits, TQMX86_NGPI) {
+ /*
+ * Edge-both triggers are implemented by flipping the edge
+ * trigger after each interrupt, as the controller only supports
+ * either rising or falling edge triggers, but not both.
+ *
+ * Internally, the TQMx86 GPIO controller has separate status
+ * registers for rising and falling edge interrupts. GPIIC
+ * configures which bits from which register are visible in the
+ * interrupt status register GPIIS and defines what triggers the
+ * parent IRQ line. Writing to GPIIS always clears both rising
+ * and falling interrupt flags internally, regardless of the
+ * currently configured trigger.
+ *
+ * In consequence, we can cleanly implement the edge-both
+ * trigger in software by first clearing the interrupt and then
+ * setting the new trigger based on the current GPIO input in
+ * tqmx86_gpio_irq_config() - even if an edge arrives between
+ * reading the input and setting the trigger, we will have a new
+ * interrupt pending.
+ */
+ if ((gpio->irq_type[i] & TQMX86_GPII_MASK) == TQMX86_INT_BOTH)
+ tqmx86_gpio_irq_config(gpio, i);
+ }
+ raw_spin_unlock_irqrestore(&gpio->spinlock, flags);
+
for_each_set_bit(i, &irq_bits, TQMX86_NGPI)
generic_handle_domain_irq(gpio->chip.irq.domain,
i + TQMX86_NGPO);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 129/267] HID: core: remove unnecessary WARN_ON() in implement()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (127 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 128/267] gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 130/267] iommu/amd: Fix sysfs leak in iommu init Greg Kroah-Hartman
` (149 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+5186630949e3c55f0799,
Alan Stern, Nikita Zhandarovich, Jiri Kosina, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
[ Upstream commit 4aa2dcfbad538adf7becd0034a3754e1bd01b2b5 ]
Syzkaller hit a warning [1] in a call to implement() when trying
to write a value into a field of smaller size in an output report.
Since implement() already has a warn message printed out with the
help of hid_warn() and value in question gets trimmed with:
...
value &= m;
...
WARN_ON may be considered superfluous. Remove it to suppress future
syzkaller triggers.
[1]
WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 implement drivers/hid/hid-core.c:1451 [inline]
WARNING: CPU: 0 PID: 5084 at drivers/hid/hid-core.c:1451 hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
Modules linked in:
CPU: 0 PID: 5084 Comm: syz-executor424 Not tainted 6.9.0-rc7-syzkaller-00183-gcf87f46fd34d #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
RIP: 0010:implement drivers/hid/hid-core.c:1451 [inline]
RIP: 0010:hid_output_report+0x548/0x760 drivers/hid/hid-core.c:1863
...
Call Trace:
<TASK>
__usbhid_submit_report drivers/hid/usbhid/hid-core.c:591 [inline]
usbhid_submit_report+0x43d/0x9e0 drivers/hid/usbhid/hid-core.c:636
hiddev_ioctl+0x138b/0x1f00 drivers/hid/usbhid/hiddev.c:726
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:904 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:890
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
Fixes: 95d1c8951e5b ("HID: simplify implement() a bit")
Reported-by: <syzbot+5186630949e3c55f0799@syzkaller.appspotmail.com>
Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hid/hid-core.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index e0181218ad857..85ddeb13a3fae 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1448,7 +1448,6 @@ static void implement(const struct hid_device *hid, u8 *report,
hid_warn(hid,
"%s() called with too large value %d (n: %d)! (%s)\n",
__func__, value, n, current->comm);
- WARN_ON(1);
value &= m;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 130/267] iommu/amd: Fix sysfs leak in iommu init
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (128 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 129/267] HID: core: remove unnecessary WARN_ON() in implement() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 131/267] iommu: Return right value in iommu_sva_bind_device() Greg Kroah-Hartman
` (148 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kun(llfl), Suravee Suthikulpanit,
Joerg Roedel, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kun(llfl) <llfl@linux.alibaba.com>
[ Upstream commit a295ec52c8624883885396fde7b4df1a179627c3 ]
During the iommu initialization, iommu_init_pci() adds sysfs nodes.
However, these nodes aren't remove in free_iommu_resources() subsequently.
Fixes: 39ab9555c241 ("iommu: Add sysfs bindings for struct iommu_device")
Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/c8e0d11c6ab1ee48299c288009cf9c5dae07b42d.1715215003.git.llfl@linux.alibaba.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/iommu/amd/init.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index a2ad2dbd04d92..ef3fae113dd64 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1692,8 +1692,17 @@ static void __init free_pci_segments(void)
}
}
+static void __init free_sysfs(struct amd_iommu *iommu)
+{
+ if (iommu->iommu.dev) {
+ iommu_device_unregister(&iommu->iommu);
+ iommu_device_sysfs_remove(&iommu->iommu);
+ }
+}
+
static void __init free_iommu_one(struct amd_iommu *iommu)
{
+ free_sysfs(iommu);
free_cwwb_sem(iommu);
free_command_buffer(iommu);
free_event_buffer(iommu);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 131/267] iommu: Return right value in iommu_sva_bind_device()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (129 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 130/267] iommu/amd: Fix sysfs leak in iommu init Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 132/267] io_uring/io-wq: Use set_bit() and test_bit() at worker->flags Greg Kroah-Hartman
` (147 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Lu Baolu, Jean-Philippe Brucker,
Kevin Tian, Vasant Hegde, Joerg Roedel, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lu Baolu <baolu.lu@linux.intel.com>
[ Upstream commit 89e8a2366e3bce584b6c01549d5019c5cda1205e ]
iommu_sva_bind_device() should return either a sva bond handle or an
ERR_PTR value in error cases. Existing drivers (idxd and uacce) only
check the return value with IS_ERR(). This could potentially lead to
a kernel NULL pointer dereference issue if the function returns NULL
instead of an error pointer.
In reality, this doesn't cause any problems because iommu_sva_bind_device()
only returns NULL when the kernel is not configured with CONFIG_IOMMU_SVA.
In this case, iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) will
return an error, and the device drivers won't call iommu_sva_bind_device()
at all.
Fixes: 26b25a2b98e4 ("iommu: Bind process address spaces to devices")
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240528042528.71396-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/iommu.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 0225cf7445de2..b6ef263e85c06 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1199,7 +1199,7 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle);
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
{
- return NULL;
+ return ERR_PTR(-ENODEV);
}
static inline void iommu_sva_unbind_device(struct iommu_sva *handle)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 132/267] io_uring/io-wq: Use set_bit() and test_bit() at worker->flags
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (130 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 131/267] iommu: Return right value in iommu_sva_bind_device() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 133/267] io_uring/io-wq: avoid garbage value of match in io_wq_enqueue() Greg Kroah-Hartman
` (146 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Breno Leitao, Jens Axboe,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Breno Leitao <leitao@debian.org>
[ Upstream commit 8a565304927fbd28c9f028c492b5c1714002cbab ]
Utilize set_bit() and test_bit() on worker->flags within io_uring/io-wq
to address potential data races.
The structure io_worker->flags may be accessed through various data
paths, leading to concurrency issues. When KCSAN is enabled, it reveals
data races occurring in io_worker_handle_work and
io_wq_activate_free_worker functions.
BUG: KCSAN: data-race in io_worker_handle_work / io_wq_activate_free_worker
write to 0xffff8885c4246404 of 4 bytes by task 49071 on cpu 28:
io_worker_handle_work (io_uring/io-wq.c:434 io_uring/io-wq.c:569)
io_wq_worker (io_uring/io-wq.c:?)
<snip>
read to 0xffff8885c4246404 of 4 bytes by task 49024 on cpu 5:
io_wq_activate_free_worker (io_uring/io-wq.c:? io_uring/io-wq.c:285)
io_wq_enqueue (io_uring/io-wq.c:947)
io_queue_iowq (io_uring/io_uring.c:524)
io_req_task_submit (io_uring/io_uring.c:1511)
io_handle_tw_list (io_uring/io_uring.c:1198)
<snip>
Line numbers against commit 18daea77cca6 ("Merge tag 'for-linus' of
git://git.kernel.org/pub/scm/virt/kvm/kvm").
These races involve writes and reads to the same memory location by
different tasks running on different CPUs. To mitigate this, refactor
the code to use atomic operations such as set_bit(), test_bit(), and
clear_bit() instead of basic "and" and "or" operations. This ensures
thread-safe manipulation of worker flags.
Also, move `create_index` to avoid holes in the structure.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240507170002.2269003-1-leitao@debian.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stable-dep-of: 91215f70ea85 ("io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
io_uring/io-wq.c | 47 ++++++++++++++++++++++++-----------------------
1 file changed, 24 insertions(+), 23 deletions(-)
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 318ed067dbf64..4a07742349048 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -25,10 +25,10 @@
#define WORKER_IDLE_TIMEOUT (5 * HZ)
enum {
- IO_WORKER_F_UP = 1, /* up and active */
- IO_WORKER_F_RUNNING = 2, /* account as running */
- IO_WORKER_F_FREE = 4, /* worker on free list */
- IO_WORKER_F_BOUND = 8, /* is doing bounded work */
+ IO_WORKER_F_UP = 0, /* up and active */
+ IO_WORKER_F_RUNNING = 1, /* account as running */
+ IO_WORKER_F_FREE = 2, /* worker on free list */
+ IO_WORKER_F_BOUND = 3, /* is doing bounded work */
};
enum {
@@ -44,7 +44,8 @@ enum {
*/
struct io_worker {
refcount_t ref;
- unsigned flags;
+ int create_index;
+ unsigned long flags;
struct hlist_nulls_node nulls_node;
struct list_head all_list;
struct task_struct *task;
@@ -58,7 +59,6 @@ struct io_worker {
unsigned long create_state;
struct callback_head create_work;
- int create_index;
union {
struct rcu_head rcu;
@@ -165,7 +165,7 @@ static inline struct io_wq_acct *io_work_get_acct(struct io_wq *wq,
static inline struct io_wq_acct *io_wq_get_acct(struct io_worker *worker)
{
- return io_get_acct(worker->wq, worker->flags & IO_WORKER_F_BOUND);
+ return io_get_acct(worker->wq, test_bit(IO_WORKER_F_BOUND, &worker->flags));
}
static void io_worker_ref_put(struct io_wq *wq)
@@ -225,7 +225,7 @@ static void io_worker_exit(struct io_worker *worker)
wait_for_completion(&worker->ref_done);
raw_spin_lock(&wq->lock);
- if (worker->flags & IO_WORKER_F_FREE)
+ if (test_bit(IO_WORKER_F_FREE, &worker->flags))
hlist_nulls_del_rcu(&worker->nulls_node);
list_del_rcu(&worker->all_list);
raw_spin_unlock(&wq->lock);
@@ -410,7 +410,7 @@ static void io_wq_dec_running(struct io_worker *worker)
struct io_wq_acct *acct = io_wq_get_acct(worker);
struct io_wq *wq = worker->wq;
- if (!(worker->flags & IO_WORKER_F_UP))
+ if (!test_bit(IO_WORKER_F_UP, &worker->flags))
return;
if (!atomic_dec_and_test(&acct->nr_running))
@@ -430,8 +430,8 @@ static void io_wq_dec_running(struct io_worker *worker)
*/
static void __io_worker_busy(struct io_wq *wq, struct io_worker *worker)
{
- if (worker->flags & IO_WORKER_F_FREE) {
- worker->flags &= ~IO_WORKER_F_FREE;
+ if (test_bit(IO_WORKER_F_FREE, &worker->flags)) {
+ clear_bit(IO_WORKER_F_FREE, &worker->flags);
raw_spin_lock(&wq->lock);
hlist_nulls_del_init_rcu(&worker->nulls_node);
raw_spin_unlock(&wq->lock);
@@ -444,8 +444,8 @@ static void __io_worker_busy(struct io_wq *wq, struct io_worker *worker)
static void __io_worker_idle(struct io_wq *wq, struct io_worker *worker)
__must_hold(wq->lock)
{
- if (!(worker->flags & IO_WORKER_F_FREE)) {
- worker->flags |= IO_WORKER_F_FREE;
+ if (!test_bit(IO_WORKER_F_FREE, &worker->flags)) {
+ set_bit(IO_WORKER_F_FREE, &worker->flags);
hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list);
}
}
@@ -634,7 +634,8 @@ static int io_wq_worker(void *data)
bool exit_mask = false, last_timeout = false;
char buf[TASK_COMM_LEN];
- worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
+ set_mask_bits(&worker->flags, 0,
+ BIT(IO_WORKER_F_UP) | BIT(IO_WORKER_F_RUNNING));
snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
set_task_comm(current, buf);
@@ -698,11 +699,11 @@ void io_wq_worker_running(struct task_struct *tsk)
if (!worker)
return;
- if (!(worker->flags & IO_WORKER_F_UP))
+ if (!test_bit(IO_WORKER_F_UP, &worker->flags))
return;
- if (worker->flags & IO_WORKER_F_RUNNING)
+ if (test_bit(IO_WORKER_F_RUNNING, &worker->flags))
return;
- worker->flags |= IO_WORKER_F_RUNNING;
+ set_bit(IO_WORKER_F_RUNNING, &worker->flags);
io_wq_inc_running(worker);
}
@@ -716,12 +717,12 @@ void io_wq_worker_sleeping(struct task_struct *tsk)
if (!worker)
return;
- if (!(worker->flags & IO_WORKER_F_UP))
+ if (!test_bit(IO_WORKER_F_UP, &worker->flags))
return;
- if (!(worker->flags & IO_WORKER_F_RUNNING))
+ if (!test_bit(IO_WORKER_F_RUNNING, &worker->flags))
return;
- worker->flags &= ~IO_WORKER_F_RUNNING;
+ clear_bit(IO_WORKER_F_RUNNING, &worker->flags);
io_wq_dec_running(worker);
}
@@ -735,7 +736,7 @@ static void io_init_new_worker(struct io_wq *wq, struct io_worker *worker,
raw_spin_lock(&wq->lock);
hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list);
list_add_tail_rcu(&worker->all_list, &wq->all_list);
- worker->flags |= IO_WORKER_F_FREE;
+ set_bit(IO_WORKER_F_FREE, &worker->flags);
raw_spin_unlock(&wq->lock);
wake_up_new_task(tsk);
}
@@ -841,7 +842,7 @@ static bool create_io_worker(struct io_wq *wq, int index)
init_completion(&worker->ref_done);
if (index == IO_WQ_ACCT_BOUND)
- worker->flags |= IO_WORKER_F_BOUND;
+ set_bit(IO_WORKER_F_BOUND, &worker->flags);
tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
if (!IS_ERR(tsk)) {
@@ -927,8 +928,8 @@ static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
{
struct io_wq_acct *acct = io_work_get_acct(wq, work);
+ unsigned long work_flags = work->flags;
struct io_cb_cancel_data match;
- unsigned work_flags = work->flags;
bool do_create;
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 133/267] io_uring/io-wq: avoid garbage value of match in io_wq_enqueue()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (131 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 132/267] io_uring/io-wq: Use set_bit() and test_bit() at worker->flags Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 134/267] HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() Greg Kroah-Hartman
` (145 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Su Hui, Jens Axboe, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui <suhui@nfschina.com>
[ Upstream commit 91215f70ea8541e9011c0b48f8b59b9e0ce6953b ]
Clang static checker (scan-build) warning:
o_uring/io-wq.c:line 1051, column 3
The expression is an uninitialized value. The computed value will
also be garbage.
'match.nr_pending' is used in io_acct_cancel_pending_work(), but it is
not fully initialized. Change the order of assignment for 'match' to fix
this problem.
Fixes: 42abc95f05bf ("io-wq: decouple work_list protection from the big wqe->lock")
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20240604121242.2661244-1-suhui@nfschina.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
io_uring/io-wq.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 4a07742349048..8a99aabcac2c3 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -929,7 +929,11 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
{
struct io_wq_acct *acct = io_work_get_acct(wq, work);
unsigned long work_flags = work->flags;
- struct io_cb_cancel_data match;
+ struct io_cb_cancel_data match = {
+ .fn = io_wq_work_match_item,
+ .data = work,
+ .cancel_all = false,
+ };
bool do_create;
/*
@@ -967,10 +971,6 @@ void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
raw_spin_unlock(&wq->lock);
/* fatal condition, failed to create the first worker */
- match.fn = io_wq_work_match_item,
- match.data = work,
- match.cancel_all = false,
-
io_acct_cancel_pending_work(wq, acct, &match);
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 134/267] HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (132 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 133/267] io_uring/io-wq: avoid garbage value of match in io_wq_enqueue() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 135/267] drm/vmwgfx: Refactor drm connector probing for display modes Greg Kroah-Hartman
` (144 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, José Expósito, Jiri Kosina,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: José Expósito <jose.exposito89@gmail.com>
[ Upstream commit ce3af2ee95170b7d9e15fff6e500d67deab1e7b3 ]
Fix a memory leak on logi_dj_recv_send_report() error path.
Fixes: 6f20d3261265 ("HID: logitech-dj: Fix error handling in logi_dj_recv_switch_to_dj_mode()")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/hid/hid-logitech-dj.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
index 3c3c497b6b911..37958edec55f5 100644
--- a/drivers/hid/hid-logitech-dj.c
+++ b/drivers/hid/hid-logitech-dj.c
@@ -1284,8 +1284,10 @@ static int logi_dj_recv_switch_to_dj_mode(struct dj_receiver_dev *djrcv_dev,
*/
msleep(50);
- if (retval)
+ if (retval) {
+ kfree(dj_report);
return retval;
+ }
}
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 135/267] drm/vmwgfx: Refactor drm connector probing for display modes
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (133 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 134/267] HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 136/267] drm/vmwgfx: Filter modes which exceed graphics memory Greg Kroah-Hartman
` (143 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Martin Krastev, Zack Rusin,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Krastev <martin.krastev@broadcom.com>
[ Upstream commit 935f795045a6f9b13d28d46ebdad04bfea8750dd ]
Implement drm_connector_helper_funcs.mode_valid and .get_modes,
replacing custom drm_connector_funcs.fill_modes code with
drm_helper_probe_single_connector_modes; for STDU, LDU & SOU
display units.
Signed-off-by: Martin Krastev <martin.krastev@broadcom.com>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240126200804.732454-2-zack.rusin@broadcom.com
Stable-dep-of: 426826933109 ("drm/vmwgfx: Filter modes which exceed graphics memory")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 272 +++++++++------------------
drivers/gpu/drm/vmwgfx/vmwgfx_kms.h | 6 +-
drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c | 5 +-
drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c | 5 +-
drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 4 +-
5 files changed, 101 insertions(+), 191 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index a884072851322..59de170a31853 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -35,6 +35,7 @@
#include <drm/drm_fourcc.h>
#include <drm/drm_rect.h>
#include <drm/drm_sysfs.h>
+#include <drm/drm_edid.h>
void vmw_du_cleanup(struct vmw_display_unit *du)
{
@@ -2279,107 +2280,6 @@ vmw_du_connector_detect(struct drm_connector *connector, bool force)
connector_status_connected : connector_status_disconnected);
}
-static struct drm_display_mode vmw_kms_connector_builtin[] = {
- /* 640x480@60Hz */
- { DRM_MODE("640x480", DRM_MODE_TYPE_DRIVER, 25175, 640, 656,
- 752, 800, 0, 480, 489, 492, 525, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 800x600@60Hz */
- { DRM_MODE("800x600", DRM_MODE_TYPE_DRIVER, 40000, 800, 840,
- 968, 1056, 0, 600, 601, 605, 628, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1024x768@60Hz */
- { DRM_MODE("1024x768", DRM_MODE_TYPE_DRIVER, 65000, 1024, 1048,
- 1184, 1344, 0, 768, 771, 777, 806, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 1152x864@75Hz */
- { DRM_MODE("1152x864", DRM_MODE_TYPE_DRIVER, 108000, 1152, 1216,
- 1344, 1600, 0, 864, 865, 868, 900, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1280x720@60Hz */
- { DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74500, 1280, 1344,
- 1472, 1664, 0, 720, 723, 728, 748, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1280x768@60Hz */
- { DRM_MODE("1280x768", DRM_MODE_TYPE_DRIVER, 79500, 1280, 1344,
- 1472, 1664, 0, 768, 771, 778, 798, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1280x800@60Hz */
- { DRM_MODE("1280x800", DRM_MODE_TYPE_DRIVER, 83500, 1280, 1352,
- 1480, 1680, 0, 800, 803, 809, 831, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 1280x960@60Hz */
- { DRM_MODE("1280x960", DRM_MODE_TYPE_DRIVER, 108000, 1280, 1376,
- 1488, 1800, 0, 960, 961, 964, 1000, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1280x1024@60Hz */
- { DRM_MODE("1280x1024", DRM_MODE_TYPE_DRIVER, 108000, 1280, 1328,
- 1440, 1688, 0, 1024, 1025, 1028, 1066, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1360x768@60Hz */
- { DRM_MODE("1360x768", DRM_MODE_TYPE_DRIVER, 85500, 1360, 1424,
- 1536, 1792, 0, 768, 771, 777, 795, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1440x1050@60Hz */
- { DRM_MODE("1400x1050", DRM_MODE_TYPE_DRIVER, 121750, 1400, 1488,
- 1632, 1864, 0, 1050, 1053, 1057, 1089, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1440x900@60Hz */
- { DRM_MODE("1440x900", DRM_MODE_TYPE_DRIVER, 106500, 1440, 1520,
- 1672, 1904, 0, 900, 903, 909, 934, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1600x1200@60Hz */
- { DRM_MODE("1600x1200", DRM_MODE_TYPE_DRIVER, 162000, 1600, 1664,
- 1856, 2160, 0, 1200, 1201, 1204, 1250, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1680x1050@60Hz */
- { DRM_MODE("1680x1050", DRM_MODE_TYPE_DRIVER, 146250, 1680, 1784,
- 1960, 2240, 0, 1050, 1053, 1059, 1089, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1792x1344@60Hz */
- { DRM_MODE("1792x1344", DRM_MODE_TYPE_DRIVER, 204750, 1792, 1920,
- 2120, 2448, 0, 1344, 1345, 1348, 1394, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1853x1392@60Hz */
- { DRM_MODE("1856x1392", DRM_MODE_TYPE_DRIVER, 218250, 1856, 1952,
- 2176, 2528, 0, 1392, 1393, 1396, 1439, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1920x1080@60Hz */
- { DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 173000, 1920, 2048,
- 2248, 2576, 0, 1080, 1083, 1088, 1120, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1920x1200@60Hz */
- { DRM_MODE("1920x1200", DRM_MODE_TYPE_DRIVER, 193250, 1920, 2056,
- 2256, 2592, 0, 1200, 1203, 1209, 1245, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 1920x1440@60Hz */
- { DRM_MODE("1920x1440", DRM_MODE_TYPE_DRIVER, 234000, 1920, 2048,
- 2256, 2600, 0, 1440, 1441, 1444, 1500, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 2560x1440@60Hz */
- { DRM_MODE("2560x1440", DRM_MODE_TYPE_DRIVER, 241500, 2560, 2608,
- 2640, 2720, 0, 1440, 1443, 1448, 1481, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 2560x1600@60Hz */
- { DRM_MODE("2560x1600", DRM_MODE_TYPE_DRIVER, 348500, 2560, 2752,
- 3032, 3504, 0, 1600, 1603, 1609, 1658, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC) },
- /* 2880x1800@60Hz */
- { DRM_MODE("2880x1800", DRM_MODE_TYPE_DRIVER, 337500, 2880, 2928,
- 2960, 3040, 0, 1800, 1803, 1809, 1852, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 3840x2160@60Hz */
- { DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 533000, 3840, 3888,
- 3920, 4000, 0, 2160, 2163, 2168, 2222, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* 3840x2400@60Hz */
- { DRM_MODE("3840x2400", DRM_MODE_TYPE_DRIVER, 592250, 3840, 3888,
- 3920, 4000, 0, 2400, 2403, 2409, 2469, 0,
- DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_NVSYNC) },
- /* Terminate */
- { DRM_MODE("", 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) },
-};
-
/**
* vmw_guess_mode_timing - Provide fake timings for a
* 60Hz vrefresh mode.
@@ -2401,88 +2301,6 @@ void vmw_guess_mode_timing(struct drm_display_mode *mode)
}
-int vmw_du_connector_fill_modes(struct drm_connector *connector,
- uint32_t max_width, uint32_t max_height)
-{
- struct vmw_display_unit *du = vmw_connector_to_du(connector);
- struct drm_device *dev = connector->dev;
- struct vmw_private *dev_priv = vmw_priv(dev);
- struct drm_display_mode *mode = NULL;
- struct drm_display_mode *bmode;
- struct drm_display_mode prefmode = { DRM_MODE("preferred",
- DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED,
- 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
- DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC)
- };
- int i;
- u32 assumed_bpp = 4;
-
- if (dev_priv->assume_16bpp)
- assumed_bpp = 2;
-
- max_width = min(max_width, dev_priv->texture_max_width);
- max_height = min(max_height, dev_priv->texture_max_height);
-
- /*
- * For STDU extra limit for a mode on SVGA_REG_SCREENTARGET_MAX_WIDTH/
- * HEIGHT registers.
- */
- if (dev_priv->active_display_unit == vmw_du_screen_target) {
- max_width = min(max_width, dev_priv->stdu_max_width);
- max_height = min(max_height, dev_priv->stdu_max_height);
- }
-
- /* Add preferred mode */
- mode = drm_mode_duplicate(dev, &prefmode);
- if (!mode)
- return 0;
- mode->hdisplay = du->pref_width;
- mode->vdisplay = du->pref_height;
- vmw_guess_mode_timing(mode);
- drm_mode_set_name(mode);
-
- if (vmw_kms_validate_mode_vram(dev_priv,
- mode->hdisplay * assumed_bpp,
- mode->vdisplay)) {
- drm_mode_probed_add(connector, mode);
- } else {
- drm_mode_destroy(dev, mode);
- mode = NULL;
- }
-
- if (du->pref_mode) {
- list_del_init(&du->pref_mode->head);
- drm_mode_destroy(dev, du->pref_mode);
- }
-
- /* mode might be null here, this is intended */
- du->pref_mode = mode;
-
- for (i = 0; vmw_kms_connector_builtin[i].type != 0; i++) {
- bmode = &vmw_kms_connector_builtin[i];
- if (bmode->hdisplay > max_width ||
- bmode->vdisplay > max_height)
- continue;
-
- if (!vmw_kms_validate_mode_vram(dev_priv,
- bmode->hdisplay * assumed_bpp,
- bmode->vdisplay))
- continue;
-
- mode = drm_mode_duplicate(dev, bmode);
- if (!mode)
- return 0;
-
- drm_mode_probed_add(connector, mode);
- }
-
- drm_connector_list_update(connector);
- /* Move the prefered mode first, help apps pick the right mode. */
- drm_mode_sort(&connector->modes);
-
- return 1;
-}
-
/**
* vmw_kms_update_layout_ioctl - Handler for DRM_VMW_UPDATE_LAYOUT ioctl
* @dev: drm device for the ioctl
@@ -3023,3 +2841,91 @@ int vmw_du_helper_plane_update(struct vmw_du_update_plane *update)
vmw_validation_unref_lists(&val_ctx);
return ret;
}
+
+/**
+ * vmw_connector_mode_valid - implements drm_connector_helper_funcs.mode_valid callback
+ *
+ * @connector: the drm connector, part of a DU container
+ * @mode: drm mode to check
+ *
+ * Returns MODE_OK on success, or a drm_mode_status error code.
+ */
+enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector,
+ struct drm_display_mode *mode)
+{
+ struct drm_device *dev = connector->dev;
+ struct vmw_private *dev_priv = vmw_priv(dev);
+ u32 max_width = dev_priv->texture_max_width;
+ u32 max_height = dev_priv->texture_max_height;
+ u32 assumed_cpp = 4;
+
+ if (dev_priv->assume_16bpp)
+ assumed_cpp = 2;
+
+ if (dev_priv->active_display_unit == vmw_du_screen_target) {
+ max_width = min(dev_priv->stdu_max_width, max_width);
+ max_height = min(dev_priv->stdu_max_height, max_height);
+ }
+
+ if (max_width < mode->hdisplay)
+ return MODE_BAD_HVALUE;
+
+ if (max_height < mode->vdisplay)
+ return MODE_BAD_VVALUE;
+
+ if (!vmw_kms_validate_mode_vram(dev_priv,
+ mode->hdisplay * assumed_cpp,
+ mode->vdisplay))
+ return MODE_MEM;
+
+ return MODE_OK;
+}
+
+/**
+ * vmw_connector_get_modes - implements drm_connector_helper_funcs.get_modes callback
+ *
+ * @connector: the drm connector, part of a DU container
+ *
+ * Returns the number of added modes.
+ */
+int vmw_connector_get_modes(struct drm_connector *connector)
+{
+ struct vmw_display_unit *du = vmw_connector_to_du(connector);
+ struct drm_device *dev = connector->dev;
+ struct vmw_private *dev_priv = vmw_priv(dev);
+ struct drm_display_mode *mode = NULL;
+ struct drm_display_mode prefmode = { DRM_MODE("preferred",
+ DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED,
+ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+ DRM_MODE_FLAG_NHSYNC | DRM_MODE_FLAG_PVSYNC)
+ };
+ u32 max_width;
+ u32 max_height;
+ u32 num_modes;
+
+ /* Add preferred mode */
+ mode = drm_mode_duplicate(dev, &prefmode);
+ if (!mode)
+ return 0;
+
+ mode->hdisplay = du->pref_width;
+ mode->vdisplay = du->pref_height;
+ vmw_guess_mode_timing(mode);
+ drm_mode_set_name(mode);
+
+ drm_mode_probed_add(connector, mode);
+ drm_dbg_kms(dev, "preferred mode " DRM_MODE_FMT "\n", DRM_MODE_ARG(mode));
+
+ /* Probe connector for all modes not exceeding our geom limits */
+ max_width = dev_priv->texture_max_width;
+ max_height = dev_priv->texture_max_height;
+
+ if (dev_priv->active_display_unit == vmw_du_screen_target) {
+ max_width = min(dev_priv->stdu_max_width, max_width);
+ max_height = min(dev_priv->stdu_max_height, max_height);
+ }
+
+ num_modes = 1 + drm_add_modes_noedid(connector, max_width, max_height);
+
+ return num_modes;
+}
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 9fda4f4ec7a97..19a843da87b78 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -378,7 +378,6 @@ struct vmw_display_unit {
unsigned pref_width;
unsigned pref_height;
bool pref_active;
- struct drm_display_mode *pref_mode;
/*
* Gui positioning
@@ -428,8 +427,6 @@ void vmw_du_connector_save(struct drm_connector *connector);
void vmw_du_connector_restore(struct drm_connector *connector);
enum drm_connector_status
vmw_du_connector_detect(struct drm_connector *connector, bool force);
-int vmw_du_connector_fill_modes(struct drm_connector *connector,
- uint32_t max_width, uint32_t max_height);
int vmw_kms_helper_dirty(struct vmw_private *dev_priv,
struct vmw_framebuffer *framebuffer,
const struct drm_clip_rect *clips,
@@ -438,6 +435,9 @@ int vmw_kms_helper_dirty(struct vmw_private *dev_priv,
int num_clips,
int increment,
struct vmw_kms_dirty *dirty);
+enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector,
+ struct drm_display_mode *mode);
+int vmw_connector_get_modes(struct drm_connector *connector);
void vmw_kms_helper_validation_finish(struct vmw_private *dev_priv,
struct drm_file *file_priv,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
index a82fa97003705..c4db4aecca6c3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
@@ -304,7 +304,7 @@ static void vmw_ldu_connector_destroy(struct drm_connector *connector)
static const struct drm_connector_funcs vmw_legacy_connector_funcs = {
.dpms = vmw_du_connector_dpms,
.detect = vmw_du_connector_detect,
- .fill_modes = vmw_du_connector_fill_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes,
.destroy = vmw_ldu_connector_destroy,
.reset = vmw_du_connector_reset,
.atomic_duplicate_state = vmw_du_connector_duplicate_state,
@@ -313,6 +313,8 @@ static const struct drm_connector_funcs vmw_legacy_connector_funcs = {
static const struct
drm_connector_helper_funcs vmw_ldu_connector_helper_funcs = {
+ .get_modes = vmw_connector_get_modes,
+ .mode_valid = vmw_connector_mode_valid
};
static int vmw_kms_ldu_do_bo_dirty(struct vmw_private *dev_priv,
@@ -449,7 +451,6 @@ static int vmw_ldu_init(struct vmw_private *dev_priv, unsigned unit)
ldu->base.pref_active = (unit == 0);
ldu->base.pref_width = dev_priv->initial_width;
ldu->base.pref_height = dev_priv->initial_height;
- ldu->base.pref_mode = NULL;
/*
* Remove this after enabling atomic because property values can
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
index 556a403b7eb56..30c3ad27b6629 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
@@ -347,7 +347,7 @@ static void vmw_sou_connector_destroy(struct drm_connector *connector)
static const struct drm_connector_funcs vmw_sou_connector_funcs = {
.dpms = vmw_du_connector_dpms,
.detect = vmw_du_connector_detect,
- .fill_modes = vmw_du_connector_fill_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes,
.destroy = vmw_sou_connector_destroy,
.reset = vmw_du_connector_reset,
.atomic_duplicate_state = vmw_du_connector_duplicate_state,
@@ -357,6 +357,8 @@ static const struct drm_connector_funcs vmw_sou_connector_funcs = {
static const struct
drm_connector_helper_funcs vmw_sou_connector_helper_funcs = {
+ .get_modes = vmw_connector_get_modes,
+ .mode_valid = vmw_connector_mode_valid
};
@@ -826,7 +828,6 @@ static int vmw_sou_init(struct vmw_private *dev_priv, unsigned unit)
sou->base.pref_active = (unit == 0);
sou->base.pref_width = dev_priv->initial_width;
sou->base.pref_height = dev_priv->initial_height;
- sou->base.pref_mode = NULL;
/*
* Remove this after enabling atomic because property values can
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index ba0c0e12cfe9d..12d623ee59c25 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -830,7 +830,7 @@ static void vmw_stdu_connector_destroy(struct drm_connector *connector)
static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
.dpms = vmw_du_connector_dpms,
.detect = vmw_du_connector_detect,
- .fill_modes = vmw_du_connector_fill_modes,
+ .fill_modes = drm_helper_probe_single_connector_modes,
.destroy = vmw_stdu_connector_destroy,
.reset = vmw_du_connector_reset,
.atomic_duplicate_state = vmw_du_connector_duplicate_state,
@@ -840,6 +840,8 @@ static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
static const struct
drm_connector_helper_funcs vmw_stdu_connector_helper_funcs = {
+ .get_modes = vmw_connector_get_modes,
+ .mode_valid = vmw_connector_mode_valid
};
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 136/267] drm/vmwgfx: Filter modes which exceed graphics memory
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (134 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 135/267] drm/vmwgfx: Refactor drm connector probing for display modes Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 137/267] drm/vmwgfx: 3D disabled should not effect STDU memory limits Greg Kroah-Hartman
` (142 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Ian Forbes, Zack Rusin, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes <ian.forbes@broadcom.com>
[ Upstream commit 426826933109093503e7ef15d49348fc5ab505fe ]
SVGA requires individual surfaces to fit within graphics memory
(max_mob_pages) which means that modes with a final buffer size that would
exceed graphics memory must be pruned otherwise creation will fail.
Additionally llvmpipe requires its buffer height and width to be a multiple
of its tile size which is 64. As a result we have to anticipate that
llvmpipe will round up the mode size passed to it by the compositor when
it creates buffers and filter modes where this rounding exceeds graphics
memory.
This fixes an issue where VMs with low graphics memory (< 64MiB) configured
with high resolution mode boot to a black screen because surface creation
fails.
Fixes: d947d1b71deb ("drm/vmwgfx: Add and connect connector helper function")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-2-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 45 ++++++++++++++++++++++++++--
1 file changed, 43 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 12d623ee59c25..4ccab07faff08 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -41,7 +41,14 @@
#define vmw_connector_to_stdu(x) \
container_of(x, struct vmw_screen_target_display_unit, base.connector)
-
+/*
+ * Some renderers such as llvmpipe will align the width and height of their
+ * buffers to match their tile size. We need to keep this in mind when exposing
+ * modes to userspace so that this possible over-allocation will not exceed
+ * graphics memory. 64x64 pixels seems to be a reasonable upper bound for the
+ * tile size of current renderers.
+ */
+#define GPU_TILE_SIZE 64
enum stdu_content_type {
SAME_AS_DISPLAY = 0,
@@ -825,7 +832,41 @@ static void vmw_stdu_connector_destroy(struct drm_connector *connector)
vmw_stdu_destroy(vmw_connector_to_stdu(connector));
}
+static enum drm_mode_status
+vmw_stdu_connector_mode_valid(struct drm_connector *connector,
+ struct drm_display_mode *mode)
+{
+ enum drm_mode_status ret;
+ struct drm_device *dev = connector->dev;
+ struct vmw_private *dev_priv = vmw_priv(dev);
+ u64 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4;
+ /* Align width and height to account for GPU tile over-alignment */
+ u64 required_mem = ALIGN(mode->hdisplay, GPU_TILE_SIZE) *
+ ALIGN(mode->vdisplay, GPU_TILE_SIZE) *
+ assumed_cpp;
+ required_mem = ALIGN(required_mem, PAGE_SIZE);
+
+ ret = drm_mode_validate_size(mode, dev_priv->stdu_max_width,
+ dev_priv->stdu_max_height);
+ if (ret != MODE_OK)
+ return ret;
+ ret = drm_mode_validate_size(mode, dev_priv->texture_max_width,
+ dev_priv->texture_max_height);
+ if (ret != MODE_OK)
+ return ret;
+
+ if (required_mem > dev_priv->max_primary_mem)
+ return MODE_MEM;
+
+ if (required_mem > dev_priv->max_mob_pages * PAGE_SIZE)
+ return MODE_MEM;
+
+ if (required_mem > dev_priv->max_mob_size)
+ return MODE_MEM;
+
+ return MODE_OK;
+}
static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
.dpms = vmw_du_connector_dpms,
@@ -841,7 +882,7 @@ static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
static const struct
drm_connector_helper_funcs vmw_stdu_connector_helper_funcs = {
.get_modes = vmw_connector_get_modes,
- .mode_valid = vmw_connector_mode_valid
+ .mode_valid = vmw_stdu_connector_mode_valid
};
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 137/267] drm/vmwgfx: 3D disabled should not effect STDU memory limits
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (135 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 136/267] drm/vmwgfx: Filter modes which exceed graphics memory Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 138/267] drm/vmwgfx: Remove STDU logic from generic mode_valid function Greg Kroah-Hartman
` (141 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Ian Forbes, Zack Rusin, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes <ian.forbes@broadcom.com>
[ Upstream commit fb5e19d2dd03eb995ccd468d599b2337f7f66555 ]
This limit became a hard cap starting with the change referenced below.
Surface creation on the device will fail if the requested size is larger
than this limit so altering the value arbitrarily will expose modes that
are too large for the device's hard limits.
Fixes: 7ebb47c9f9ab ("drm/vmwgfx: Read new register for GB memory when available")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-3-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 58fb40c93100a..bea576434e475 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -956,13 +956,6 @@ static int vmw_driver_load(struct vmw_private *dev_priv, u32 pci_id)
vmw_read(dev_priv,
SVGA_REG_SUGGESTED_GBOBJECT_MEM_SIZE_KB);
- /*
- * Workaround for low memory 2D VMs to compensate for the
- * allocation taken by fbdev
- */
- if (!(dev_priv->capabilities & SVGA_CAP_3D))
- mem_size *= 3;
-
dev_priv->max_mob_pages = mem_size * 1024 / PAGE_SIZE;
dev_priv->max_primary_mem =
vmw_read(dev_priv, SVGA_REG_MAX_PRIMARY_MEM);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 138/267] drm/vmwgfx: Remove STDU logic from generic mode_valid function
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (136 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 137/267] drm/vmwgfx: 3D disabled should not effect STDU memory limits Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 139/267] drm/vmwgfx: Dont memcmp equivalent pointers Greg Kroah-Hartman
` (140 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Ian Forbes, Zack Rusin, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes <ian.forbes@broadcom.com>
[ Upstream commit dde1de06bd7248fd83c4ce5cf0dbe9e4e95bbb91 ]
STDU has its own mode_valid function now so this logic can be removed from
the generic version.
Fixes: 935f795045a6 ("drm/vmwgfx: Refactor drm connector probing for display modes")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521184720.767-4-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/vmwgfx/vmwgfx_drv.h | 3 ---
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 26 +++++++++-----------------
2 files changed, 9 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 6acc7ad0e9eb8..13423c7b0cbdb 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -1067,9 +1067,6 @@ void vmw_kms_cursor_snoop(struct vmw_surface *srf,
int vmw_kms_write_svga(struct vmw_private *vmw_priv,
unsigned width, unsigned height, unsigned pitch,
unsigned bpp, unsigned depth);
-bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv,
- uint32_t pitch,
- uint32_t height);
int vmw_kms_present(struct vmw_private *dev_priv,
struct drm_file *file_priv,
struct vmw_framebuffer *vfb,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 59de170a31853..93e2a27daed0c 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -2151,13 +2151,12 @@ int vmw_kms_write_svga(struct vmw_private *vmw_priv,
return 0;
}
+static
bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv,
- uint32_t pitch,
- uint32_t height)
+ u64 pitch,
+ u64 height)
{
- return ((u64) pitch * (u64) height) < (u64)
- ((dev_priv->active_display_unit == vmw_du_screen_target) ?
- dev_priv->max_primary_mem : dev_priv->vram_size);
+ return (pitch * height) < (u64)dev_priv->vram_size;
}
/**
@@ -2853,25 +2852,18 @@ int vmw_du_helper_plane_update(struct vmw_du_update_plane *update)
enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector,
struct drm_display_mode *mode)
{
+ enum drm_mode_status ret;
struct drm_device *dev = connector->dev;
struct vmw_private *dev_priv = vmw_priv(dev);
- u32 max_width = dev_priv->texture_max_width;
- u32 max_height = dev_priv->texture_max_height;
u32 assumed_cpp = 4;
if (dev_priv->assume_16bpp)
assumed_cpp = 2;
- if (dev_priv->active_display_unit == vmw_du_screen_target) {
- max_width = min(dev_priv->stdu_max_width, max_width);
- max_height = min(dev_priv->stdu_max_height, max_height);
- }
-
- if (max_width < mode->hdisplay)
- return MODE_BAD_HVALUE;
-
- if (max_height < mode->vdisplay)
- return MODE_BAD_VVALUE;
+ ret = drm_mode_validate_size(mode, dev_priv->texture_max_width,
+ dev_priv->texture_max_height);
+ if (ret != MODE_OK)
+ return ret;
if (!vmw_kms_validate_mode_vram(dev_priv,
mode->hdisplay * assumed_cpp,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 139/267] drm/vmwgfx: Dont memcmp equivalent pointers
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (137 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 138/267] drm/vmwgfx: Remove STDU logic from generic mode_valid function Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 140/267] af_unix: Annotate data-race of sk->sk_state in unix_accept() Greg Kroah-Hartman
` (139 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Ian Forbes, Zack Rusin, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes <ian.forbes@broadcom.com>
[ Upstream commit 5703fc058efdafcdd6b70776ee562478f0753acb ]
These pointers are frequently the same and memcmp does not compare the
pointers before comparing their contents so this was wasting cycles
comparing 16 KiB of memory which will always be equal.
Fixes: bb6780aa5a1d ("drm/vmwgfx: Diff cursors when using cmds")
Signed-off-by: Ian Forbes <ian.forbes@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328190716.27367-1-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 93e2a27daed0c..08f2470edab27 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -216,7 +216,7 @@ static bool vmw_du_cursor_plane_has_changed(struct vmw_plane_state *old_vps,
new_image = vmw_du_cursor_plane_acquire_image(new_vps);
changed = false;
- if (old_image && new_image)
+ if (old_image && new_image && old_image != new_image)
changed = memcmp(old_image, new_image, size) != 0;
return changed;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 140/267] af_unix: Annotate data-race of sk->sk_state in unix_accept().
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (138 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 139/267] drm/vmwgfx: Dont memcmp equivalent pointers Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 141/267] modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o Greg Kroah-Hartman
` (138 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Kuniyuki Iwashima, Paolo Abeni,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima <kuniyu@amazon.com>
[ Upstream commit 1b536948e805aab61a48c5aa5db10c9afee880bd ]
Once sk->sk_state is changed to TCP_LISTEN, it never changes.
unix_accept() takes the advantage and reads sk->sk_state without
holding unix_state_lock().
Let's use READ_ONCE() there.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1710,7 +1710,7 @@ static int unix_accept(struct socket *so
goto out;
err = -EINVAL;
- if (sk->sk_state != TCP_LISTEN)
+ if (READ_ONCE(sk->sk_state) != TCP_LISTEN)
goto out;
/* If socket state is TCP_LISTEN it cannot change (for now...),
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 141/267] modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (139 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 140/267] af_unix: Annotate data-race of sk->sk_state in unix_accept() Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 142/267] net: sfp: Always call `sfp_sm_mod_remove()` on remove Greg Kroah-Hartman
` (137 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Masahiro Yamada, Vincenzo Palazzo,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masahiro Yamada <masahiroy@kernel.org>
[ Upstream commit 9185afeac2a3dcce8300a5684291a43c2838cfd6 ]
Building with W=1 incorrectly emits the following warning:
WARNING: modpost: missing MODULE_DESCRIPTION() in vmlinux.o
This check should apply only to modules.
Fixes: 1fffe7a34c89 ("script: modpost: emit a warning when the description is missing")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
scripts/mod/modpost.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 269bd79bcd9ad..828d5cc367169 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1684,10 +1684,11 @@ static void read_symbols(const char *modname)
namespace = get_next_modinfo(&info, "import_ns",
namespace);
}
+
+ if (extra_warn && !get_modinfo(&info, "description"))
+ warn("missing MODULE_DESCRIPTION() in %s\n", modname);
}
- if (extra_warn && !get_modinfo(&info, "description"))
- warn("missing MODULE_DESCRIPTION() in %s\n", modname);
for (sym = info.symtab_start; sym < info.symtab_stop; sym++) {
symname = remove_dot(info.strtab + sym->st_name);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 142/267] net: sfp: Always call `sfp_sm_mod_remove()` on remove
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (140 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 141/267] modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 143/267] net: hns3: fix kernel crash problem in concurrent scenario Greg Kroah-Hartman
` (136 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, =20Bence?=, Andrew Lunn,
Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Csókás, Bence <csokas.bence@prolan.hu>
[ Upstream commit e96b2933152fd87b6a41765b2f58b158fde855b6 ]
If the module is in SFP_MOD_ERROR, `sfp_sm_mod_remove()` will
not be run. As a consequence, `sfp_hwmon_remove()` is not getting
run either, leaving a stale `hwmon` device behind. `sfp_sm_mod_remove()`
itself checks `sfp->sm_mod_state` anyways, so this check was not
really needed in the first place.
Fixes: d2e816c0293f ("net: sfp: handle module remove outside state machine")
Signed-off-by: "Csókás, Bence" <csokas.bence@prolan.hu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240605084251.63502-1-csokas.bence@prolan.hu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/phy/sfp.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 3679a43f4eb02..8152e14250f2d 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -2394,8 +2394,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
/* Handle remove event globally, it resets this state machine */
if (event == SFP_E_REMOVE) {
- if (sfp->sm_mod_state > SFP_MOD_PROBE)
- sfp_sm_mod_remove(sfp);
+ sfp_sm_mod_remove(sfp);
sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
return;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 143/267] net: hns3: fix kernel crash problem in concurrent scenario
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (141 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 142/267] net: sfp: Always call `sfp_sm_mod_remove()` on remove Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 144/267] net: hns3: add cond_resched() to hns3 ring buffer init process Greg Kroah-Hartman
` (135 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Yonglong Liu, Jijie Shao,
Simon Horman, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu <liuyonglong@huawei.com>
[ Upstream commit 12cda920212a49fa22d9e8b9492ac4ea013310a4 ]
When link status change, the nic driver need to notify the roce
driver to handle this event, but at this time, the roce driver
may uninit, then cause kernel crash.
To fix the problem, when link status change, need to check
whether the roce registered, and when uninit, need to wait link
update finish.
Fixes: 45e92b7e4e27 ("net: hns3: add calling roce callback function when link status change")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../hisilicon/hns3/hns3pf/hclge_main.c | 21 ++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 14713454e0d82..c8059d96f64be 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -3031,9 +3031,7 @@ static void hclge_push_link_status(struct hclge_dev *hdev)
static void hclge_update_link_status(struct hclge_dev *hdev)
{
- struct hnae3_handle *rhandle = &hdev->vport[0].roce;
struct hnae3_handle *handle = &hdev->vport[0].nic;
- struct hnae3_client *rclient = hdev->roce_client;
struct hnae3_client *client = hdev->nic_client;
int state;
int ret;
@@ -3057,8 +3055,15 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
client->ops->link_status_change(handle, state);
hclge_config_mac_tnl_int(hdev, state);
- if (rclient && rclient->ops->link_status_change)
- rclient->ops->link_status_change(rhandle, state);
+
+ if (test_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state)) {
+ struct hnae3_handle *rhandle = &hdev->vport[0].roce;
+ struct hnae3_client *rclient = hdev->roce_client;
+
+ if (rclient && rclient->ops->link_status_change)
+ rclient->ops->link_status_change(rhandle,
+ state);
+ }
hclge_push_link_status(hdev);
}
@@ -11233,6 +11238,12 @@ static int hclge_init_client_instance(struct hnae3_client *client,
return ret;
}
+static bool hclge_uninit_need_wait(struct hclge_dev *hdev)
+{
+ return test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) ||
+ test_bit(HCLGE_STATE_LINK_UPDATING, &hdev->state);
+}
+
static void hclge_uninit_client_instance(struct hnae3_client *client,
struct hnae3_ae_dev *ae_dev)
{
@@ -11241,7 +11252,7 @@ static void hclge_uninit_client_instance(struct hnae3_client *client,
if (hdev->roce_client) {
clear_bit(HCLGE_STATE_ROCE_REGISTERED, &hdev->state);
- while (test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state))
+ while (hclge_uninit_need_wait(hdev))
msleep(HCLGE_WAIT_RESET_DONE);
hdev->roce_client->ops->uninit_instance(&vport->roce, 0);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 144/267] net: hns3: add cond_resched() to hns3 ring buffer init process
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (142 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 143/267] net: hns3: fix kernel crash problem in concurrent scenario Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 145/267] liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet Greg Kroah-Hartman
` (134 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jie Wang, Jijie Shao, Simon Horman,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jie Wang <wangjie125@huawei.com>
[ Upstream commit 968fde83841a8c23558dfbd0a0c69d636db52b55 ]
Currently hns3 ring buffer init process would hold cpu too long with big
Tx/Rx ring depth. This could cause soft lockup.
So this patch adds cond_resched() to the process. Then cpu can break to
run other tasks instead of busy looping.
Fixes: a723fb8efe29 ("net: hns3: refine for set ring parameters")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 4 ++++
drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 2 ++
2 files changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 677cfaa5fe08c..db9574e9fb7bc 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -3539,6 +3539,9 @@ static int hns3_alloc_ring_buffers(struct hns3_enet_ring *ring)
ret = hns3_alloc_and_attach_buffer(ring, i);
if (ret)
goto out_buffer_fail;
+
+ if (!(i % HNS3_RESCHED_BD_NUM))
+ cond_resched();
}
return 0;
@@ -5112,6 +5115,7 @@ int hns3_init_all_ring(struct hns3_nic_priv *priv)
}
u64_stats_init(&priv->ring[i].syncp);
+ cond_resched();
}
return 0;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index acd756b0c7c9a..d36c4ed16d8dd 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -214,6 +214,8 @@ enum hns3_nic_state {
#define HNS3_CQ_MODE_EQE 1U
#define HNS3_CQ_MODE_CQE 0U
+#define HNS3_RESCHED_BD_NUM 1024
+
enum hns3_pkt_l2t_type {
HNS3_L2_TYPE_UNICAST,
HNS3_L2_TYPE_MULTICAST,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 145/267] liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (143 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 144/267] net: hns3: add cond_resched() to hns3 ring buffer init process Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 146/267] net: stmmac: dwmac-qcom-ethqos: Configure host DMA width Greg Kroah-Hartman
` (133 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Aleksandr Mishin, Simon Horman,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin <amishin@t-argos.ru>
[ Upstream commit c44711b78608c98a3e6b49ce91678cd0917d5349 ]
In lio_vf_rep_copy_packet() pg_info->page is compared to a NULL value,
but then it is unconditionally passed to skb_add_rx_frag() which looks
strange and could lead to null pointer dereference.
lio_vf_rep_copy_packet() call trace looks like:
octeon_droq_process_packets
octeon_droq_fast_process_packets
octeon_droq_dispatch_pkt
octeon_create_recv_info
...search in the dispatch_list...
->disp_fn(rdisp->rinfo, ...)
lio_vf_rep_pkt_recv(struct octeon_recv_info *recv_info, ...)
In this path there is no code which sets pg_info->page to NULL.
So this check looks unneeded and doesn't solve potential problem.
But I guess the author had reason to add a check and I have no such card
and can't do real test.
In addition, the code in the function liquidio_push_packet() in
liquidio/lio_core.c does exactly the same.
Based on this, I consider the most acceptable compromise solution to
adjust this issue by moving skb_add_rx_frag() into conditional scope.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 1f233f327913 ("liquidio: switchdev support for LiquidIO NIC")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c
index 600de587d7a98..e70b9ccca380e 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_rep.c
@@ -272,13 +272,12 @@ lio_vf_rep_copy_packet(struct octeon_device *oct,
pg_info->page_offset;
memcpy(skb->data, va, MIN_SKB_SIZE);
skb_put(skb, MIN_SKB_SIZE);
+ skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
+ pg_info->page,
+ pg_info->page_offset + MIN_SKB_SIZE,
+ len - MIN_SKB_SIZE,
+ LIO_RXBUFFER_SZ);
}
-
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags,
- pg_info->page,
- pg_info->page_offset + MIN_SKB_SIZE,
- len - MIN_SKB_SIZE,
- LIO_RXBUFFER_SZ);
} else {
struct octeon_skb_page_info *pg_info =
((struct octeon_skb_page_info *)(skb->cb));
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 146/267] net: stmmac: dwmac-qcom-ethqos: Configure host DMA width
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (144 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 145/267] liquidio: Adjust a NULL pointer handling path in lio_vf_rep_copy_packet Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 147/267] drm/komeda: check for error-valued pointer Greg Kroah-Hartman
` (132 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Sagar Cheluvegowda, Simon Horman,
Andrew Halaney, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sagar Cheluvegowda <quic_scheluve@quicinc.com>
[ Upstream commit 0579f27249047006a818e463ee66a6c314d04cea ]
Commit 070246e4674b ("net: stmmac: Fix for mismatched host/device DMA
address width") added support in the stmmac driver for platform drivers
to indicate the host DMA width, but left it up to authors of the
specific platforms to indicate if their width differed from the addr64
register read from the MAC itself.
Qualcomm's EMAC4 integration supports only up to 36 bit width (as
opposed to the addr64 register indicating 40 bit width). Let's indicate
that in the platform driver to avoid a scenario where the driver will
allocate descriptors of size that is supported by the CPU which in our
case is 36 bit, but as the addr64 register is still capable of 40 bits
the device will use two descriptors as one address.
Fixes: 8c4d92e82d50 ("net: stmmac: dwmac-qcom-ethqos: add support for emac4 on sa8775p platforms")
Signed-off-by: Sagar Cheluvegowda <quic_scheluve@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
index 31631e3f89d0a..51ff53120307a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c
@@ -93,6 +93,7 @@ struct ethqos_emac_driver_data {
bool has_emac_ge_3;
const char *link_clk_name;
bool has_integrated_pcs;
+ u32 dma_addr_width;
struct dwmac4_addrs dwmac4_addrs;
};
@@ -272,6 +273,7 @@ static const struct ethqos_emac_driver_data emac_v4_0_0_data = {
.has_emac_ge_3 = true,
.link_clk_name = "phyaux",
.has_integrated_pcs = true,
+ .dma_addr_width = 36,
.dwmac4_addrs = {
.dma_chan = 0x00008100,
.dma_chan_offset = 0x1000,
@@ -816,6 +818,8 @@ static int qcom_ethqos_probe(struct platform_device *pdev)
plat_dat->flags |= STMMAC_FLAG_RX_CLK_RUNS_IN_LPI;
if (data->has_integrated_pcs)
plat_dat->flags |= STMMAC_FLAG_HAS_INTEGRATED_PCS;
+ if (data->dma_addr_width)
+ plat_dat->host_dma_width = data->dma_addr_width;
if (ethqos->serdes_phy) {
plat_dat->serdes_powerup = qcom_ethqos_serdes_powerup;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 147/267] drm/komeda: check for error-valued pointer
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (145 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 146/267] net: stmmac: dwmac-qcom-ethqos: Configure host DMA width Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:54 ` [PATCH 6.6 148/267] drm/bridge/panel: Fix runtime warning on panel bridge release Greg Kroah-Hartman
` (131 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Amjad Ouled-Ameur, Maxime Ripard,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amjad Ouled-Ameur <amjad.ouled-ameur@arm.com>
[ Upstream commit b880018edd3a577e50366338194dee9b899947e0 ]
komeda_pipeline_get_state() may return an error-valued pointer, thus
check the pointer for negative or null value before dereferencing.
Fixes: 502932a03fce ("drm/komeda: Add the initial scaler support for CORE")
Signed-off-by: Amjad Ouled-Ameur <amjad.ouled-ameur@arm.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240610102056.40406-1-amjad.ouled-ameur@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
index f3e744172673c..f4e76b46ca327 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
@@ -259,7 +259,7 @@ komeda_component_get_avail_scaler(struct komeda_component *c,
u32 avail_scalers;
pipe_st = komeda_pipeline_get_state(c->pipeline, state);
- if (!pipe_st)
+ if (IS_ERR_OR_NULL(pipe_st))
return NULL;
avail_scalers = (pipe_st->active_comps & KOMEDA_PIPELINE_SCALERS) ^
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 148/267] drm/bridge/panel: Fix runtime warning on panel bridge release
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (146 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 147/267] drm/komeda: check for error-valued pointer Greg Kroah-Hartman
@ 2024-06-19 12:54 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 149/267] tcp: fix race in tcp_v6_syn_recv_sock() Greg Kroah-Hartman
` (130 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:54 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Adam Miotk, Maxime Ripard,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adam Miotk <adam.miotk@arm.com>
[ Upstream commit ce62600c4dbee8d43b02277669dd91785a9b81d9 ]
Device managed panel bridge wrappers are created by calling to
drm_panel_bridge_add_typed() and registering a release handler for
clean-up when the device gets unbound.
Since the memory for this bridge is also managed and linked to the panel
device, the release function should not try to free that memory.
Moreover, the call to devm_kfree() inside drm_panel_bridge_remove() will
fail in this case and emit a warning because the panel bridge resource
is no longer on the device resources list (it has been removed from
there before the call to release handlers).
Fixes: 67022227ffb1 ("drm/bridge: Add a devm_ allocator for panel bridge.")
Signed-off-by: Adam Miotk <adam.miotk@arm.com>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240610102739.139852-1-adam.miotk@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/gpu/drm/bridge/panel.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 9316384b44745..a1dd2ead8dcc4 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -360,9 +360,12 @@ EXPORT_SYMBOL(drm_panel_bridge_set_orientation);
static void devm_drm_panel_bridge_release(struct device *dev, void *res)
{
- struct drm_bridge **bridge = res;
+ struct drm_bridge *bridge = *(struct drm_bridge **)res;
- drm_panel_bridge_remove(*bridge);
+ if (!bridge)
+ return;
+
+ drm_bridge_remove(bridge);
}
/**
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 149/267] tcp: fix race in tcp_v6_syn_recv_sock()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (147 preceding siblings ...)
2024-06-19 12:54 ` [PATCH 6.6 148/267] drm/bridge/panel: Fix runtime warning on panel bridge release Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 150/267] net dsa: qca8k: fix usages of device_get_named_child_node() Greg Kroah-Hartman
` (129 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Eric Dumazet, Simon Horman,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet <edumazet@google.com>
[ Upstream commit d37fe4255abe8e7b419b90c5847e8ec2b8debb08 ]
tcp_v6_syn_recv_sock() calls ip6_dst_store() before
inet_sk(newsk)->pinet6 has been set up.
This means ip6_dst_store() writes over the parent (listener)
np->dst_cookie.
This is racy because multiple threads could share the same
parent and their final np->dst_cookie could be wrong.
Move ip6_dst_store() call after inet_sk(newsk)->pinet6
has been changed and after the copy of parent ipv6_pinfo.
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/tcp_ipv6.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3783334ef2332..07bcb690932e1 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1287,7 +1287,6 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
*/
newsk->sk_gso_type = SKB_GSO_TCPV6;
- ip6_dst_store(newsk, dst, NULL, NULL);
inet6_sk_rx_dst_set(newsk, skb);
inet_sk(newsk)->pinet6 = tcp_inet6_sk(newsk);
@@ -1298,6 +1297,8 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
memcpy(newnp, np, sizeof(struct ipv6_pinfo));
+ ip6_dst_store(newsk, dst, NULL, NULL);
+
newsk->sk_v6_daddr = ireq->ir_v6_rmt_addr;
newnp->saddr = ireq->ir_v6_loc_addr;
newsk->sk_v6_rcv_saddr = ireq->ir_v6_loc_addr;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 150/267] net dsa: qca8k: fix usages of device_get_named_child_node()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (148 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 149/267] tcp: fix race in tcp_v6_syn_recv_sock() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 151/267] geneve: Fix incorrect inner network header offset when innerprotoinherit is set Greg Kroah-Hartman
` (128 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Simon Horman, Andy Shevchenko,
David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit d029edefed39647c797c2710aedd9d31f84c069e ]
The documentation for device_get_named_child_node() mentions this
important point:
"
The caller is responsible for calling fwnode_handle_put() on the
returned fwnode pointer.
"
Add fwnode_handle_put() to avoid leaked references.
Fixes: 1e264f9d2918 ("net: dsa: qca8k: add LEDs basic support")
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/dsa/qca/qca8k-leds.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/qca/qca8k-leds.c b/drivers/net/dsa/qca/qca8k-leds.c
index e8c16e76e34bb..77a79c2494022 100644
--- a/drivers/net/dsa/qca/qca8k-leds.c
+++ b/drivers/net/dsa/qca/qca8k-leds.c
@@ -431,8 +431,11 @@ qca8k_parse_port_leds(struct qca8k_priv *priv, struct fwnode_handle *port, int p
init_data.devname_mandatory = true;
init_data.devicename = kasprintf(GFP_KERNEL, "%s:0%d", ds->slave_mii_bus->id,
port_num);
- if (!init_data.devicename)
+ if (!init_data.devicename) {
+ fwnode_handle_put(led);
+ fwnode_handle_put(leds);
return -ENOMEM;
+ }
ret = devm_led_classdev_register_ext(priv->dev, &port_led->cdev, &init_data);
if (ret)
@@ -441,6 +444,7 @@ qca8k_parse_port_leds(struct qca8k_priv *priv, struct fwnode_handle *port, int p
kfree(init_data.devicename);
}
+ fwnode_handle_put(leds);
return 0;
}
@@ -471,9 +475,13 @@ qca8k_setup_led_ctrl(struct qca8k_priv *priv)
* the correct port for LED setup.
*/
ret = qca8k_parse_port_leds(priv, port, qca8k_port_to_phy(port_num));
- if (ret)
+ if (ret) {
+ fwnode_handle_put(port);
+ fwnode_handle_put(ports);
return ret;
+ }
}
+ fwnode_handle_put(ports);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 151/267] geneve: Fix incorrect inner network header offset when innerprotoinherit is set
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (149 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 150/267] net dsa: qca8k: fix usages of device_get_named_child_node() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 152/267] net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets Greg Kroah-Hartman
` (127 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gal Pressman, Tariq Toukan,
Wojciech Drewek, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gal Pressman <gal@nvidia.com>
[ Upstream commit c6ae073f5903f6c6439d0ac855836a4da5c0a701 ]
When innerprotoinherit is set, the tunneled packets do not have an inner
Ethernet header.
Change 'maclen' to not always assume the header length is ETH_HLEN, as
there might not be a MAC header.
This resolves issues with drivers (e.g. mlx5, in
mlx5e_tx_tunnel_accel()) who rely on the skb inner network header offset
to be correct, and use it for TX offloads.
Fixes: d8a6213d70ac ("geneve: fix header validation in geneve[6]_xmit_skb")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/geneve.c | 10 ++++++----
include/net/ip_tunnels.h | 5 +++--
2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 0a18b67d0d669..8333a5620deff 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -915,6 +915,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct geneve_dev *geneve,
const struct ip_tunnel_info *info)
{
+ bool inner_proto_inherit = geneve->cfg.inner_proto_inherit;
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
struct geneve_sock *gs4 = rcu_dereference(geneve->sock4);
const struct ip_tunnel_key *key = &info->key;
@@ -926,7 +927,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
__be16 sport;
int err;
- if (!skb_vlan_inet_prepare(skb))
+ if (!skb_vlan_inet_prepare(skb, inner_proto_inherit))
return -EINVAL;
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
@@ -999,7 +1000,7 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
}
err = geneve_build_skb(&rt->dst, skb, info, xnet, sizeof(struct iphdr),
- geneve->cfg.inner_proto_inherit);
+ inner_proto_inherit);
if (unlikely(err))
return err;
@@ -1015,6 +1016,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct geneve_dev *geneve,
const struct ip_tunnel_info *info)
{
+ bool inner_proto_inherit = geneve->cfg.inner_proto_inherit;
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
struct geneve_sock *gs6 = rcu_dereference(geneve->sock6);
const struct ip_tunnel_key *key = &info->key;
@@ -1024,7 +1026,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
__be16 sport;
int err;
- if (!skb_vlan_inet_prepare(skb))
+ if (!skb_vlan_inet_prepare(skb, inner_proto_inherit))
return -EINVAL;
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
@@ -1079,7 +1081,7 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
ttl = ttl ? : ip6_dst_hoplimit(dst);
}
err = geneve_build_skb(dst, skb, info, xnet, sizeof(struct ipv6hdr),
- geneve->cfg.inner_proto_inherit);
+ inner_proto_inherit);
if (unlikely(err))
return err;
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 822f0fad39623..4e69f52a51177 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -362,9 +362,10 @@ static inline bool pskb_inet_may_pull(struct sk_buff *skb)
/* Variant of pskb_inet_may_pull().
*/
-static inline bool skb_vlan_inet_prepare(struct sk_buff *skb)
+static inline bool skb_vlan_inet_prepare(struct sk_buff *skb,
+ bool inner_proto_inherit)
{
- int nhlen = 0, maclen = ETH_HLEN;
+ int nhlen = 0, maclen = inner_proto_inherit ? 0 : ETH_HLEN;
__be16 type = skb->protocol;
/* Essentially this is skb_protocol(skb, true)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 152/267] net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (150 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 151/267] geneve: Fix incorrect inner network header offset when innerprotoinherit is set Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 153/267] Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ Greg Kroah-Hartman
` (126 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Gal Pressman, Dragos Tatulea,
Tariq Toukan, Wojciech Drewek, David S. Miller, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gal Pressman <gal@nvidia.com>
[ Upstream commit 791b4089e326271424b78f2fae778b20e53d071b ]
Move the vxlan_features_check() call to after we verified the packet is
a tunneled VXLAN packet.
Without this, tunneled UDP non-VXLAN packets (for ex. GENENVE) might
wrongly not get offloaded.
In some cases, it worked by chance as GENEVE header is the same size as
VXLAN, but it is obviously incorrect.
Fixes: e3cfc7e6b7bd ("net/mlx5e: TX, Add geneve tunnel stateless offload support")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 455907b1167a0..e87a776ea2bfd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4704,7 +4704,7 @@ static netdev_features_t mlx5e_tunnel_features_check(struct mlx5e_priv *priv,
/* Verify if UDP port is being offloaded by HW */
if (mlx5_vxlan_lookup_port(priv->mdev->vxlan, port))
- return features;
+ return vxlan_features_check(skb, features);
#if IS_ENABLED(CONFIG_GENEVE)
/* Support Geneve offload for default UDP port */
@@ -4730,7 +4730,6 @@ netdev_features_t mlx5e_features_check(struct sk_buff *skb,
struct mlx5e_priv *priv = netdev_priv(netdev);
features = vlan_features_check(skb, features);
- features = vxlan_features_check(skb, features);
/* Validate if the tunneled packet is being offloaded by HW */
if (skb->encapsulation &&
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 153/267] Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (151 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 152/267] net/mlx5e: Fix features validation check for tunneled UDP (non-VXLAN) packets Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 154/267] Bluetooth: fix connection setup in l2cap_connect Greg Kroah-Hartman
` (125 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Luiz Augusto von Dentz, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
[ Upstream commit 806a5198c05987b748b50f3d0c0cfb3d417381a4 ]
This removes the bogus check for max > hcon->le_conn_max_interval since
the later is just the initial maximum conn interval not the maximum the
stack could support which is really 3200=4000ms.
In order to pass GAP/CONN/CPUP/BV-05-C one shall probably enter values
of the following fields in IXIT that would cause hci_check_conn_params
to fail:
TSPX_conn_update_int_min
TSPX_conn_update_int_max
TSPX_conn_update_peripheral_latency
TSPX_conn_update_supervision_timeout
Link: https://github.com/bluez/bluez/issues/847
Fixes: e4b019515f95 ("Bluetooth: Enforce validation on max value of connection interval")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/net/bluetooth/hci_core.h | 36 ++++++++++++++++++++++++++++----
net/bluetooth/l2cap_core.c | 8 +------
2 files changed, 33 insertions(+), 11 deletions(-)
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index f786d2d62fa5e..f89d6d43ba8f1 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -2071,18 +2071,46 @@ static inline int hci_check_conn_params(u16 min, u16 max, u16 latency,
{
u16 max_latency;
- if (min > max || min < 6 || max > 3200)
+ if (min > max) {
+ BT_WARN("min %d > max %d", min, max);
return -EINVAL;
+ }
+
+ if (min < 6) {
+ BT_WARN("min %d < 6", min);
+ return -EINVAL;
+ }
+
+ if (max > 3200) {
+ BT_WARN("max %d > 3200", max);
+ return -EINVAL;
+ }
+
+ if (to_multiplier < 10) {
+ BT_WARN("to_multiplier %d < 10", to_multiplier);
+ return -EINVAL;
+ }
- if (to_multiplier < 10 || to_multiplier > 3200)
+ if (to_multiplier > 3200) {
+ BT_WARN("to_multiplier %d > 3200", to_multiplier);
return -EINVAL;
+ }
- if (max >= to_multiplier * 8)
+ if (max >= to_multiplier * 8) {
+ BT_WARN("max %d >= to_multiplier %d * 8", max, to_multiplier);
return -EINVAL;
+ }
max_latency = (to_multiplier * 4 / max) - 1;
- if (latency > 499 || latency > max_latency)
+ if (latency > 499) {
+ BT_WARN("latency %d > 499", latency);
return -EINVAL;
+ }
+
+ if (latency > max_latency) {
+ BT_WARN("latency %d > max_latency %d", latency, max_latency);
+ return -EINVAL;
+ }
return 0;
}
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 37210567fbfbe..d5fb78c604cf3 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -4645,13 +4645,7 @@ static inline int l2cap_conn_param_update_req(struct l2cap_conn *conn,
memset(&rsp, 0, sizeof(rsp));
- if (max > hcon->le_conn_max_interval) {
- BT_DBG("requested connection interval exceeds current bounds.");
- err = -EINVAL;
- } else {
- err = hci_check_conn_params(min, max, latency, to_multiplier);
- }
-
+ err = hci_check_conn_params(min, max, latency, to_multiplier);
if (err)
rsp.result = cpu_to_le16(L2CAP_CONN_PARAM_REJECTED);
else
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 154/267] Bluetooth: fix connection setup in l2cap_connect
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (152 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 153/267] Bluetooth: L2CAP: Fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 155/267] netfilter: nft_inner: validate mandatory meta and payload Greg Kroah-Hartman
` (124 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Pauli Virtanen,
Luiz Augusto von Dentz, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pauli Virtanen <pav@iki.fi>
[ Upstream commit c695439d198d30e10553a3b98360c5efe77b6903 ]
The amp_id argument of l2cap_connect() was removed in
commit 84a4bb6548a2 ("Bluetooth: HCI: Remove HCI_AMP support")
It was always called with amp_id == 0, i.e. AMP_ID_BREDR == 0x00 (ie.
non-AMP controller). In the above commit, the code path for amp_id != 0
was preserved, although it should have used the amp_id == 0 one.
Restore the previous behavior of the non-AMP code path, to fix problems
with L2CAP connections.
Fixes: 84a4bb6548a2 ("Bluetooth: HCI: Remove HCI_AMP support")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/bluetooth/l2cap_core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index d5fb78c604cf3..bf31c5bae218f 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -4009,8 +4009,8 @@ static void l2cap_connect(struct l2cap_conn *conn, struct l2cap_cmd_hdr *cmd,
status = L2CAP_CS_AUTHOR_PEND;
chan->ops->defer(chan);
} else {
- l2cap_state_change(chan, BT_CONNECT2);
- result = L2CAP_CR_PEND;
+ l2cap_state_change(chan, BT_CONFIG);
+ result = L2CAP_CR_SUCCESS;
status = L2CAP_CS_NO_INFO;
}
} else {
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 155/267] netfilter: nft_inner: validate mandatory meta and payload
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (153 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 154/267] Bluetooth: fix connection setup in l2cap_connect Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 156/267] netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type Greg Kroah-Hartman
` (123 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Davide Ornaghi, Pablo Neira Ayuso,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Davide Ornaghi <d.ornaghi97@gmail.com>
[ Upstream commit c4ab9da85b9df3692f861512fe6c9812f38b7471 ]
Check for mandatory netlink attributes in payload and meta expression
when used embedded from the inner expression, otherwise NULL pointer
dereference is possible from userspace.
Fixes: a150d122b6bd ("netfilter: nft_meta: add inner match support")
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nft_meta.c | 3 +++
net/netfilter/nft_payload.c | 4 ++++
2 files changed, 7 insertions(+)
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index ba0d3683a45d3..9139ce38ea7b9 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -839,6 +839,9 @@ static int nft_meta_inner_init(const struct nft_ctx *ctx,
struct nft_meta *priv = nft_expr_priv(expr);
unsigned int len;
+ if (!tb[NFTA_META_KEY] || !tb[NFTA_META_DREG])
+ return -EINVAL;
+
priv->key = ntohl(nla_get_be32(tb[NFTA_META_KEY]));
switch (priv->key) {
case NFT_META_PROTOCOL:
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 0c43d748e23ae..50429cbd42da4 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -650,6 +650,10 @@ static int nft_payload_inner_init(const struct nft_ctx *ctx,
struct nft_payload *priv = nft_expr_priv(expr);
u32 base;
+ if (!tb[NFTA_PAYLOAD_BASE] || !tb[NFTA_PAYLOAD_OFFSET] ||
+ !tb[NFTA_PAYLOAD_LEN] || !tb[NFTA_PAYLOAD_DREG])
+ return -EINVAL;
+
base = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
switch (base) {
case NFT_PAYLOAD_TUN_HEADER:
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 156/267] netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (154 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 155/267] netfilter: nft_inner: validate mandatory meta and payload Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 13:09 ` Pablo Neira Ayuso
2024-06-19 12:55 ` [PATCH 6.6 157/267] x86/asm: Use %c/%n instead of %P operand modifier in asm templates Greg Kroah-Hartman
` (122 subsequent siblings)
278 siblings, 1 reply; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Lion Ackermann, Jozsef Kadlecsik,
Pablo Neira Ayuso, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jozsef Kadlecsik <kadlec@netfilter.org>
[ Upstream commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10 ]
Lion Ackermann reported that there is a race condition between namespace cleanup
in ipset and the garbage collection of the list:set type. The namespace
cleanup can destroy the list:set type of sets while the gc of the set type is
waiting to run in rcu cleanup. The latter uses data from the destroyed set which
thus leads use after free. The patch contains the following parts:
- When destroying all sets, first remove the garbage collectors, then wait
if needed and then destroy the sets.
- Fix the badly ordered "wait then remove gc" for the destroy a single set
case.
- Fix the missing rcu locking in the list:set type in the userspace test
case.
- Use proper RCU list handlings in the list:set type.
The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).
Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation)
Reported-by: Lion Ackermann <nnamrec@gmail.com>
Tested-by: Lion Ackermann <nnamrec@gmail.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/ipset/ip_set_core.c | 81 +++++++++++++++------------
net/netfilter/ipset/ip_set_list_set.c | 30 +++++-----
2 files changed, 60 insertions(+), 51 deletions(-)
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 3184cc6be4c9d..c7ae4d9bf3d24 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1172,23 +1172,50 @@ ip_set_setname_policy[IPSET_ATTR_CMD_MAX + 1] = {
.len = IPSET_MAXNAMELEN - 1 },
};
+/* In order to return quickly when destroying a single set, it is split
+ * into two stages:
+ * - Cancel garbage collector
+ * - Destroy the set itself via call_rcu()
+ */
+
static void
-ip_set_destroy_set(struct ip_set *set)
+ip_set_destroy_set_rcu(struct rcu_head *head)
{
- pr_debug("set: %s\n", set->name);
+ struct ip_set *set = container_of(head, struct ip_set, rcu);
- /* Must call it without holding any lock */
set->variant->destroy(set);
module_put(set->type->me);
kfree(set);
}
static void
-ip_set_destroy_set_rcu(struct rcu_head *head)
+_destroy_all_sets(struct ip_set_net *inst)
{
- struct ip_set *set = container_of(head, struct ip_set, rcu);
+ struct ip_set *set;
+ ip_set_id_t i;
+ bool need_wait = false;
- ip_set_destroy_set(set);
+ /* First cancel gc's: set:list sets are flushed as well */
+ for (i = 0; i < inst->ip_set_max; i++) {
+ set = ip_set(inst, i);
+ if (set) {
+ set->variant->cancel_gc(set);
+ if (set->type->features & IPSET_TYPE_NAME)
+ need_wait = true;
+ }
+ }
+ /* Must wait for flush to be really finished */
+ if (need_wait)
+ rcu_barrier();
+ for (i = 0; i < inst->ip_set_max; i++) {
+ set = ip_set(inst, i);
+ if (set) {
+ ip_set(inst, i) = NULL;
+ set->variant->destroy(set);
+ module_put(set->type->me);
+ kfree(set);
+ }
+ }
}
static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
@@ -1202,11 +1229,10 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
if (unlikely(protocol_min_failed(attr)))
return -IPSET_ERR_PROTOCOL;
-
/* Commands are serialized and references are
* protected by the ip_set_ref_lock.
* External systems (i.e. xt_set) must call
- * ip_set_put|get_nfnl_* functions, that way we
+ * ip_set_nfnl_get_* functions, that way we
* can safely check references here.
*
* list:set timer can only decrement the reference
@@ -1214,8 +1240,6 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
* without holding the lock.
*/
if (!attr[IPSET_ATTR_SETNAME]) {
- /* Must wait for flush to be really finished in list:set */
- rcu_barrier();
read_lock_bh(&ip_set_ref_lock);
for (i = 0; i < inst->ip_set_max; i++) {
s = ip_set(inst, i);
@@ -1226,15 +1250,7 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
}
inst->is_destroyed = true;
read_unlock_bh(&ip_set_ref_lock);
- for (i = 0; i < inst->ip_set_max; i++) {
- s = ip_set(inst, i);
- if (s) {
- ip_set(inst, i) = NULL;
- /* Must cancel garbage collectors */
- s->variant->cancel_gc(s);
- ip_set_destroy_set(s);
- }
- }
+ _destroy_all_sets(inst);
/* Modified by ip_set_destroy() only, which is serialized */
inst->is_destroyed = false;
} else {
@@ -1255,12 +1271,12 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
features = s->type->features;
ip_set(inst, i) = NULL;
read_unlock_bh(&ip_set_ref_lock);
+ /* Must cancel garbage collectors */
+ s->variant->cancel_gc(s);
if (features & IPSET_TYPE_NAME) {
/* Must wait for flush to be really finished */
rcu_barrier();
}
- /* Must cancel garbage collectors */
- s->variant->cancel_gc(s);
call_rcu(&s->rcu, ip_set_destroy_set_rcu);
}
return 0;
@@ -2365,30 +2381,25 @@ ip_set_net_init(struct net *net)
}
static void __net_exit
-ip_set_net_exit(struct net *net)
+ip_set_net_pre_exit(struct net *net)
{
struct ip_set_net *inst = ip_set_pernet(net);
- struct ip_set *set = NULL;
- ip_set_id_t i;
-
inst->is_deleted = true; /* flag for ip_set_nfnl_put */
+}
- nfnl_lock(NFNL_SUBSYS_IPSET);
- for (i = 0; i < inst->ip_set_max; i++) {
- set = ip_set(inst, i);
- if (set) {
- ip_set(inst, i) = NULL;
- set->variant->cancel_gc(set);
- ip_set_destroy_set(set);
- }
- }
- nfnl_unlock(NFNL_SUBSYS_IPSET);
+static void __net_exit
+ip_set_net_exit(struct net *net)
+{
+ struct ip_set_net *inst = ip_set_pernet(net);
+
+ _destroy_all_sets(inst);
kvfree(rcu_dereference_protected(inst->ip_set_list, 1));
}
static struct pernet_operations ip_set_net_ops = {
.init = ip_set_net_init,
+ .pre_exit = ip_set_net_pre_exit,
.exit = ip_set_net_exit,
.id = &ip_set_net_id,
.size = sizeof(struct ip_set_net),
diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
index 54e2a1dd7f5f5..bfae7066936bb 100644
--- a/net/netfilter/ipset/ip_set_list_set.c
+++ b/net/netfilter/ipset/ip_set_list_set.c
@@ -79,7 +79,7 @@ list_set_kadd(struct ip_set *set, const struct sk_buff *skb,
struct set_elem *e;
int ret;
- list_for_each_entry(e, &map->members, list) {
+ list_for_each_entry_rcu(e, &map->members, list) {
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
@@ -99,7 +99,7 @@ list_set_kdel(struct ip_set *set, const struct sk_buff *skb,
struct set_elem *e;
int ret;
- list_for_each_entry(e, &map->members, list) {
+ list_for_each_entry_rcu(e, &map->members, list) {
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
@@ -188,9 +188,10 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
struct list_set *map = set->data;
struct set_adt_elem *d = value;
struct set_elem *e, *next, *prev = NULL;
- int ret;
+ int ret = 0;
- list_for_each_entry(e, &map->members, list) {
+ rcu_read_lock();
+ list_for_each_entry_rcu(e, &map->members, list) {
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
@@ -201,6 +202,7 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
if (d->before == 0) {
ret = 1;
+ goto out;
} else if (d->before > 0) {
next = list_next_entry(e, list);
ret = !list_is_last(&e->list, &map->members) &&
@@ -208,9 +210,11 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
} else {
ret = prev && prev->id == d->refid;
}
- return ret;
+ goto out;
}
- return 0;
+out:
+ rcu_read_unlock();
+ return ret;
}
static void
@@ -239,7 +243,7 @@ list_set_uadd(struct ip_set *set, void *value, const struct ip_set_ext *ext,
/* Find where to add the new entry */
n = prev = next = NULL;
- list_for_each_entry(e, &map->members, list) {
+ list_for_each_entry_rcu(e, &map->members, list) {
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
@@ -316,9 +320,9 @@ list_set_udel(struct ip_set *set, void *value, const struct ip_set_ext *ext,
{
struct list_set *map = set->data;
struct set_adt_elem *d = value;
- struct set_elem *e, *next, *prev = NULL;
+ struct set_elem *e, *n, *next, *prev = NULL;
- list_for_each_entry(e, &map->members, list) {
+ list_for_each_entry_safe(e, n, &map->members, list) {
if (SET_WITH_TIMEOUT(set) &&
ip_set_timeout_expired(ext_timeout(e, set)))
continue;
@@ -424,14 +428,8 @@ static void
list_set_destroy(struct ip_set *set)
{
struct list_set *map = set->data;
- struct set_elem *e, *n;
- list_for_each_entry_safe(e, n, &map->members, list) {
- list_del(&e->list);
- ip_set_put_byindex(map->net, e->id);
- ip_set_ext_destroy(set, e);
- kfree(e);
- }
+ WARN_ON_ONCE(!list_empty(&map->members));
kfree(map);
set->data = NULL;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 156/267] netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
2024-06-19 12:55 ` [PATCH 6.6 156/267] netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type Greg Kroah-Hartman
@ 2024-06-19 13:09 ` Pablo Neira Ayuso
0 siblings, 0 replies; 282+ messages in thread
From: Pablo Neira Ayuso @ 2024-06-19 13:09 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, Lion Ackermann, Jozsef Kadlecsik, Sasha Levin
Hi Greg,
Please, hold on with this one patch for ipset until an incremental fix hits upstream.
On Wed, Jun 19, 2024 at 02:55:07PM +0200, Greg Kroah-Hartman wrote:
> 6.6-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Jozsef Kadlecsik <kadlec@netfilter.org>
>
> [ Upstream commit 4e7aaa6b82d63e8ddcbfb56b4fd3d014ca586f10 ]
>
> Lion Ackermann reported that there is a race condition between namespace cleanup
> in ipset and the garbage collection of the list:set type. The namespace
> cleanup can destroy the list:set type of sets while the gc of the set type is
> waiting to run in rcu cleanup. The latter uses data from the destroyed set which
> thus leads use after free. The patch contains the following parts:
>
> - When destroying all sets, first remove the garbage collectors, then wait
> if needed and then destroy the sets.
> - Fix the badly ordered "wait then remove gc" for the destroy a single set
> case.
> - Fix the missing rcu locking in the list:set type in the userspace test
> case.
> - Use proper RCU list handlings in the list:set type.
>
> The patch depends on c1193d9bbbd3 (netfilter: ipset: Add list flush to cancel_gc).
>
> Fixes: 97f7cf1cd80e (netfilter: ipset: fix performance regression in swap operation)
> Reported-by: Lion Ackermann <nnamrec@gmail.com>
> Tested-by: Lion Ackermann <nnamrec@gmail.com>
> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
> net/netfilter/ipset/ip_set_core.c | 81 +++++++++++++++------------
> net/netfilter/ipset/ip_set_list_set.c | 30 +++++-----
> 2 files changed, 60 insertions(+), 51 deletions(-)
>
> diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
> index 3184cc6be4c9d..c7ae4d9bf3d24 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -1172,23 +1172,50 @@ ip_set_setname_policy[IPSET_ATTR_CMD_MAX + 1] = {
> .len = IPSET_MAXNAMELEN - 1 },
> };
>
> +/* In order to return quickly when destroying a single set, it is split
> + * into two stages:
> + * - Cancel garbage collector
> + * - Destroy the set itself via call_rcu()
> + */
> +
> static void
> -ip_set_destroy_set(struct ip_set *set)
> +ip_set_destroy_set_rcu(struct rcu_head *head)
> {
> - pr_debug("set: %s\n", set->name);
> + struct ip_set *set = container_of(head, struct ip_set, rcu);
>
> - /* Must call it without holding any lock */
> set->variant->destroy(set);
> module_put(set->type->me);
> kfree(set);
> }
>
> static void
> -ip_set_destroy_set_rcu(struct rcu_head *head)
> +_destroy_all_sets(struct ip_set_net *inst)
> {
> - struct ip_set *set = container_of(head, struct ip_set, rcu);
> + struct ip_set *set;
> + ip_set_id_t i;
> + bool need_wait = false;
>
> - ip_set_destroy_set(set);
> + /* First cancel gc's: set:list sets are flushed as well */
> + for (i = 0; i < inst->ip_set_max; i++) {
> + set = ip_set(inst, i);
> + if (set) {
> + set->variant->cancel_gc(set);
> + if (set->type->features & IPSET_TYPE_NAME)
> + need_wait = true;
> + }
> + }
> + /* Must wait for flush to be really finished */
> + if (need_wait)
> + rcu_barrier();
> + for (i = 0; i < inst->ip_set_max; i++) {
> + set = ip_set(inst, i);
> + if (set) {
> + ip_set(inst, i) = NULL;
> + set->variant->destroy(set);
> + module_put(set->type->me);
> + kfree(set);
> + }
> + }
> }
>
> static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
> @@ -1202,11 +1229,10 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
> if (unlikely(protocol_min_failed(attr)))
> return -IPSET_ERR_PROTOCOL;
>
> -
> /* Commands are serialized and references are
> * protected by the ip_set_ref_lock.
> * External systems (i.e. xt_set) must call
> - * ip_set_put|get_nfnl_* functions, that way we
> + * ip_set_nfnl_get_* functions, that way we
> * can safely check references here.
> *
> * list:set timer can only decrement the reference
> @@ -1214,8 +1240,6 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
> * without holding the lock.
> */
> if (!attr[IPSET_ATTR_SETNAME]) {
> - /* Must wait for flush to be really finished in list:set */
> - rcu_barrier();
> read_lock_bh(&ip_set_ref_lock);
> for (i = 0; i < inst->ip_set_max; i++) {
> s = ip_set(inst, i);
> @@ -1226,15 +1250,7 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
> }
> inst->is_destroyed = true;
> read_unlock_bh(&ip_set_ref_lock);
> - for (i = 0; i < inst->ip_set_max; i++) {
> - s = ip_set(inst, i);
> - if (s) {
> - ip_set(inst, i) = NULL;
> - /* Must cancel garbage collectors */
> - s->variant->cancel_gc(s);
> - ip_set_destroy_set(s);
> - }
> - }
> + _destroy_all_sets(inst);
> /* Modified by ip_set_destroy() only, which is serialized */
> inst->is_destroyed = false;
> } else {
> @@ -1255,12 +1271,12 @@ static int ip_set_destroy(struct sk_buff *skb, const struct nfnl_info *info,
> features = s->type->features;
> ip_set(inst, i) = NULL;
> read_unlock_bh(&ip_set_ref_lock);
> + /* Must cancel garbage collectors */
> + s->variant->cancel_gc(s);
> if (features & IPSET_TYPE_NAME) {
> /* Must wait for flush to be really finished */
> rcu_barrier();
> }
> - /* Must cancel garbage collectors */
> - s->variant->cancel_gc(s);
> call_rcu(&s->rcu, ip_set_destroy_set_rcu);
> }
> return 0;
> @@ -2365,30 +2381,25 @@ ip_set_net_init(struct net *net)
> }
>
> static void __net_exit
> -ip_set_net_exit(struct net *net)
> +ip_set_net_pre_exit(struct net *net)
> {
> struct ip_set_net *inst = ip_set_pernet(net);
>
> - struct ip_set *set = NULL;
> - ip_set_id_t i;
> -
> inst->is_deleted = true; /* flag for ip_set_nfnl_put */
> +}
>
> - nfnl_lock(NFNL_SUBSYS_IPSET);
> - for (i = 0; i < inst->ip_set_max; i++) {
> - set = ip_set(inst, i);
> - if (set) {
> - ip_set(inst, i) = NULL;
> - set->variant->cancel_gc(set);
> - ip_set_destroy_set(set);
> - }
> - }
> - nfnl_unlock(NFNL_SUBSYS_IPSET);
> +static void __net_exit
> +ip_set_net_exit(struct net *net)
> +{
> + struct ip_set_net *inst = ip_set_pernet(net);
> +
> + _destroy_all_sets(inst);
> kvfree(rcu_dereference_protected(inst->ip_set_list, 1));
> }
>
> static struct pernet_operations ip_set_net_ops = {
> .init = ip_set_net_init,
> + .pre_exit = ip_set_net_pre_exit,
> .exit = ip_set_net_exit,
> .id = &ip_set_net_id,
> .size = sizeof(struct ip_set_net),
> diff --git a/net/netfilter/ipset/ip_set_list_set.c b/net/netfilter/ipset/ip_set_list_set.c
> index 54e2a1dd7f5f5..bfae7066936bb 100644
> --- a/net/netfilter/ipset/ip_set_list_set.c
> +++ b/net/netfilter/ipset/ip_set_list_set.c
> @@ -79,7 +79,7 @@ list_set_kadd(struct ip_set *set, const struct sk_buff *skb,
> struct set_elem *e;
> int ret;
>
> - list_for_each_entry(e, &map->members, list) {
> + list_for_each_entry_rcu(e, &map->members, list) {
> if (SET_WITH_TIMEOUT(set) &&
> ip_set_timeout_expired(ext_timeout(e, set)))
> continue;
> @@ -99,7 +99,7 @@ list_set_kdel(struct ip_set *set, const struct sk_buff *skb,
> struct set_elem *e;
> int ret;
>
> - list_for_each_entry(e, &map->members, list) {
> + list_for_each_entry_rcu(e, &map->members, list) {
> if (SET_WITH_TIMEOUT(set) &&
> ip_set_timeout_expired(ext_timeout(e, set)))
> continue;
> @@ -188,9 +188,10 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
> struct list_set *map = set->data;
> struct set_adt_elem *d = value;
> struct set_elem *e, *next, *prev = NULL;
> - int ret;
> + int ret = 0;
>
> - list_for_each_entry(e, &map->members, list) {
> + rcu_read_lock();
> + list_for_each_entry_rcu(e, &map->members, list) {
> if (SET_WITH_TIMEOUT(set) &&
> ip_set_timeout_expired(ext_timeout(e, set)))
> continue;
> @@ -201,6 +202,7 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
>
> if (d->before == 0) {
> ret = 1;
> + goto out;
> } else if (d->before > 0) {
> next = list_next_entry(e, list);
> ret = !list_is_last(&e->list, &map->members) &&
> @@ -208,9 +210,11 @@ list_set_utest(struct ip_set *set, void *value, const struct ip_set_ext *ext,
> } else {
> ret = prev && prev->id == d->refid;
> }
> - return ret;
> + goto out;
> }
> - return 0;
> +out:
> + rcu_read_unlock();
> + return ret;
> }
>
> static void
> @@ -239,7 +243,7 @@ list_set_uadd(struct ip_set *set, void *value, const struct ip_set_ext *ext,
>
> /* Find where to add the new entry */
> n = prev = next = NULL;
> - list_for_each_entry(e, &map->members, list) {
> + list_for_each_entry_rcu(e, &map->members, list) {
> if (SET_WITH_TIMEOUT(set) &&
> ip_set_timeout_expired(ext_timeout(e, set)))
> continue;
> @@ -316,9 +320,9 @@ list_set_udel(struct ip_set *set, void *value, const struct ip_set_ext *ext,
> {
> struct list_set *map = set->data;
> struct set_adt_elem *d = value;
> - struct set_elem *e, *next, *prev = NULL;
> + struct set_elem *e, *n, *next, *prev = NULL;
>
> - list_for_each_entry(e, &map->members, list) {
> + list_for_each_entry_safe(e, n, &map->members, list) {
> if (SET_WITH_TIMEOUT(set) &&
> ip_set_timeout_expired(ext_timeout(e, set)))
> continue;
> @@ -424,14 +428,8 @@ static void
> list_set_destroy(struct ip_set *set)
> {
> struct list_set *map = set->data;
> - struct set_elem *e, *n;
>
> - list_for_each_entry_safe(e, n, &map->members, list) {
> - list_del(&e->list);
> - ip_set_put_byindex(map->net, e->id);
> - ip_set_ext_destroy(set, e);
> - kfree(e);
> - }
> + WARN_ON_ONCE(!list_empty(&map->members));
> kfree(map);
>
> set->data = NULL;
> --
> 2.43.0
>
>
>
^ permalink raw reply [flat|nested] 282+ messages in thread
* [PATCH 6.6 157/267] x86/asm: Use %c/%n instead of %P operand modifier in asm templates
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (155 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 156/267] netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 158/267] x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking Greg Kroah-Hartman
` (121 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Uros Bizjak, Ingo Molnar,
Linus Torvalds, Josh Poimboeuf, Ard Biesheuvel, H. Peter Anvin,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uros Bizjak <ubizjak@gmail.com>
[ Upstream commit 41cd2e1ee96e56401a18dbce6f42f0bdaebcbf3b ]
The "P" asm operand modifier is a x86 target-specific modifier.
When used with a constant, the "P" modifier emits "cst" instead of
"$cst". This property is currently used to emit the bare constant
without all syntax-specific prefixes.
The generic "c" resp. "n" operand modifier should be used instead.
No functional changes intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20240319104418.284519-3-ubizjak@gmail.com
Stable-dep-of: 8c860ed825cb ("x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/boot/main.c | 4 ++--
arch/x86/include/asm/alternative.h | 22 +++++++++++-----------
arch/x86/include/asm/atomic64_32.h | 2 +-
arch/x86/include/asm/cpufeature.h | 2 +-
arch/x86/include/asm/irq_stack.h | 2 +-
arch/x86/include/asm/uaccess.h | 4 ++--
6 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/arch/x86/boot/main.c b/arch/x86/boot/main.c
index c4ea5258ab558..9049f390d8347 100644
--- a/arch/x86/boot/main.c
+++ b/arch/x86/boot/main.c
@@ -119,8 +119,8 @@ static void init_heap(void)
char *stack_end;
if (boot_params.hdr.loadflags & CAN_USE_HEAP) {
- asm("leal %P1(%%esp),%0"
- : "=r" (stack_end) : "i" (-STACK_SIZE));
+ asm("leal %n1(%%esp),%0"
+ : "=r" (stack_end) : "i" (STACK_SIZE));
heap_end = (char *)
((size_t)boot_params.hdr.heap_end_ptr + 0x200);
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 65f79092c9d9e..cb9ce0f9e78e0 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -288,10 +288,10 @@ static inline int alternatives_text_reserved(void *start, void *end)
* Otherwise, if CPU has feature1, newinstr1 is used.
* Otherwise, oldinstr is used.
*/
-#define alternative_input_2(oldinstr, newinstr1, ft_flags1, newinstr2, \
- ft_flags2, input...) \
- asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, \
- newinstr2, ft_flags2) \
+#define alternative_input_2(oldinstr, newinstr1, ft_flags1, newinstr2, \
+ ft_flags2, input...) \
+ asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, ft_flags1, \
+ newinstr2, ft_flags2) \
: : "i" (0), ## input)
/* Like alternative_input, but with a single output argument */
@@ -301,7 +301,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
/* Like alternative_io, but for replacing a direct call with another one. */
#define alternative_call(oldfunc, newfunc, ft_flags, output, input...) \
- asm_inline volatile (ALTERNATIVE("call %P[old]", "call %P[new]", ft_flags) \
+ asm_inline volatile (ALTERNATIVE("call %c[old]", "call %c[new]", ft_flags) \
: output : [old] "i" (oldfunc), [new] "i" (newfunc), ## input)
/*
@@ -310,12 +310,12 @@ static inline int alternatives_text_reserved(void *start, void *end)
* Otherwise, if CPU has feature1, function1 is used.
* Otherwise, old function is used.
*/
-#define alternative_call_2(oldfunc, newfunc1, ft_flags1, newfunc2, ft_flags2, \
- output, input...) \
- asm_inline volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", ft_flags1,\
- "call %P[new2]", ft_flags2) \
- : output, ASM_CALL_CONSTRAINT \
- : [old] "i" (oldfunc), [new1] "i" (newfunc1), \
+#define alternative_call_2(oldfunc, newfunc1, ft_flags1, newfunc2, ft_flags2, \
+ output, input...) \
+ asm_inline volatile (ALTERNATIVE_2("call %c[old]", "call %c[new1]", ft_flags1, \
+ "call %c[new2]", ft_flags2) \
+ : output, ASM_CALL_CONSTRAINT \
+ : [old] "i" (oldfunc), [new1] "i" (newfunc1), \
[new2] "i" (newfunc2), ## input)
/*
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index 3486d91b8595f..d510405e4e1de 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -24,7 +24,7 @@ typedef struct {
#ifdef CONFIG_X86_CMPXCHG64
#define __alternative_atomic64(f, g, out, in...) \
- asm volatile("call %P[func]" \
+ asm volatile("call %c[func]" \
: out : [func] "i" (atomic64_##g##_cx8), ## in)
#define ATOMIC64_DECL(sym) ATOMIC64_DECL_ONE(sym##_cx8)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 686e92d2663ee..3508f3fc928d4 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -173,7 +173,7 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
static __always_inline bool _static_cpu_has(u16 bit)
{
asm goto(
- ALTERNATIVE_TERNARY("jmp 6f", %P[feature], "", "jmp %l[t_no]")
+ ALTERNATIVE_TERNARY("jmp 6f", %c[feature], "", "jmp %l[t_no]")
".pushsection .altinstr_aux,\"ax\"\n"
"6:\n"
" testb %[bitnum]," _ASM_RIP(%P[cap_byte]) "\n"
diff --git a/arch/x86/include/asm/irq_stack.h b/arch/x86/include/asm/irq_stack.h
index 798183867d789..b71ad173f8776 100644
--- a/arch/x86/include/asm/irq_stack.h
+++ b/arch/x86/include/asm/irq_stack.h
@@ -100,7 +100,7 @@
}
#define ASM_CALL_ARG0 \
- "call %P[__func] \n" \
+ "call %c[__func] \n" \
ASM_REACHABLE
#define ASM_CALL_ARG1 \
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 237dc8cdd12b9..0f9bab92a43d7 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -78,7 +78,7 @@ extern int __get_user_bad(void);
int __ret_gu; \
register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \
__chk_user_ptr(ptr); \
- asm volatile("call __" #fn "_%P4" \
+ asm volatile("call __" #fn "_%c4" \
: "=a" (__ret_gu), "=r" (__val_gu), \
ASM_CALL_CONSTRAINT \
: "0" (ptr), "i" (sizeof(*(ptr)))); \
@@ -177,7 +177,7 @@ extern void __put_user_nocheck_8(void);
__chk_user_ptr(__ptr); \
__ptr_pu = __ptr; \
__val_pu = __x; \
- asm volatile("call __" #fn "_%P[size]" \
+ asm volatile("call __" #fn "_%c[size]" \
: "=c" (__ret_pu), \
ASM_CALL_CONSTRAINT \
: "0" (__ptr_pu), \
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 158/267] x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (156 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 157/267] x86/asm: Use %c/%n instead of %P operand modifier in asm templates Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 159/267] scsi: ufs: core: Quiesce request queues before checking pending cmds Greg Kroah-Hartman
` (120 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, David Gow, Kees Cook, Dave Hansen,
Kirill A. Shutemov, Qiuxu Zhuo, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook <kees@kernel.org>
[ Upstream commit 8c860ed825cb85f6672cd7b10a8f33e3498a7c81 ]
When reworking the range checking for get_user(), the get_user_8() case
on 32-bit wasn't zeroing the high register. (The jump to bad_get_user_8
was accidentally dropped.) Restore the correct error handling
destination (and rename the jump to using the expected ".L" prefix).
While here, switch to using a named argument ("size") for the call
template ("%c4" to "%c[size]") as already used in the other call
templates in this file.
Found after moving the usercopy selftests to KUnit:
# usercopy_test_invalid: EXPECTATION FAILED at
lib/usercopy_kunit.c:278
Expected val_u64 == 0, but
val_u64 == -60129542144 (0xfffffff200000000)
Closes: https://lore.kernel.org/all/CABVgOSn=tb=Lj9SxHuT4_9MTjjKVxsq-ikdXC4kGHO4CfKVmGQ@mail.gmail.com
Fixes: b19b74bc99b1 ("x86/mm: Rework address range check in get_user() and put_user()")
Reported-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Tested-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/all/20240610210213.work.143-kees%40kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/include/asm/uaccess.h | 4 ++--
arch/x86/lib/getuser.S | 6 +++++-
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 0f9bab92a43d7..3a7755c1a4410 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -78,10 +78,10 @@ extern int __get_user_bad(void);
int __ret_gu; \
register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \
__chk_user_ptr(ptr); \
- asm volatile("call __" #fn "_%c4" \
+ asm volatile("call __" #fn "_%c[size]" \
: "=a" (__ret_gu), "=r" (__val_gu), \
ASM_CALL_CONSTRAINT \
- : "0" (ptr), "i" (sizeof(*(ptr)))); \
+ : "0" (ptr), [size] "i" (sizeof(*(ptr)))); \
instrument_get_user(__val_gu); \
(x) = (__force __typeof__(*(ptr))) __val_gu; \
__builtin_expect(__ret_gu, 0); \
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index f6aad480febd3..6913fbce6544f 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -44,7 +44,11 @@
or %rdx, %rax
.else
cmp $TASK_SIZE_MAX-\size+1, %eax
+.if \size != 8
jae .Lbad_get_user
+.else
+ jae .Lbad_get_user_8
+.endif
sbb %edx, %edx /* array_index_mask_nospec() */
and %edx, %eax
.endif
@@ -154,7 +158,7 @@ SYM_CODE_END(__get_user_handle_exception)
#ifdef CONFIG_X86_32
SYM_CODE_START_LOCAL(__get_user_8_handle_exception)
ASM_CLAC
-bad_get_user_8:
+.Lbad_get_user_8:
xor %edx,%edx
xor %ecx,%ecx
mov $(-EFAULT),%_ASM_AX
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 159/267] scsi: ufs: core: Quiesce request queues before checking pending cmds
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (157 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 158/267] x86/uaccess: Fix missed zeroing of ia32 u64 get_user() range checking Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 160/267] net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP Greg Kroah-Hartman
` (119 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Can Guo, Ziqi Chen, Bart Van Assche,
Martin K. Petersen, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ziqi Chen <quic_ziqichen@quicinc.com>
[ Upstream commit 77691af484e28af7a692e511b9ed5ca63012ec6e ]
In ufshcd_clock_scaling_prepare(), after SCSI layer is blocked,
ufshcd_pending_cmds() is called to check whether there are pending
transactions or not. And only if there are no pending transactions can we
proceed to kickstart the clock scaling sequence.
ufshcd_pending_cmds() traverses over all SCSI devices and calls
sbitmap_weight() on their budget_map. sbitmap_weight() can be broken down
to three steps:
1. Calculate the nr outstanding bits set in the 'word' bitmap.
2. Calculate the nr outstanding bits set in the 'cleared' bitmap.
3. Subtract the result from step 1 by the result from step 2.
This can lead to a race condition as outlined below:
Assume there is one pending transaction in the request queue of one SCSI
device, say sda, and the budget token of this request is 0, the 'word' is
0x1 and the 'cleared' is 0x0.
1. When step 1 executes, it gets the result as 1.
2. Before step 2 executes, block layer tries to dispatch a new request to
sda. Since the SCSI layer is blocked, the request cannot pass through
SCSI but the block layer would do budget_get() and budget_put() to
sda's budget map regardless, so the 'word' has become 0x3 and 'cleared'
has become 0x2 (assume the new request got budget token 1).
3. When step 2 executes, it gets the result as 1.
4. When step 3 executes, it gets the result as 0, meaning there is no
pending transactions, which is wrong.
Thread A Thread B
ufshcd_pending_cmds() __blk_mq_sched_dispatch_requests()
| |
sbitmap_weight(word) |
| scsi_mq_get_budget()
| |
| scsi_mq_put_budget()
| |
sbitmap_weight(cleared)
...
When this race condition happens, the clock scaling sequence is started
with transactions still in flight, leading to subsequent hibernate enter
failure, broken link, task abort and back to back error recovery.
Fix this race condition by quiescing the request queues before calling
ufshcd_pending_cmds() so that block layer won't touch the budget map when
ufshcd_pending_cmds() is working on it. In addition, remove the SCSI layer
blocking/unblocking to reduce redundancies and latencies.
Fixes: 8d077ede48c1 ("scsi: ufs: Optimize the command queueing code")
Co-developed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Ziqi Chen <quic_ziqichen@quicinc.com>
Link: https://lore.kernel.org/r/1717754818-39863-1-git-send-email-quic_ziqichen@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/ufs/core/ufshcd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 589c90f4d4021..40689757a2690 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1267,7 +1267,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
* make sure that there are no outstanding requests when
* clock scaling is in progress
*/
- ufshcd_scsi_block_requests(hba);
+ blk_mq_quiesce_tagset(&hba->host->tag_set);
mutex_lock(&hba->wb_mutex);
down_write(&hba->clk_scaling_lock);
@@ -1276,7 +1276,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
mutex_unlock(&hba->wb_mutex);
- ufshcd_scsi_unblock_requests(hba);
+ blk_mq_unquiesce_tagset(&hba->host->tag_set);
goto out;
}
@@ -1297,7 +1297,7 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err, bool sc
mutex_unlock(&hba->wb_mutex);
- ufshcd_scsi_unblock_requests(hba);
+ blk_mq_unquiesce_tagset(&hba->host->tag_set);
ufshcd_release(hba);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 160/267] net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (158 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 159/267] scsi: ufs: core: Quiesce request queues before checking pending cmds Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 161/267] gve: ignore nonrelevant GSO type bits when processing TSO headers Greg Kroah-Hartman
` (118 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Andrew Lunn, Oleksij Rempel,
Kory Maincent, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kory Maincent <kory.maincent@bootlin.com>
[ Upstream commit 144ba8580bcb82b2686c3d1a043299d844b9a682 ]
ENOTSUPP is not a SUSV4 error code, prefer EOPNOTSUPP as reported by
checkpatch script.
Fixes: 18ff0bcda6d1 ("ethtool: add interface to interact with Ethernet Power Equipment")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://lore.kernel.org/r/20240610083426.740660-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/pse-pd/pse.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index fb724c65c77bc..5ce0cd76956e0 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -114,14 +114,14 @@ static inline int pse_ethtool_get_status(struct pse_control *psec,
struct netlink_ext_ack *extack,
struct pse_control_status *status)
{
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
}
static inline int pse_ethtool_set_config(struct pse_control *psec,
struct netlink_ext_ack *extack,
const struct pse_control_config *config)
{
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
}
#endif
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 161/267] gve: ignore nonrelevant GSO type bits when processing TSO headers
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (159 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 160/267] net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 162/267] net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters Greg Kroah-Hartman
` (117 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Joshua Washington,
Praveen Kaligineedi, Harshitha Ramamurthy, Willem de Bruijn,
Eric Dumazet, Andrei Vagin, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Washington <joshwash@google.com>
[ Upstream commit 1b9f756344416e02b41439bf2324b26aa25e141c ]
TSO currently fails when the skb's gso_type field has more than one bit
set.
TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a
few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes
virtualization, such as QEMU, a real use-case.
The gso_type and gso_size fields as passed from userspace in
virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type
|= SKB_GSO_DODGY to force the packet to enter the software GSO stack
for verification.
This issue might similarly come up when the CWR bit is set in the TCP
header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit
to be set.
Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
v2 - Remove unnecessary comments, remove line break between fixes tag
and signoffs.
v3 - Add back unrelated empty line removal.
Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/google/gve/gve_tx_dqo.c | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
index 1e19b834a6130..5a44354bbdfdf 100644
--- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
@@ -501,28 +501,18 @@ static int gve_prep_tso(struct sk_buff *skb)
if (unlikely(skb_shinfo(skb)->gso_size < GVE_TX_MIN_TSO_MSS_DQO))
return -1;
+ if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
+ return -EINVAL;
+
/* Needed because we will modify header. */
err = skb_cow_head(skb, 0);
if (err < 0)
return err;
tcp = tcp_hdr(skb);
-
- /* Remove payload length from checksum. */
paylen = skb->len - skb_transport_offset(skb);
-
- switch (skb_shinfo(skb)->gso_type) {
- case SKB_GSO_TCPV4:
- case SKB_GSO_TCPV6:
- csum_replace_by_diff(&tcp->check,
- (__force __wsum)htonl(paylen));
-
- /* Compute length of segmentation header. */
- header_len = skb_tcp_all_headers(skb);
- break;
- default:
- return -EINVAL;
- }
+ csum_replace_by_diff(&tcp->check, (__force __wsum)htonl(paylen));
+ header_len = skb_tcp_all_headers(skb);
if (unlikely(header_len > GVE_TX_MAX_HDR_SIZE_DQO))
return -EINVAL;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 162/267] net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (160 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 161/267] gve: ignore nonrelevant GSO type bits when processing TSO headers Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 163/267] block: sed-opal: avoid possible wrong address reference in read_sed_opal_key() Greg Kroah-Hartman
` (116 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Xiaolei Wang, Wojciech Drewek,
Vladimir Oltean, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaolei Wang <xiaolei.wang@windriver.com>
[ Upstream commit be27b896529787e23a35ae4befb6337ce73fcca0 ]
The current cbs parameter depends on speed after uplinking,
which is not needed and will report a configuration error
if the port is not initially connected. The UAPI exposed by
tc-cbs requires userspace to recalculate the send slope anyway,
because the formula depends on port_transmit_rate (see man tc-cbs),
which is not an invariant from tc's perspective. Therefore, we
use offload->sendslope and offload->idleslope to derive the
original port_transmit_rate from the CBS formula.
Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
.../net/ethernet/stmicro/stmmac/stmmac_tc.c | 25 ++++++++-----------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 6ad3e0a119366..2467598f9d92f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -343,10 +343,11 @@ static int tc_setup_cbs(struct stmmac_priv *priv,
struct tc_cbs_qopt_offload *qopt)
{
u32 tx_queues_count = priv->plat->tx_queues_to_use;
+ s64 port_transmit_rate_kbps;
u32 queue = qopt->queue;
- u32 ptr, speed_div;
u32 mode_to_use;
u64 value;
+ u32 ptr;
int ret;
/* Queue 0 is not AVB capable */
@@ -355,30 +356,26 @@ static int tc_setup_cbs(struct stmmac_priv *priv,
if (!priv->dma_cap.av)
return -EOPNOTSUPP;
+ port_transmit_rate_kbps = qopt->idleslope - qopt->sendslope;
+
/* Port Transmit Rate and Speed Divider */
- switch (priv->speed) {
+ switch (div_s64(port_transmit_rate_kbps, 1000)) {
case SPEED_10000:
- ptr = 32;
- speed_div = 10000000;
- break;
case SPEED_5000:
ptr = 32;
- speed_div = 5000000;
break;
case SPEED_2500:
- ptr = 8;
- speed_div = 2500000;
- break;
case SPEED_1000:
ptr = 8;
- speed_div = 1000000;
break;
case SPEED_100:
ptr = 4;
- speed_div = 100000;
break;
default:
- return -EOPNOTSUPP;
+ netdev_err(priv->dev,
+ "Invalid portTransmitRate %lld (idleSlope - sendSlope)\n",
+ port_transmit_rate_kbps);
+ return -EINVAL;
}
mode_to_use = priv->plat->tx_queues_cfg[queue].mode_to_use;
@@ -398,10 +395,10 @@ static int tc_setup_cbs(struct stmmac_priv *priv,
}
/* Final adjustments for HW */
- value = div_s64(qopt->idleslope * 1024ll * ptr, speed_div);
+ value = div_s64(qopt->idleslope * 1024ll * ptr, port_transmit_rate_kbps);
priv->plat->tx_queues_cfg[queue].idle_slope = value & GENMASK(31, 0);
- value = div_s64(-qopt->sendslope * 1024ll * ptr, speed_div);
+ value = div_s64(-qopt->sendslope * 1024ll * ptr, port_transmit_rate_kbps);
priv->plat->tx_queues_cfg[queue].send_slope = value & GENMASK(31, 0);
value = qopt->hicredit * 1024ll * 8;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 163/267] block: sed-opal: avoid possible wrong address reference in read_sed_opal_key()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (161 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 162/267] net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 164/267] block: fix request.queuelist usage in flush Greg Kroah-Hartman
` (115 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Su Hui, Jens Axboe, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui <suhui@nfschina.com>
[ Upstream commit 9b1ebce6a1fded90d4a1c6c57dc6262dac4c4c14 ]
Clang static checker (scan-build) warning:
block/sed-opal.c:line 317, column 3
Value stored to 'ret' is never read.
Fix this problem by returning the error code when keyring_search() failed.
Otherwise, 'key' will have a wrong value when 'kerf' stores the error code.
Fixes: 3bfeb6125664 ("block: sed-opal: keyring support for SED keys")
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20240611073659.429582-1-suhui@nfschina.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
block/sed-opal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/sed-opal.c b/block/sed-opal.c
index e27109be77690..1a1cb35bf4b79 100644
--- a/block/sed-opal.c
+++ b/block/sed-opal.c
@@ -313,7 +313,7 @@ static int read_sed_opal_key(const char *key_name, u_char *buffer, int buflen)
&key_type_user, key_name, true);
if (IS_ERR(kref))
- ret = PTR_ERR(kref);
+ return PTR_ERR(kref);
key = key_ref_to_ptr(kref);
down_read(&key->sem);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 164/267] block: fix request.queuelist usage in flush
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (162 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 163/267] block: sed-opal: avoid possible wrong address reference in read_sed_opal_key() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 165/267] nvmet-passthru: propagate status from id override functions Greg Kroah-Hartman
` (114 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Friedrich Weber, Christoph Hellwig,
ming.lei, bvanassche, Chengming Zhou, Jens Axboe, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengming Zhou <chengming.zhou@linux.dev>
[ Upstream commit d0321c812d89c5910d8da8e4b10c891c6b96ff70 ]
Friedrich Weber reported a kernel crash problem and bisected to commit
81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine").
The root cause is that we use "list_move_tail(&rq->queuelist, pending)"
in the PREFLUSH/POSTFLUSH sequences. But rq->queuelist.next == xxx since
it's popped out from plug->cached_rq in __blk_mq_alloc_requests_batch().
We don't initialize its queuelist just for this first request, although
the queuelist of all later popped requests will be initialized.
Fix it by changing to use "list_add_tail(&rq->queuelist, pending)" so
rq->queuelist doesn't need to be initialized. It should be ok since rq
can't be on any list when PREFLUSH or POSTFLUSH, has no move actually.
Please note the commit 81ada09cc25e ("blk-flush: reuse rq queuelist in
flush state machine") also has another requirement that no drivers would
touch rq->queuelist after blk_mq_end_request() since we will reuse it to
add rq to the post-flush pending list in POSTFLUSH. If this is not true,
we will have to revert that commit IMHO.
This updated version adds "list_del_init(&rq->queuelist)" in flush rq
callback since the dm layer may submit request of a weird invalid format
(REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH), which causes double list_add
if without this "list_del_init(&rq->queuelist)". The weird invalid format
problem should be fixed in dm layer.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Closes: https://lore.kernel.org/lkml/14b89dfb-505c-49f7-aebb-01c54451db40@proxmox.com/
Closes: https://lore.kernel.org/lkml/c9d03ff7-27c5-4ebd-b3f6-5a90d96f35ba@proxmox.com/
Fixes: 81ada09cc25e ("blk-flush: reuse rq queuelist in flush state machine")
Cc: Christoph Hellwig <hch@lst.de>
Cc: ming.lei@redhat.com
Cc: bvanassche@acm.org
Tested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240608143115.972486-1-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
block/blk-flush.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c
index e73dc22d05c1d..313f0ffcce42e 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -183,7 +183,7 @@ static void blk_flush_complete_seq(struct request *rq,
/* queue for flush */
if (list_empty(pending))
fq->flush_pending_since = jiffies;
- list_move_tail(&rq->queuelist, pending);
+ list_add_tail(&rq->queuelist, pending);
break;
case REQ_FSEQ_DATA:
@@ -261,6 +261,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
unsigned int seq = blk_flush_cur_seq(rq);
BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
+ list_del_init(&rq->queuelist);
blk_flush_complete_seq(rq, fq, seq, error);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 165/267] nvmet-passthru: propagate status from id override functions
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (163 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 164/267] block: fix request.queuelist usage in flush Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 166/267] net/ipv6: Fix the RT cache flush via sysctl using a previous delay Greg Kroah-Hartman
` (113 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Daniel Wagner, Chaitanya Kulkarni,
Christoph Hellwig, Keith Busch, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Wagner <dwagner@suse.de>
[ Upstream commit d76584e53f4244dbc154bec447c3852600acc914 ]
The id override functions return a status which is not propagated to the
caller.
Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/nvme/target/passthru.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvme/target/passthru.c b/drivers/nvme/target/passthru.c
index 9fe07d7efa96c..d4a61645d61a5 100644
--- a/drivers/nvme/target/passthru.c
+++ b/drivers/nvme/target/passthru.c
@@ -226,13 +226,13 @@ static void nvmet_passthru_execute_cmd_work(struct work_struct *w)
req->cmd->common.opcode == nvme_admin_identify) {
switch (req->cmd->identify.cns) {
case NVME_ID_CNS_CTRL:
- nvmet_passthru_override_id_ctrl(req);
+ status = nvmet_passthru_override_id_ctrl(req);
break;
case NVME_ID_CNS_NS:
- nvmet_passthru_override_id_ns(req);
+ status = nvmet_passthru_override_id_ns(req);
break;
case NVME_ID_CNS_NS_DESC_LIST:
- nvmet_passthru_override_id_descs(req);
+ status = nvmet_passthru_override_id_descs(req);
break;
}
} else if (status < 0)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 166/267] net/ipv6: Fix the RT cache flush via sysctl using a previous delay
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (164 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 165/267] nvmet-passthru: propagate status from id override functions Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 167/267] net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state Greg Kroah-Hartman
` (112 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Petr Pavlu, David Ahern,
Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Pavlu <petr.pavlu@suse.com>
[ Upstream commit 14a20e5b4ad998793c5f43b0330d9e1388446cf3 ]
The net.ipv6.route.flush system parameter takes a value which specifies
a delay used during the flush operation for aging exception routes. The
written value is however not used in the currently requested flush and
instead utilized only in the next one.
A problem is that ipv6_sysctl_rtcache_flush() first reads the old value
of net->ipv6.sysctl.flush_delay into a local delay variable and then
calls proc_dointvec() which actually updates the sysctl based on the
provided input.
Fix the problem by switching the order of the two operations.
Fixes: 4990509f19e8 ("[NETNS][IPV6]: Make sysctls route per namespace.")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/ipv6/route.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 0a37f04177337..29fa2ca07b46a 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -6332,12 +6332,12 @@ static int ipv6_sysctl_rtcache_flush(struct ctl_table *ctl, int write,
if (!write)
return -EINVAL;
- net = (struct net *)ctl->extra1;
- delay = net->ipv6.sysctl.flush_delay;
ret = proc_dointvec(ctl, write, buffer, lenp, ppos);
if (ret)
return ret;
+ net = (struct net *)ctl->extra1;
+ delay = net->ipv6.sysctl.flush_delay;
fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 167/267] net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (165 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 166/267] net/ipv6: Fix the RT cache flush via sysctl using a previous delay Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 168/267] net: bridge: mst: fix suspicious rcu usage in br_mst_set_state Greg Kroah-Hartman
` (111 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+9bbe2de1bc9d470eb5fe,
Nikolay Aleksandrov, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov <razor@blackwall.org>
[ Upstream commit 36c92936e868601fa1f43da6758cf55805043509 ]
Pass the already obtained vlan group pointer to br_mst_vlan_set_state()
instead of dereferencing it again. Each caller has already correctly
dereferenced it for their context. This change is required for the
following suspicious RCU dereference fix. No functional changes
intended.
Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free")
Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240609103654.914987-2-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/bridge/br_mst.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/net/bridge/br_mst.c b/net/bridge/br_mst.c
index 3c66141d34d62..1de72816b0fb2 100644
--- a/net/bridge/br_mst.c
+++ b/net/bridge/br_mst.c
@@ -73,11 +73,10 @@ int br_mst_get_state(const struct net_device *dev, u16 msti, u8 *state)
}
EXPORT_SYMBOL_GPL(br_mst_get_state);
-static void br_mst_vlan_set_state(struct net_bridge_port *p, struct net_bridge_vlan *v,
+static void br_mst_vlan_set_state(struct net_bridge_vlan_group *vg,
+ struct net_bridge_vlan *v,
u8 state)
{
- struct net_bridge_vlan_group *vg = nbp_vlan_group(p);
-
if (br_vlan_get_state(v) == state)
return;
@@ -121,7 +120,7 @@ int br_mst_set_state(struct net_bridge_port *p, u16 msti, u8 state,
if (v->brvlan->msti != msti)
continue;
- br_mst_vlan_set_state(p, v, state);
+ br_mst_vlan_set_state(vg, v, state);
}
out:
@@ -140,13 +139,13 @@ static void br_mst_vlan_sync_state(struct net_bridge_vlan *pv, u16 msti)
* it.
*/
if (v != pv && v->brvlan->msti == msti) {
- br_mst_vlan_set_state(pv->port, pv, v->state);
+ br_mst_vlan_set_state(vg, pv, v->state);
return;
}
}
/* Otherwise, start out in a new MSTI with all ports disabled. */
- return br_mst_vlan_set_state(pv->port, pv, BR_STATE_DISABLED);
+ return br_mst_vlan_set_state(vg, pv, BR_STATE_DISABLED);
}
int br_mst_vlan_set_msti(struct net_bridge_vlan *mv, u16 msti)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 168/267] net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (166 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 167/267] net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 169/267] ionic: fix use after netif_napi_del() Greg Kroah-Hartman
` (110 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, syzbot+9bbe2de1bc9d470eb5fe,
Nikolay Aleksandrov, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nikolay Aleksandrov <razor@blackwall.org>
[ Upstream commit 546ceb1dfdac866648ec959cbc71d9525bd73462 ]
I converted br_mst_set_state to RCU to avoid a vlan use-after-free
but forgot to change the vlan group dereference helper. Switch to vlan
group RCU deref helper to fix the suspicious rcu usage warning.
Fixes: 3a7c1661ae13 ("net: bridge: mst: fix vlan use-after-free")
Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20240609103654.914987-3-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/bridge/br_mst.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_mst.c b/net/bridge/br_mst.c
index 1de72816b0fb2..1820f09ff59ce 100644
--- a/net/bridge/br_mst.c
+++ b/net/bridge/br_mst.c
@@ -102,7 +102,7 @@ int br_mst_set_state(struct net_bridge_port *p, u16 msti, u8 state,
int err = 0;
rcu_read_lock();
- vg = nbp_vlan_group(p);
+ vg = nbp_vlan_group_rcu(p);
if (!vg)
goto out;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 169/267] ionic: fix use after netif_napi_del()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (167 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 168/267] net: bridge: mst: fix suspicious rcu usage in br_mst_set_state Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 170/267] af_unix: Read with MSG_PEEK loops if the first unread byte is OOB Greg Kroah-Hartman
` (109 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Taehee Yoo, Brett Creeley,
Shannon Nelson, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Taehee Yoo <ap420073@gmail.com>
[ Upstream commit 79f18a41dd056115d685f3b0a419c7cd40055e13 ]
When queues are started, netif_napi_add() and napi_enable() are called.
If there are 4 queues and only 3 queues are used for the current
configuration, only 3 queues' napi should be registered and enabled.
The ionic_qcq_enable() checks whether the .poll pointer is not NULL for
enabling only the using queue' napi. Unused queues' napi will not be
registered by netif_napi_add(), so the .poll pointer indicates NULL.
But it couldn't distinguish whether the napi was unregistered or not
because netif_napi_del() doesn't reset the .poll pointer to NULL.
So, ionic_qcq_enable() calls napi_enable() for the queue, which was
unregistered by netif_napi_del().
Reproducer:
ethtool -L <interface name> rx 1 tx 1 combined 0
ethtool -L <interface name> rx 0 tx 0 combined 1
ethtool -L <interface name> rx 0 tx 0 combined 4
Splat looks like:
kernel BUG at net/core/dev.c:6666!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
Workqueue: events ionic_lif_deferred_work [ionic]
RIP: 0010:napi_enable+0x3b/0x40
Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
<TASK>
? die+0x33/0x90
? do_trap+0xd9/0x100
? napi_enable+0x3b/0x40
? do_error_trap+0x83/0xb0
? napi_enable+0x3b/0x40
? napi_enable+0x3b/0x40
? exc_invalid_op+0x4e/0x70
? napi_enable+0x3b/0x40
? asm_exc_invalid_op+0x16/0x20
? napi_enable+0x3b/0x40
ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
process_one_work+0x145/0x360
worker_thread+0x2bb/0x3d0
? __pfx_worker_thread+0x10/0x10
kthread+0xcc/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2d/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/pensando/ionic/ionic_lif.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 4f05cddc65cb4..7e6e1bed525af 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -296,10 +296,8 @@ static int ionic_qcq_enable(struct ionic_qcq *qcq)
if (ret)
return ret;
- if (qcq->napi.poll)
- napi_enable(&qcq->napi);
-
if (qcq->flags & IONIC_QCQ_F_INTR) {
+ napi_enable(&qcq->napi);
irq_set_affinity_hint(qcq->intr.vector,
&qcq->intr.affinity_mask);
ionic_intr_mask(idev->intr_ctrl, qcq->intr.index,
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 170/267] af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (168 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 169/267] ionic: fix use after netif_napi_del() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 171/267] bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() Greg Kroah-Hartman
` (108 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Rao Shoaib, Kuniyuki Iwashima,
Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rao Shoaib <Rao.Shoaib@oracle.com>
[ Upstream commit a6736a0addd60fccc3a3508461d72314cc609772 ]
Read with MSG_PEEK flag loops if the first byte to read is an OOB byte.
commit 22dd70eb2c3d ("af_unix: Don't peek OOB data without MSG_OOB.")
addresses the loop issue but does not address the issue that no data
beyond OOB byte can be read.
>>> from socket import *
>>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
>>> c1.send(b'a', MSG_OOB)
1
>>> c1.send(b'b')
1
>>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
b'b'
>>> from socket import *
>>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
>>> c2.setsockopt(SOL_SOCKET, SO_OOBINLINE, 1)
>>> c1.send(b'a', MSG_OOB)
1
>>> c1.send(b'b')
1
>>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
b'a'
>>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
b'a'
>>> c2.recv(1, MSG_DONTWAIT)
b'a'
>>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
b'b'
>>>
Fixes: 314001f0bf92 ("af_unix: Add OOB support")
Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240611084639.2248934-1-Rao.Shoaib@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/unix/af_unix.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2596,18 +2596,18 @@ static struct sk_buff *manage_oob(struct
if (skb == u->oob_skb) {
if (copied) {
skb = NULL;
- } else if (sock_flag(sk, SOCK_URGINLINE)) {
- if (!(flags & MSG_PEEK)) {
+ } else if (!(flags & MSG_PEEK)) {
+ if (sock_flag(sk, SOCK_URGINLINE)) {
WRITE_ONCE(u->oob_skb, NULL);
consume_skb(skb);
+ } else {
+ __skb_unlink(skb, &sk->sk_receive_queue);
+ WRITE_ONCE(u->oob_skb, NULL);
+ unlinked_skb = skb;
+ skb = skb_peek(&sk->sk_receive_queue);
}
- } else if (flags & MSG_PEEK) {
- skb = NULL;
- } else {
- __skb_unlink(skb, &sk->sk_receive_queue);
- WRITE_ONCE(u->oob_skb, NULL);
- unlinked_skb = skb;
- skb = skb_peek(&sk->sk_receive_queue);
+ } else if (!sock_flag(sk, SOCK_URGINLINE)) {
+ skb = skb_peek_next(skb, &sk->sk_receive_queue);
}
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 171/267] bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (169 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 170/267] af_unix: Read with MSG_PEEK loops if the first unread byte is OOB Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 172/267] misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe() Greg Kroah-Hartman
` (107 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Michael Chan, Aleksandr Mishin,
Wojciech Drewek, Jakub Kicinski, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin <amishin@t-argos.ru>
[ Upstream commit a9b9741854a9fe9df948af49ca5514e0ed0429df ]
In case of token is released due to token->state == BNXT_HWRM_DEFERRED,
released token (set to NULL) is used in log messages. This issue is
expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But
this error code is returned by recent firmware. So some firmware may not
return it. This may lead to NULL pointer dereference.
Adjust this issue by adding token pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 8fa4219dba8e ("bnxt_en: add dynamic debug support for HWRM messages")
Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c
index 132442f16fe67..7a4e08b5a8c1b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_hwrm.c
@@ -678,7 +678,7 @@ static int __hwrm_send(struct bnxt *bp, struct bnxt_hwrm_ctx *ctx)
req_type);
else if (rc && rc != HWRM_ERR_CODE_PF_UNAVAILABLE)
hwrm_err(bp, ctx, "hwrm req_type 0x%x seq id 0x%x error 0x%x\n",
- req_type, token->seq_id, rc);
+ req_type, le16_to_cpu(ctx->req->seq_id), rc);
rc = __hwrm_to_stderr(rc);
exit:
if (token)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 172/267] misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (170 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 171/267] bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 173/267] ksmbd: move leading slash check to smb2_get_name() Greg Kroah-Hartman
` (106 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Yongzhi Liu
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongzhi Liu <hyperlyzcs@gmail.com>
commit 086c6cbcc563c81d55257f9b27e14faf1d0963d3 upstream.
When auxiliary_device_add() returns error and then calls
auxiliary_device_uninit(), callback function
gp_auxiliary_device_release() calls ida_free() and
kfree(aux_device_wrapper) to free memory. We should't
call them again in the error handling path.
Fix this by skipping the redundant cleanup functions.
Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com>
Link: https://lore.kernel.org/r/20240523121434.21855-3-hyperlyzcs@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
+++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
@@ -111,6 +111,7 @@ static int gp_aux_bus_probe(struct pci_d
err_aux_dev_add_1:
auxiliary_device_uninit(&aux_bus->aux_device_wrapper[1]->aux_dev);
+ goto err_aux_dev_add_0;
err_aux_dev_init_1:
ida_free(&gp_client_ida, aux_bus->aux_device_wrapper[1]->aux_dev.id);
@@ -120,6 +121,7 @@ err_ida_alloc_1:
err_aux_dev_add_0:
auxiliary_device_uninit(&aux_bus->aux_device_wrapper[0]->aux_dev);
+ goto err_ret;
err_aux_dev_init_0:
ida_free(&gp_client_ida, aux_bus->aux_device_wrapper[0]->aux_dev.id);
@@ -127,6 +129,7 @@ err_aux_dev_init_0:
err_ida_alloc_0:
kfree(aux_bus->aux_device_wrapper[0]);
+err_ret:
return retval;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 173/267] ksmbd: move leading slash check to smb2_get_name()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (171 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 172/267] misc: microchip: pci1xxxx: fix double free in the error handling of gp_aux_bus_probe() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 174/267] ksmbd: fix missing use of get_write in in smb2_set_ea() Greg Kroah-Hartman
` (105 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Namjae Jeon, Steve French
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon <linkinjeon@kernel.org>
commit 1cdeca6a7264021e20157de0baf7880ff0ced822 upstream.
If the directory name in the root of the share starts with
character like 镜(0x955c) or Ṝ(0x1e5c), it (and anything inside)
cannot be accessed. The leading slash check must be checked after
converting unicode to nls string.
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/smb/server/smb2pdu.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -630,6 +630,12 @@ smb2_get_name(const char *src, const int
return name;
}
+ if (*name == '\\') {
+ pr_err("not allow directory name included leading slash\n");
+ kfree(name);
+ return ERR_PTR(-EINVAL);
+ }
+
ksmbd_conv_path_to_unix(name);
ksmbd_strip_last_slash(name);
return name;
@@ -2842,20 +2848,11 @@ int smb2_open(struct ksmbd_work *work)
}
if (req->NameLength) {
- if ((req->CreateOptions & FILE_DIRECTORY_FILE_LE) &&
- *(char *)req->Buffer == '\\') {
- pr_err("not allow directory name included leading slash\n");
- rc = -EINVAL;
- goto err_out2;
- }
-
name = smb2_get_name((char *)req + le16_to_cpu(req->NameOffset),
le16_to_cpu(req->NameLength),
work->conn->local_nls);
if (IS_ERR(name)) {
rc = PTR_ERR(name);
- if (rc != -ENOMEM)
- rc = -ENOENT;
name = NULL;
goto err_out2;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 174/267] ksmbd: fix missing use of get_write in in smb2_set_ea()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (172 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 173/267] ksmbd: move leading slash check to smb2_get_name() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 175/267] x86/boot: Dont add the EFI stub to targets, again Greg Kroah-Hartman
` (104 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Wang Zhaolong, Namjae Jeon,
Steve French
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon <linkinjeon@kernel.org>
commit 2bfc4214c69c62da13a9da8e3c3db5539da2ccd3 upstream.
Fix an issue where get_write is not used in smb2_set_ea().
Fixes: 6fc0a265e1b9 ("ksmbd: fix potential circular locking issue in smb2_set_ea()")
Cc: stable@vger.kernel.org
Reported-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/smb/server/smb2pdu.c | 7 ++++---
fs/smb/server/vfs.c | 17 +++++++++++------
fs/smb/server/vfs.h | 3 ++-
fs/smb/server/vfs_cache.c | 3 ++-
4 files changed, 19 insertions(+), 11 deletions(-)
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -2367,7 +2367,8 @@ static int smb2_set_ea(struct smb2_ea_in
if (rc > 0) {
rc = ksmbd_vfs_remove_xattr(idmap,
path,
- attr_name);
+ attr_name,
+ get_write);
if (rc < 0) {
ksmbd_debug(SMB,
@@ -2382,7 +2383,7 @@ static int smb2_set_ea(struct smb2_ea_in
} else {
rc = ksmbd_vfs_setxattr(idmap, path, attr_name, value,
le16_to_cpu(eabuf->EaValueLength),
- 0, true);
+ 0, get_write);
if (rc < 0) {
ksmbd_debug(SMB,
"ksmbd_vfs_setxattr is failed(%d)\n",
@@ -2474,7 +2475,7 @@ static int smb2_remove_smb_xattrs(const
!strncmp(&name[XATTR_USER_PREFIX_LEN], STREAM_PREFIX,
STREAM_PREFIX_LEN)) {
err = ksmbd_vfs_remove_xattr(idmap, path,
- name);
+ name, true);
if (err)
ksmbd_debug(SMB, "remove xattr failed : %s\n",
name);
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -1053,16 +1053,21 @@ int ksmbd_vfs_fqar_lseek(struct ksmbd_fi
}
int ksmbd_vfs_remove_xattr(struct mnt_idmap *idmap,
- const struct path *path, char *attr_name)
+ const struct path *path, char *attr_name,
+ bool get_write)
{
int err;
- err = mnt_want_write(path->mnt);
- if (err)
- return err;
+ if (get_write == true) {
+ err = mnt_want_write(path->mnt);
+ if (err)
+ return err;
+ }
err = vfs_removexattr(idmap, path->dentry, attr_name);
- mnt_drop_write(path->mnt);
+
+ if (get_write == true)
+ mnt_drop_write(path->mnt);
return err;
}
@@ -1375,7 +1380,7 @@ int ksmbd_vfs_remove_sd_xattrs(struct mn
ksmbd_debug(SMB, "%s, len %zd\n", name, strlen(name));
if (!strncmp(name, XATTR_NAME_SD, XATTR_NAME_SD_LEN)) {
- err = ksmbd_vfs_remove_xattr(idmap, path, name);
+ err = ksmbd_vfs_remove_xattr(idmap, path, name, true);
if (err)
ksmbd_debug(SMB, "remove xattr failed : %s\n", name);
}
--- a/fs/smb/server/vfs.h
+++ b/fs/smb/server/vfs.h
@@ -114,7 +114,8 @@ int ksmbd_vfs_setxattr(struct mnt_idmap
int ksmbd_vfs_xattr_stream_name(char *stream_name, char **xattr_stream_name,
size_t *xattr_stream_name_size, int s_type);
int ksmbd_vfs_remove_xattr(struct mnt_idmap *idmap,
- const struct path *path, char *attr_name);
+ const struct path *path, char *attr_name,
+ bool get_write);
int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
unsigned int flags, struct path *parent_path,
struct path *path, bool caseless);
--- a/fs/smb/server/vfs_cache.c
+++ b/fs/smb/server/vfs_cache.c
@@ -254,7 +254,8 @@ static void __ksmbd_inode_close(struct k
ci->m_flags &= ~S_DEL_ON_CLS_STREAM;
err = ksmbd_vfs_remove_xattr(file_mnt_idmap(filp),
&filp->f_path,
- fp->stream.name);
+ fp->stream.name,
+ true);
if (err)
pr_err("remove xattr failed : %s\n",
fp->stream.name);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 175/267] x86/boot: Dont add the EFI stub to targets, again
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (173 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 174/267] ksmbd: fix missing use of get_write in in smb2_set_ea() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 176/267] iio: adc: ad9467: fix scan type sign Greg Kroah-Hartman
` (103 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Ben Segall, Borislav Petkov (AMD),
Ard Biesheuvel
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Segall <bsegall@google.com>
commit b2747f108b8034271fd5289bd8f3a7003e0775a3 upstream.
This is a re-commit of
da05b143a308 ("x86/boot: Don't add the EFI stub to targets")
after the tagged patch incorrectly reverted it.
vmlinux-objs-y is added to targets, with an assumption that they are all
relative to $(obj); adding a $(objtree)/drivers/... path causes the
build to incorrectly create a useless
arch/x86/boot/compressed/drivers/... directory tree.
Fix this just by using a different make variable for the EFI stub.
Fixes: cb8bda8ad443 ("x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S")
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org # v6.1+
Link: https://lore.kernel.org/r/xm267ceukksz.fsf@bsegall.svl.corp.google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/boot/compressed/Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -116,9 +116,9 @@ vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY)
vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_mixed.o
-vmlinux-objs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
+vmlinux-libs-$(CONFIG_EFI_STUB) += $(objtree)/drivers/firmware/efi/libstub/lib.a
-$(obj)/vmlinux: $(vmlinux-objs-y) FORCE
+$(obj)/vmlinux: $(vmlinux-objs-y) $(vmlinux-libs-y) FORCE
$(call if_changed,ld)
OBJCOPYFLAGS_vmlinux.bin := -R .comment -S
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 176/267] iio: adc: ad9467: fix scan type sign
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (174 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 175/267] x86/boot: Dont add the EFI stub to targets, again Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 177/267] iio: dac: ad5592r: fix temperature channel scaling value Greg Kroah-Hartman
` (102 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, David Lechner, Stable,
Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner <dlechner@baylibre.com>
commit 8a01ef749b0a632f0e1f4ead0f08b3310d99fcb1 upstream.
According to the IIO documentation, the sign in the scan type should be
lower case. The ad9467 driver was incorrectly using upper case.
Fix by changing to lower case.
Fixes: 4606d0f4b05f ("iio: adc: ad9467: add support for AD9434 high-speed ADC")
Fixes: ad6797120238 ("iio: adc: ad9467: add support AD9467 ADC")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Link: https://lore.kernel.org/r/20240503-ad9467-fix-scan-type-sign-v1-1-c7a1a066ebb9@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/adc/ad9467.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/iio/adc/ad9467.c
+++ b/drivers/iio/adc/ad9467.c
@@ -225,11 +225,11 @@ static void __ad9467_get_scale(struct ad
}
static const struct iio_chan_spec ad9434_channels[] = {
- AD9467_CHAN(0, 0, 12, 'S'),
+ AD9467_CHAN(0, 0, 12, 's'),
};
static const struct iio_chan_spec ad9467_channels[] = {
- AD9467_CHAN(0, 0, 16, 'S'),
+ AD9467_CHAN(0, 0, 16, 's'),
};
static const struct ad9467_chip_info ad9467_chip_tbl = {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 177/267] iio: dac: ad5592r: fix temperature channel scaling value
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (175 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 176/267] iio: adc: ad9467: fix scan type sign Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 178/267] iio: invensense: fix odr switching to same value Greg Kroah-Hartman
` (101 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Marc Ferland, Stable,
Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Ferland <marc.ferland@sonatest.com>
commit 279428df888319bf68f2686934897301a250bb84 upstream.
The scale value for the temperature channel is (assuming Vref=2.5 and
the datasheet):
376.7897513
When calculating both val and val2 for the temperature scale we
use (3767897513/25) and multiply it by Vref (here I assume 2500mV) to
obtain:
2500 * (3767897513/25) ==> 376789751300
Finally we divide with remainder by 10^9 to get:
val = 376
val2 = 789751300
However, we return IIO_VAL_INT_PLUS_MICRO (should have been NANO) as
the scale type. So when converting the raw temperature value to the
'processed' temperature value we will get (assuming raw=810,
offset=-753):
processed = (raw + offset) * scale_val
= (810 + -753) * 376
= 21432
processed += div((raw + offset) * scale_val2, 10^6)
+= div((810 + -753) * 789751300, 10^6)
+= 45015
==> 66447
==> 66.4 Celcius
instead of the expected 21.5 Celsius.
Fix this issue by changing IIO_VAL_INT_PLUS_MICRO to
IIO_VAL_INT_PLUS_NANO.
Fixes: 56ca9db862bf ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs")
Signed-off-by: Marc Ferland <marc.ferland@sonatest.com>
Link: https://lore.kernel.org/r/20240501150554.1871390-1-marc.ferland@sonatest.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/dac/ad5592r-base.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/dac/ad5592r-base.c
+++ b/drivers/iio/dac/ad5592r-base.c
@@ -415,7 +415,7 @@ static int ad5592r_read_raw(struct iio_d
s64 tmp = *val * (3767897513LL / 25LL);
*val = div_s64_rem(tmp, 1000000000LL, val2);
- return IIO_VAL_INT_PLUS_MICRO;
+ return IIO_VAL_INT_PLUS_NANO;
}
mutex_lock(&st->lock);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 178/267] iio: invensense: fix odr switching to same value
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (176 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 177/267] iio: dac: ad5592r: fix temperature channel scaling value Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 179/267] iio: imu: inv_icm42600: delete unneeded update watermark call Greg Kroah-Hartman
` (100 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jean-Baptiste Maneyrol,
Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
commit 95444b9eeb8c5c0330563931d70c61ca3b101548 upstream.
ODR switching happens in 2 steps, update to store the new value and then
apply when the ODR change flag is received in the data. When switching to
the same ODR value, the ODR change flag is never happening, and frequency
switching is blocked waiting for the never coming apply.
Fix the issue by preventing update to happen when switching to same ODR
value.
Fixes: 0ecc363ccea7 ("iio: make invensense timestamp module generic")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://lore.kernel.org/r/20240524124851.567485-1-inv.git-commit@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/common/inv_sensors/inv_sensors_timestamp.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
index fa205f17bd90..f44458c380d9 100644
--- a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
+++ b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
@@ -60,11 +60,15 @@ EXPORT_SYMBOL_NS_GPL(inv_sensors_timestamp_init, IIO_INV_SENSORS_TIMESTAMP);
int inv_sensors_timestamp_update_odr(struct inv_sensors_timestamp *ts,
uint32_t period, bool fifo)
{
+ uint32_t mult;
+
/* when FIFO is on, prevent odr change if one is already pending */
if (fifo && ts->new_mult != 0)
return -EAGAIN;
- ts->new_mult = period / ts->chip.clock_period;
+ mult = period / ts->chip.clock_period;
+ if (mult != ts->mult)
+ ts->new_mult = mult;
return 0;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 179/267] iio: imu: inv_icm42600: delete unneeded update watermark call
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (177 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 178/267] iio: invensense: fix odr switching to same value Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 180/267] drivers: core: synchronize really_probe() and dev_uevent() Greg Kroah-Hartman
` (99 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jean-Baptiste Maneyrol,
Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
commit 245f3b149e6cc3ac6ee612cdb7042263bfc9e73c upstream.
Update watermark will be done inside the hwfifo_set_watermark callback
just after the update_scan_mode. It is useless to do it here.
Fixes: 7f85e42a6c54 ("iio: imu: inv_icm42600: add buffer support in iio devices")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://lore.kernel.org/r/20240527210008.612932-1-inv.git-commit@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c | 4 ----
drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c | 4 ----
2 files changed, 8 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_accel.c
@@ -129,10 +129,6 @@ static int inv_icm42600_accel_update_sca
/* update data FIFO write */
inv_sensors_timestamp_apply_odr(ts, 0, 0, 0);
ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en);
- if (ret)
- goto out_unlock;
-
- ret = inv_icm42600_buffer_update_watermark(st);
out_unlock:
mutex_unlock(&st->lock);
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_gyro.c
@@ -129,10 +129,6 @@ static int inv_icm42600_gyro_update_scan
/* update data FIFO write */
inv_sensors_timestamp_apply_odr(ts, 0, 0, 0);
ret = inv_icm42600_buffer_set_fifo_en(st, fifo_en | st->fifo.en);
- if (ret)
- goto out_unlock;
-
- ret = inv_icm42600_buffer_update_watermark(st);
out_unlock:
mutex_unlock(&st->lock);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 180/267] drivers: core: synchronize really_probe() and dev_uevent()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (178 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 179/267] iio: imu: inv_icm42600: delete unneeded update watermark call Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 181/267] parisc: Try to fix random segmentation faults in package builds Greg Kroah-Hartman
` (98 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, stable, syzbot+ffa8143439596313a85a,
Ashish Sangwan, Namjae Jeon, Dirk Behme
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dirk Behme <dirk.behme@de.bosch.com>
commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream.
Synchronize the dev->driver usage in really_probe() and dev_uevent().
These can run in different threads, what can result in the following
race condition for dev->driver uninitialization:
Thread #1:
==========
really_probe() {
...
probe_failed:
...
device_unbind_cleanup(dev) {
...
dev->driver = NULL; // <= Failed probe sets dev->driver to NULL
...
}
...
}
Thread #2:
==========
dev_uevent() {
...
if (dev->driver)
// If dev->driver is NULLed from really_probe() from here on,
// after above check, the system crashes
add_uevent_var(env, "DRIVER=%s", dev->driver->name);
...
}
really_probe() holds the lock, already. So nothing needs to be done
there. dev_uevent() is called with lock held, often, too. But not
always. What implies that we can't add any locking in dev_uevent()
itself. So fix this race by adding the lock to the non-protected
path. This is the path where above race is observed:
dev_uevent+0x235/0x380
uevent_show+0x10c/0x1f0 <= Add lock here
dev_attr_show+0x3a/0xa0
sysfs_kf_seq_show+0x17c/0x250
kernfs_seq_show+0x7c/0x90
seq_read_iter+0x2d7/0x940
kernfs_fop_read_iter+0xc6/0x310
vfs_read+0x5bc/0x6b0
ksys_read+0xeb/0x1b0
__x64_sys_read+0x42/0x50
x64_sys_call+0x27ad/0x2d30
do_syscall_64+0xcd/0x1d0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Similar cases are reported by syzkaller in
https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a
But these are regarding the *initialization* of dev->driver
dev->driver = drv;
As this switches dev->driver to non-NULL these reports can be considered
to be false-positives (which should be "fixed" by this commit, as well,
though).
The same issue was reported and tried to be fixed back in 2015 in
https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/
already.
Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class")
Cc: stable <stable@kernel.org>
Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com
Cc: Ashish Sangwan <a.sangwan@samsung.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/base/core.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2664,8 +2664,11 @@ static ssize_t uevent_show(struct device
if (!env)
return -ENOMEM;
+ /* Synchronize with really_probe() */
+ device_lock(dev);
/* let the kset specific function add its keys */
retval = kset->uevent_ops->uevent(&dev->kobj, env);
+ device_unlock(dev);
if (retval)
goto out;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 181/267] parisc: Try to fix random segmentation faults in package builds
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (179 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 180/267] drivers: core: synchronize really_probe() and dev_uevent() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 182/267] ACPI: x86: Force StorageD3Enable on more products Greg Kroah-Hartman
` (97 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, John David Anglin, Helge Deller
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: John David Anglin <dave@parisc-linux.org>
commit 72d95924ee35c8cd16ef52f912483ee938a34d49 upstream.
PA-RISC systems with PA8800 and PA8900 processors have had problems
with random segmentation faults for many years. Systems with earlier
processors are much more stable.
Systems with PA8800 and PA8900 processors have a large L2 cache which
needs per page flushing for decent performance when a large range is
flushed. The combined cache in these systems is also more sensitive to
non-equivalent aliases than the caches in earlier systems.
The majority of random segmentation faults that I have looked at
appear to be memory corruption in memory allocated using mmap and
malloc.
My first attempt at fixing the random faults didn't work. On
reviewing the cache code, I realized that there were two issues
which the existing code didn't handle correctly. Both relate
to cache move-in. Another issue is that the present bit in PTEs
is racy.
1) PA-RISC caches have a mind of their own and they can speculatively
load data and instructions for a page as long as there is a entry in
the TLB for the page which allows move-in. TLBs are local to each
CPU. Thus, the TLB entry for a page must be purged before flushing
the page. This is particularly important on SMP systems.
In some of the flush routines, the flush routine would be called
and then the TLB entry would be purged. This was because the flush
routine needed the TLB entry to do the flush.
2) My initial approach to trying the fix the random faults was to
try and use flush_cache_page_if_present for all flush operations.
This actually made things worse and led to a couple of hardware
lockups. It finally dawned on me that some lines weren't being
flushed because the pte check code was racy. This resulted in
random inequivalent mappings to physical pages.
The __flush_cache_page tmpalias flush sets up its own TLB entry
and it doesn't need the existing TLB entry. As long as we can find
the pte pointer for the vm page, we can get the pfn and physical
address of the page. We can also purge the TLB entry for the page
before doing the flush. Further, __flush_cache_page uses a special
TLB entry that inhibits cache move-in.
When switching page mappings, we need to ensure that lines are
removed from the cache. It is not sufficient to just flush the
lines to memory as they may come back.
This made it clear that we needed to implement all the required
flush operations using tmpalias routines. This includes flushes
for user and kernel pages.
After modifying the code to use tmpalias flushes, it became clear
that the random segmentation faults were not fully resolved. The
frequency of faults was worse on systems with a 64 MB L2 (PA8900)
and systems with more CPUs (rp4440).
The warning that I added to flush_cache_page_if_present to detect
pages that couldn't be flushed triggered frequently on some systems.
Helge and I looked at the pages that couldn't be flushed and found
that the PTE was either cleared or for a swap page. Ignoring pages
that were swapped out seemed okay but pages with cleared PTEs seemed
problematic.
I looked at routines related to pte_clear and noticed ptep_clear_flush.
The default implementation just flushes the TLB entry. However, it was
obvious that on parisc we need to flush the cache page as well. If
we don't flush the cache page, stale lines will be left in the cache
and cause random corruption. Once a PTE is cleared, there is no way
to find the physical address associated with the PTE and flush the
associated page at a later time.
I implemented an updated change with a parisc specific version of
ptep_clear_flush. It fixed the random data corruption on Helge's rp4440
and rp3440, as well as on my c8000.
At this point, I realized that I could restore the code where we only
flush in flush_cache_page_if_present if the page has been accessed.
However, for this, we also need to flush the cache when the accessed
bit is cleared in ptep_clear_flush_young to keep things synchronized.
The default implementation only flushes the TLB entry.
Other changes in this version are:
1) Implement parisc specific version of ptep_get. It's identical to
default but needed in arch/parisc/include/asm/pgtable.h.
2) Revise parisc implementation of ptep_test_and_clear_young to use
ptep_get (READ_ONCE).
3) Drop parisc implementation of ptep_get_and_clear. We can use default.
4) Revise flush_kernel_vmap_range and invalidate_kernel_vmap_range to
use full data cache flush.
5) Move flush_cache_vmap and flush_cache_vunmap to cache.c. Handle
VM_IOREMAP case in flush_cache_vmap.
At this time, I don't know whether it is better to always flush when
the PTE present bit is set or when both the accessed and present bits
are set. The later saves flushing pages that haven't been accessed,
but we need to flush in ptep_clear_flush_young. It also needs a page
table lookup to find the PTE pointer. The lpa instruction only needs
a page table lookup when the PTE entry isn't in the TLB.
We don't atomically handle setting and clearing the _PAGE_ACCESSED bit.
If we miss an update, we may miss a flush and the cache may get corrupted.
Whether the current code is effectively atomic depends on process control.
When CONFIG_FLUSH_PAGE_ACCESSED is set to zero, the page will eventually
be flushed when the PTE is cleared or in flush_cache_page_if_present. The
_PAGE_ACCESSED bit is not used, so the problem is avoided.
The flush method can be selected using the CONFIG_FLUSH_PAGE_ACCESSED
define in cache.c. The default is 0. I didn't see a large difference
in performance.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/parisc/include/asm/cacheflush.h | 15 -
arch/parisc/include/asm/pgtable.h | 27 +-
arch/parisc/kernel/cache.c | 411 +++++++++++++++++++++--------------
3 files changed, 274 insertions(+), 179 deletions(-)
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -31,18 +31,17 @@ void flush_cache_all_local(void);
void flush_cache_all(void);
void flush_cache_mm(struct mm_struct *mm);
-void flush_kernel_dcache_page_addr(const void *addr);
-
#define flush_kernel_dcache_range(start,size) \
flush_kernel_dcache_range_asm((start), (start)+(size));
+/* The only way to flush a vmap range is to flush whole cache */
#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
void flush_kernel_vmap_range(void *vaddr, int size);
void invalidate_kernel_vmap_range(void *vaddr, int size);
-#define flush_cache_vmap(start, end) flush_cache_all()
+void flush_cache_vmap(unsigned long start, unsigned long end);
#define flush_cache_vmap_early(start, end) do { } while (0)
-#define flush_cache_vunmap(start, end) flush_cache_all()
+void flush_cache_vunmap(unsigned long start, unsigned long end);
void flush_dcache_folio(struct folio *folio);
#define flush_dcache_folio flush_dcache_folio
@@ -77,17 +76,11 @@ void flush_cache_page(struct vm_area_str
void flush_cache_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
-/* defined in pacache.S exported in cache.c used by flush_anon_page */
-void flush_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);
-
#define ARCH_HAS_FLUSH_ANON_PAGE
void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr);
#define ARCH_HAS_FLUSH_ON_KUNMAP
-static inline void kunmap_flush_on_unmap(const void *addr)
-{
- flush_kernel_dcache_page_addr(addr);
-}
+void kunmap_flush_on_unmap(const void *addr);
#endif /* _PARISC_CACHEFLUSH_H */
--- a/arch/parisc/include/asm/pgtable.h
+++ b/arch/parisc/include/asm/pgtable.h
@@ -448,14 +448,17 @@ static inline pte_t pte_swp_clear_exclus
return pte;
}
+static inline pte_t ptep_get(pte_t *ptep)
+{
+ return READ_ONCE(*ptep);
+}
+#define ptep_get ptep_get
+
static inline int ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
{
pte_t pte;
- if (!pte_young(*ptep))
- return 0;
-
- pte = *ptep;
+ pte = ptep_get(ptep);
if (!pte_young(pte)) {
return 0;
}
@@ -463,17 +466,10 @@ static inline int ptep_test_and_clear_yo
return 1;
}
-struct mm_struct;
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
-{
- pte_t old_pte;
-
- old_pte = *ptep;
- set_pte(ptep, __pte(0));
-
- return old_pte;
-}
+int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);
+pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep);
+struct mm_struct;
static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
set_pte(ptep, pte_wrprotect(*ptep));
@@ -511,7 +507,8 @@ static inline void ptep_set_wrprotect(st
#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+#define __HAVE_ARCH_PTEP_CLEAR_FLUSH
#define __HAVE_ARCH_PTEP_SET_WRPROTECT
#define __HAVE_ARCH_PTE_SAME
--- a/arch/parisc/kernel/cache.c
+++ b/arch/parisc/kernel/cache.c
@@ -20,6 +20,7 @@
#include <linux/sched.h>
#include <linux/sched/mm.h>
#include <linux/syscalls.h>
+#include <linux/vmalloc.h>
#include <asm/pdc.h>
#include <asm/cache.h>
#include <asm/cacheflush.h>
@@ -31,20 +32,31 @@
#include <asm/mmu_context.h>
#include <asm/cachectl.h>
+#define PTR_PAGE_ALIGN_DOWN(addr) PTR_ALIGN_DOWN(addr, PAGE_SIZE)
+
+/*
+ * When nonzero, use _PAGE_ACCESSED bit to try to reduce the number
+ * of page flushes done flush_cache_page_if_present. There are some
+ * pros and cons in using this option. It may increase the risk of
+ * random segmentation faults.
+ */
+#define CONFIG_FLUSH_PAGE_ACCESSED 0
+
int split_tlb __ro_after_init;
int dcache_stride __ro_after_init;
int icache_stride __ro_after_init;
EXPORT_SYMBOL(dcache_stride);
+/* Internal implementation in arch/parisc/kernel/pacache.S */
void flush_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);
EXPORT_SYMBOL(flush_dcache_page_asm);
void purge_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);
void flush_icache_page_asm(unsigned long phys_addr, unsigned long vaddr);
-
-/* Internal implementation in arch/parisc/kernel/pacache.S */
void flush_data_cache_local(void *); /* flushes local data-cache only */
void flush_instruction_cache_local(void); /* flushes local code-cache only */
+static void flush_kernel_dcache_page_addr(const void *addr);
+
/* On some machines (i.e., ones with the Merced bus), there can be
* only a single PxTLB broadcast at a time; this must be guaranteed
* by software. We need a spinlock around all TLB flushes to ensure
@@ -317,6 +329,18 @@ __flush_cache_page(struct vm_area_struct
{
if (!static_branch_likely(&parisc_has_cache))
return;
+
+ /*
+ * The TLB is the engine of coherence on parisc. The CPU is
+ * entitled to speculate any page with a TLB mapping, so here
+ * we kill the mapping then flush the page along a special flush
+ * only alias mapping. This guarantees that the page is no-longer
+ * in the cache for any process and nor may it be speculatively
+ * read in (until the user or kernel specifically accesses it,
+ * of course).
+ */
+ flush_tlb_page(vma, vmaddr);
+
preempt_disable();
flush_dcache_page_asm(physaddr, vmaddr);
if (vma->vm_flags & VM_EXEC)
@@ -324,46 +348,44 @@ __flush_cache_page(struct vm_area_struct
preempt_enable();
}
-static void flush_user_cache_page(struct vm_area_struct *vma, unsigned long vmaddr)
+static void flush_kernel_dcache_page_addr(const void *addr)
{
- unsigned long flags, space, pgd, prot;
-#ifdef CONFIG_TLB_PTLOCK
- unsigned long pgd_lock;
-#endif
+ unsigned long vaddr = (unsigned long)addr;
+ unsigned long flags;
- vmaddr &= PAGE_MASK;
+ /* Purge TLB entry to remove translation on all CPUs */
+ purge_tlb_start(flags);
+ pdtlb(SR_KERNEL, addr);
+ purge_tlb_end(flags);
+ /* Use tmpalias flush to prevent data cache move-in */
preempt_disable();
+ flush_dcache_page_asm(__pa(vaddr), vaddr);
+ preempt_enable();
+}
- /* Set context for flush */
- local_irq_save(flags);
- prot = mfctl(8);
- space = mfsp(SR_USER);
- pgd = mfctl(25);
-#ifdef CONFIG_TLB_PTLOCK
- pgd_lock = mfctl(28);
-#endif
- switch_mm_irqs_off(NULL, vma->vm_mm, NULL);
- local_irq_restore(flags);
-
- flush_user_dcache_range_asm(vmaddr, vmaddr + PAGE_SIZE);
- if (vma->vm_flags & VM_EXEC)
- flush_user_icache_range_asm(vmaddr, vmaddr + PAGE_SIZE);
- flush_tlb_page(vma, vmaddr);
+static void flush_kernel_icache_page_addr(const void *addr)
+{
+ unsigned long vaddr = (unsigned long)addr;
+ unsigned long flags;
- /* Restore previous context */
- local_irq_save(flags);
-#ifdef CONFIG_TLB_PTLOCK
- mtctl(pgd_lock, 28);
-#endif
- mtctl(pgd, 25);
- mtsp(space, SR_USER);
- mtctl(prot, 8);
- local_irq_restore(flags);
+ /* Purge TLB entry to remove translation on all CPUs */
+ purge_tlb_start(flags);
+ pdtlb(SR_KERNEL, addr);
+ purge_tlb_end(flags);
+ /* Use tmpalias flush to prevent instruction cache move-in */
+ preempt_disable();
+ flush_icache_page_asm(__pa(vaddr), vaddr);
preempt_enable();
}
+void kunmap_flush_on_unmap(const void *addr)
+{
+ flush_kernel_dcache_page_addr(addr);
+}
+EXPORT_SYMBOL(kunmap_flush_on_unmap);
+
void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
unsigned int nr)
{
@@ -371,13 +393,16 @@ void flush_icache_pages(struct vm_area_s
for (;;) {
flush_kernel_dcache_page_addr(kaddr);
- flush_kernel_icache_page(kaddr);
+ flush_kernel_icache_page_addr(kaddr);
if (--nr == 0)
break;
kaddr += PAGE_SIZE;
}
}
+/*
+ * Walk page directory for MM to find PTEP pointer for address ADDR.
+ */
static inline pte_t *get_ptep(struct mm_struct *mm, unsigned long addr)
{
pte_t *ptep = NULL;
@@ -406,6 +431,41 @@ static inline bool pte_needs_flush(pte_t
== (_PAGE_PRESENT | _PAGE_ACCESSED);
}
+/*
+ * Return user physical address. Returns 0 if page is not present.
+ */
+static inline unsigned long get_upa(struct mm_struct *mm, unsigned long addr)
+{
+ unsigned long flags, space, pgd, prot, pa;
+#ifdef CONFIG_TLB_PTLOCK
+ unsigned long pgd_lock;
+#endif
+
+ /* Save context */
+ local_irq_save(flags);
+ prot = mfctl(8);
+ space = mfsp(SR_USER);
+ pgd = mfctl(25);
+#ifdef CONFIG_TLB_PTLOCK
+ pgd_lock = mfctl(28);
+#endif
+
+ /* Set context for lpa_user */
+ switch_mm_irqs_off(NULL, mm, NULL);
+ pa = lpa_user(addr);
+
+ /* Restore previous context */
+#ifdef CONFIG_TLB_PTLOCK
+ mtctl(pgd_lock, 28);
+#endif
+ mtctl(pgd, 25);
+ mtsp(space, SR_USER);
+ mtctl(prot, 8);
+ local_irq_restore(flags);
+
+ return pa;
+}
+
void flush_dcache_folio(struct folio *folio)
{
struct address_space *mapping = folio_flush_mapping(folio);
@@ -454,50 +514,23 @@ void flush_dcache_folio(struct folio *fo
if (addr + nr * PAGE_SIZE > vma->vm_end)
nr = (vma->vm_end - addr) / PAGE_SIZE;
- if (parisc_requires_coherency()) {
- for (i = 0; i < nr; i++) {
- pte_t *ptep = get_ptep(vma->vm_mm,
- addr + i * PAGE_SIZE);
- if (!ptep)
- continue;
- if (pte_needs_flush(*ptep))
- flush_user_cache_page(vma,
- addr + i * PAGE_SIZE);
- /* Optimise accesses to the same table? */
- pte_unmap(ptep);
- }
- } else {
+ if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
+ != (addr & (SHM_COLOUR - 1))) {
+ for (i = 0; i < nr; i++)
+ __flush_cache_page(vma,
+ addr + i * PAGE_SIZE,
+ (pfn + i) * PAGE_SIZE);
/*
- * The TLB is the engine of coherence on parisc:
- * The CPU is entitled to speculate any page
- * with a TLB mapping, so here we kill the
- * mapping then flush the page along a special
- * flush only alias mapping. This guarantees that
- * the page is no-longer in the cache for any
- * process and nor may it be speculatively read
- * in (until the user or kernel specifically
- * accesses it, of course)
+ * Software is allowed to have any number
+ * of private mappings to a page.
*/
- for (i = 0; i < nr; i++)
- flush_tlb_page(vma, addr + i * PAGE_SIZE);
- if (old_addr == 0 || (old_addr & (SHM_COLOUR - 1))
- != (addr & (SHM_COLOUR - 1))) {
- for (i = 0; i < nr; i++)
- __flush_cache_page(vma,
- addr + i * PAGE_SIZE,
- (pfn + i) * PAGE_SIZE);
- /*
- * Software is allowed to have any number
- * of private mappings to a page.
- */
- if (!(vma->vm_flags & VM_SHARED))
- continue;
- if (old_addr)
- pr_err("INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n",
- old_addr, addr, vma->vm_file);
- if (nr == folio_nr_pages(folio))
- old_addr = addr;
- }
+ if (!(vma->vm_flags & VM_SHARED))
+ continue;
+ if (old_addr)
+ pr_err("INEQUIVALENT ALIASES 0x%lx and 0x%lx in file %pD\n",
+ old_addr, addr, vma->vm_file);
+ if (nr == folio_nr_pages(folio))
+ old_addr = addr;
}
WARN_ON(++count == 4096);
}
@@ -587,35 +620,28 @@ extern void purge_kernel_dcache_page_asm
extern void clear_user_page_asm(void *, unsigned long);
extern void copy_user_page_asm(void *, void *, unsigned long);
-void flush_kernel_dcache_page_addr(const void *addr)
-{
- unsigned long flags;
-
- flush_kernel_dcache_page_asm(addr);
- purge_tlb_start(flags);
- pdtlb(SR_KERNEL, addr);
- purge_tlb_end(flags);
-}
-EXPORT_SYMBOL(flush_kernel_dcache_page_addr);
-
static void flush_cache_page_if_present(struct vm_area_struct *vma,
- unsigned long vmaddr, unsigned long pfn)
+ unsigned long vmaddr)
{
+#if CONFIG_FLUSH_PAGE_ACCESSED
bool needs_flush = false;
- pte_t *ptep;
+ pte_t *ptep, pte;
- /*
- * The pte check is racy and sometimes the flush will trigger
- * a non-access TLB miss. Hopefully, the page has already been
- * flushed.
- */
ptep = get_ptep(vma->vm_mm, vmaddr);
if (ptep) {
- needs_flush = pte_needs_flush(*ptep);
+ pte = ptep_get(ptep);
+ needs_flush = pte_needs_flush(pte);
pte_unmap(ptep);
}
if (needs_flush)
- flush_cache_page(vma, vmaddr, pfn);
+ __flush_cache_page(vma, vmaddr, PFN_PHYS(pte_pfn(pte)));
+#else
+ struct mm_struct *mm = vma->vm_mm;
+ unsigned long physaddr = get_upa(mm, vmaddr);
+
+ if (physaddr)
+ __flush_cache_page(vma, vmaddr, PAGE_ALIGN_DOWN(physaddr));
+#endif
}
void copy_user_highpage(struct page *to, struct page *from,
@@ -625,7 +651,7 @@ void copy_user_highpage(struct page *to,
kfrom = kmap_local_page(from);
kto = kmap_local_page(to);
- flush_cache_page_if_present(vma, vaddr, page_to_pfn(from));
+ __flush_cache_page(vma, vaddr, PFN_PHYS(page_to_pfn(from)));
copy_page_asm(kto, kfrom);
kunmap_local(kto);
kunmap_local(kfrom);
@@ -634,16 +660,17 @@ void copy_user_highpage(struct page *to,
void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long user_vaddr, void *dst, void *src, int len)
{
- flush_cache_page_if_present(vma, user_vaddr, page_to_pfn(page));
+ __flush_cache_page(vma, user_vaddr, PFN_PHYS(page_to_pfn(page)));
memcpy(dst, src, len);
- flush_kernel_dcache_range_asm((unsigned long)dst, (unsigned long)dst + len);
+ flush_kernel_dcache_page_addr(PTR_PAGE_ALIGN_DOWN(dst));
}
void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long user_vaddr, void *dst, void *src, int len)
{
- flush_cache_page_if_present(vma, user_vaddr, page_to_pfn(page));
+ __flush_cache_page(vma, user_vaddr, PFN_PHYS(page_to_pfn(page)));
memcpy(dst, src, len);
+ flush_kernel_dcache_page_addr(PTR_PAGE_ALIGN_DOWN(src));
}
/* __flush_tlb_range()
@@ -677,32 +704,10 @@ int __flush_tlb_range(unsigned long sid,
static void flush_cache_pages(struct vm_area_struct *vma, unsigned long start, unsigned long end)
{
- unsigned long addr, pfn;
- pte_t *ptep;
+ unsigned long addr;
- for (addr = start; addr < end; addr += PAGE_SIZE) {
- bool needs_flush = false;
- /*
- * The vma can contain pages that aren't present. Although
- * the pte search is expensive, we need the pte to find the
- * page pfn and to check whether the page should be flushed.
- */
- ptep = get_ptep(vma->vm_mm, addr);
- if (ptep) {
- needs_flush = pte_needs_flush(*ptep);
- pfn = pte_pfn(*ptep);
- pte_unmap(ptep);
- }
- if (needs_flush) {
- if (parisc_requires_coherency()) {
- flush_user_cache_page(vma, addr);
- } else {
- if (WARN_ON(!pfn_valid(pfn)))
- return;
- __flush_cache_page(vma, addr, PFN_PHYS(pfn));
- }
- }
- }
+ for (addr = start; addr < end; addr += PAGE_SIZE)
+ flush_cache_page_if_present(vma, addr);
}
static inline unsigned long mm_total_size(struct mm_struct *mm)
@@ -753,21 +758,19 @@ void flush_cache_range(struct vm_area_st
if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled()))
return;
flush_tlb_range(vma, start, end);
- flush_cache_all();
+ if (vma->vm_flags & VM_EXEC)
+ flush_cache_all();
+ else
+ flush_data_cache();
return;
}
- flush_cache_pages(vma, start, end);
+ flush_cache_pages(vma, start & PAGE_MASK, end);
}
void flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long pfn)
{
- if (WARN_ON(!pfn_valid(pfn)))
- return;
- if (parisc_requires_coherency())
- flush_user_cache_page(vma, vmaddr);
- else
- __flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));
+ __flush_cache_page(vma, vmaddr, PFN_PHYS(pfn));
}
void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr)
@@ -775,34 +778,133 @@ void flush_anon_page(struct vm_area_stru
if (!PageAnon(page))
return;
- if (parisc_requires_coherency()) {
- if (vma->vm_flags & VM_SHARED)
- flush_data_cache();
- else
- flush_user_cache_page(vma, vmaddr);
+ __flush_cache_page(vma, vmaddr, PFN_PHYS(page_to_pfn(page)));
+}
+
+int ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long addr,
+ pte_t *ptep)
+{
+ pte_t pte = ptep_get(ptep);
+
+ if (!pte_young(pte))
+ return 0;
+ set_pte(ptep, pte_mkold(pte));
+#if CONFIG_FLUSH_PAGE_ACCESSED
+ __flush_cache_page(vma, addr, PFN_PHYS(pte_pfn(pte)));
+#endif
+ return 1;
+}
+
+/*
+ * After a PTE is cleared, we have no way to flush the cache for
+ * the physical page. On PA8800 and PA8900 processors, these lines
+ * can cause random cache corruption. Thus, we must flush the cache
+ * as well as the TLB when clearing a PTE that's valid.
+ */
+pte_t ptep_clear_flush(struct vm_area_struct *vma, unsigned long addr,
+ pte_t *ptep)
+{
+ struct mm_struct *mm = (vma)->vm_mm;
+ pte_t pte = ptep_get_and_clear(mm, addr, ptep);
+ unsigned long pfn = pte_pfn(pte);
+
+ if (pfn_valid(pfn))
+ __flush_cache_page(vma, addr, PFN_PHYS(pfn));
+ else if (pte_accessible(mm, pte))
+ flush_tlb_page(vma, addr);
+
+ return pte;
+}
+
+/*
+ * The physical address for pages in the ioremap case can be obtained
+ * from the vm_struct struct. I wasn't able to successfully handle the
+ * vmalloc and vmap cases. We have an array of struct page pointers in
+ * the uninitialized vmalloc case but the flush failed using page_to_pfn.
+ */
+void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+ unsigned long addr, physaddr;
+ struct vm_struct *vm;
+
+ /* Prevent cache move-in */
+ flush_tlb_kernel_range(start, end);
+
+ if (end - start >= parisc_cache_flush_threshold) {
+ flush_cache_all();
return;
}
- flush_tlb_page(vma, vmaddr);
- preempt_disable();
- flush_dcache_page_asm(page_to_phys(page), vmaddr);
- preempt_enable();
+ if (WARN_ON_ONCE(!is_vmalloc_addr((void *)start))) {
+ flush_cache_all();
+ return;
+ }
+
+ vm = find_vm_area((void *)start);
+ if (WARN_ON_ONCE(!vm)) {
+ flush_cache_all();
+ return;
+ }
+
+ /* The physical addresses of IOREMAP regions are contiguous */
+ if (vm->flags & VM_IOREMAP) {
+ physaddr = vm->phys_addr;
+ for (addr = start; addr < end; addr += PAGE_SIZE) {
+ preempt_disable();
+ flush_dcache_page_asm(physaddr, start);
+ flush_icache_page_asm(physaddr, start);
+ preempt_enable();
+ physaddr += PAGE_SIZE;
+ }
+ return;
+ }
+
+ flush_cache_all();
}
+EXPORT_SYMBOL(flush_cache_vmap);
+/*
+ * The vm_struct has been retired and the page table is set up. The
+ * last page in the range is a guard page. Its physical address can't
+ * be determined using lpa, so there is no way to flush the range
+ * using flush_dcache_page_asm.
+ */
+void flush_cache_vunmap(unsigned long start, unsigned long end)
+{
+ /* Prevent cache move-in */
+ flush_tlb_kernel_range(start, end);
+ flush_data_cache();
+}
+EXPORT_SYMBOL(flush_cache_vunmap);
+
+/*
+ * On systems with PA8800/PA8900 processors, there is no way to flush
+ * a vmap range other than using the architected loop to flush the
+ * entire cache. The page directory is not set up, so we can't use
+ * fdc, etc. FDCE/FICE don't work to flush a portion of the cache.
+ * L2 is physically indexed but FDCE/FICE instructions in virtual
+ * mode output their virtual address on the core bus, not their
+ * real address. As a result, the L2 cache index formed from the
+ * virtual address will most likely not be the same as the L2 index
+ * formed from the real address.
+ */
void flush_kernel_vmap_range(void *vaddr, int size)
{
unsigned long start = (unsigned long)vaddr;
unsigned long end = start + size;
- if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
- (unsigned long)size >= parisc_cache_flush_threshold) {
- flush_tlb_kernel_range(start, end);
- flush_data_cache();
+ flush_tlb_kernel_range(start, end);
+
+ if (!static_branch_likely(&parisc_has_dcache))
+ return;
+
+ /* If interrupts are disabled, we can only do local flush */
+ if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled())) {
+ flush_data_cache_local(NULL);
return;
}
- flush_kernel_dcache_range_asm(start, end);
- flush_tlb_kernel_range(start, end);
+ flush_data_cache();
}
EXPORT_SYMBOL(flush_kernel_vmap_range);
@@ -814,15 +916,18 @@ void invalidate_kernel_vmap_range(void *
/* Ensure DMA is complete */
asm_syncdma();
- if ((!IS_ENABLED(CONFIG_SMP) || !arch_irqs_disabled()) &&
- (unsigned long)size >= parisc_cache_flush_threshold) {
- flush_tlb_kernel_range(start, end);
- flush_data_cache();
+ flush_tlb_kernel_range(start, end);
+
+ if (!static_branch_likely(&parisc_has_dcache))
+ return;
+
+ /* If interrupts are disabled, we can only do local flush */
+ if (WARN_ON(IS_ENABLED(CONFIG_SMP) && arch_irqs_disabled())) {
+ flush_data_cache_local(NULL);
return;
}
- purge_kernel_dcache_range_asm(start, end);
- flush_tlb_kernel_range(start, end);
+ flush_data_cache();
}
EXPORT_SYMBOL(invalidate_kernel_vmap_range);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 182/267] ACPI: x86: Force StorageD3Enable on more products
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (180 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 181/267] parisc: Try to fix random segmentation faults in package builds Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 183/267] drm/exynos/vidi: fix memory leak in .get_modes() Greg Kroah-Hartman
` (96 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Mario Limonciello, Hans de Goede,
Rafael J. Wysocki
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello <mario.limonciello@amd.com>
commit e79a10652bbd320649da705ca1ea0c04351af403 upstream.
A Rembrandt-based HP thin client is reported to have problems where
the NVME disk isn't present after resume from s2idle.
This is because the NVME disk wasn't put into D3 at suspend, and
that happened because the StorageD3Enable _DSD was missing in the BIOS.
As AMD's architecture requires that the NVME is in D3 for s2idle, adjust
the criteria for force_storage_d3 to match *all* Zen SoCs when the FADT
advertises low power idle support.
This will ensure that any future products with this BIOS deficiency don't
need to be added to the allow list of overrides.
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/acpi/x86/utils.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
--- a/drivers/acpi/x86/utils.c
+++ b/drivers/acpi/x86/utils.c
@@ -198,16 +198,16 @@ bool acpi_device_override_status(struct
}
/*
- * AMD systems from Renoir and Lucienne *require* that the NVME controller
+ * AMD systems from Renoir onwards *require* that the NVME controller
* is put into D3 over a Modern Standby / suspend-to-idle cycle.
*
* This is "typically" accomplished using the `StorageD3Enable`
* property in the _DSD that is checked via the `acpi_storage_d3` function
- * but this property was introduced after many of these systems launched
- * and most OEM systems don't have it in their BIOS.
+ * but some OEM systems still don't have it in their BIOS.
*
* The Microsoft documentation for StorageD3Enable mentioned that Windows has
- * a hardcoded allowlist for D3 support, which was used for these platforms.
+ * a hardcoded allowlist for D3 support as well as a registry key to override
+ * the BIOS, which has been used for these cases.
*
* This allows quirking on Linux in a similar fashion.
*
@@ -220,19 +220,15 @@ bool acpi_device_override_status(struct
* https://bugzilla.kernel.org/show_bug.cgi?id=216773
* https://bugzilla.kernel.org/show_bug.cgi?id=217003
* 2) On at least one HP system StorageD3Enable is missing on the second NVME
- disk in the system.
+ * disk in the system.
+ * 3) On at least one HP Rembrandt system StorageD3Enable is missing on the only
+ * NVME device.
*/
-static const struct x86_cpu_id storage_d3_cpu_ids[] = {
- X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 24, NULL), /* Picasso */
- X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 96, NULL), /* Renoir */
- X86_MATCH_VENDOR_FAM_MODEL(AMD, 23, 104, NULL), /* Lucienne */
- X86_MATCH_VENDOR_FAM_MODEL(AMD, 25, 80, NULL), /* Cezanne */
- {}
-};
-
bool force_storage_d3(void)
{
- return x86_match_cpu(storage_d3_cpu_ids);
+ if (!cpu_feature_enabled(X86_FEATURE_ZEN))
+ return false;
+ return acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0;
}
/*
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 183/267] drm/exynos/vidi: fix memory leak in .get_modes()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (181 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 182/267] ACPI: x86: Force StorageD3Enable on more products Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 184/267] drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found Greg Kroah-Hartman
` (95 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Jani Nikula, Inki Dae
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula <jani.nikula@intel.com>
commit 38e3825631b1f314b21e3ade00b5a4d737eb054e upstream.
The duplicated EDID is never freed. Fix it.
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/exynos/exynos_drm_vidi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
@@ -309,6 +309,7 @@ static int vidi_get_modes(struct drm_con
struct vidi_context *ctx = ctx_from_connector(connector);
struct edid *edid;
int edid_len;
+ int count;
/*
* the edid data comes from user side and it would be set
@@ -328,7 +329,11 @@ static int vidi_get_modes(struct drm_con
drm_connector_update_edid_property(connector, edid);
- return drm_add_edid_modes(connector, edid);
+ count = drm_add_edid_modes(connector, edid);
+
+ kfree(edid);
+
+ return count;
}
static const struct drm_connector_helper_funcs vidi_connector_helper_funcs = {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 184/267] drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (182 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 183/267] drm/exynos/vidi: fix memory leak in .get_modes() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 185/267] mptcp: ensure snd_una is properly initialized on connect Greg Kroah-Hartman
` (94 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Marek Szyprowski, Inki Dae
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Szyprowski <m.szyprowski@samsung.com>
commit 799d4b392417ed6889030a5b2335ccb6dcf030ab upstream.
When reading EDID fails and driver reports no modes available, the DRM
core adds an artificial 1024x786 mode to the connector. Unfortunately
some variants of the Exynos HDMI (like the one in Exynos4 SoCs) are not
able to drive such mode, so report a safe 640x480 mode instead of nothing
in case of the EDID reading failure.
This fixes the following issue observed on Trats2 board since commit
13d5b040363c ("drm/exynos: do not return negative values from .get_modes()"):
[drm] Exynos DRM: using 11c00000.fimd device for DMA mapping operations
exynos-drm exynos-drm: bound 11c00000.fimd (ops fimd_component_ops)
exynos-drm exynos-drm: bound 12c10000.mixer (ops mixer_component_ops)
exynos-dsi 11c80000.dsi: [drm:samsung_dsim_host_attach] Attached s6e8aa0 device (lanes:4 bpp:24 mode-flags:0x10b)
exynos-drm exynos-drm: bound 11c80000.dsi (ops exynos_dsi_component_ops)
exynos-drm exynos-drm: bound 12d00000.hdmi (ops hdmi_component_ops)
[drm] Initialized exynos 1.1.0 20180330 for exynos-drm on minor 1
exynos-hdmi 12d00000.hdmi: [drm:hdmiphy_enable.part.0] *ERROR* PLL could not reach steady state
panel-samsung-s6e8aa0 11c80000.dsi.0: ID: 0xa2, 0x20, 0x8c
exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
------------[ cut here ]------------
WARNING: CPU: 1 PID: 11 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
[CRTC:70:crtc-1] vblank wait timed out
Modules linked in:
CPU: 1 PID: 11 Comm: kworker/u16:0 Not tainted 6.9.0-rc5-next-20240424 #14913
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_unbound deferred_probe_work_func
Call trace:
unwind_backtrace from show_stack+0x10/0x14
show_stack from dump_stack_lvl+0x68/0x88
dump_stack_lvl from __warn+0x7c/0x1c4
__warn from warn_slowpath_fmt+0x11c/0x1a8
warn_slowpath_fmt from drm_atomic_helper_wait_for_vblanks.part.0+0x2b0/0x2b8
drm_atomic_helper_wait_for_vblanks.part.0 from drm_atomic_helper_commit_tail_rpm+0x7c/0x8c
drm_atomic_helper_commit_tail_rpm from commit_tail+0x9c/0x184
commit_tail from drm_atomic_helper_commit+0x168/0x190
drm_atomic_helper_commit from drm_atomic_commit+0xb4/0xe0
drm_atomic_commit from drm_client_modeset_commit_atomic+0x23c/0x27c
drm_client_modeset_commit_atomic from drm_client_modeset_commit_locked+0x60/0x1cc
drm_client_modeset_commit_locked from drm_client_modeset_commit+0x24/0x40
drm_client_modeset_commit from __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0xc4
__drm_fb_helper_restore_fbdev_mode_unlocked from drm_fb_helper_set_par+0x2c/0x3c
drm_fb_helper_set_par from fbcon_init+0x3d8/0x550
fbcon_init from visual_init+0xc0/0x108
visual_init from do_bind_con_driver+0x1b8/0x3a4
do_bind_con_driver from do_take_over_console+0x140/0x1ec
do_take_over_console from do_fbcon_takeover+0x70/0xd0
do_fbcon_takeover from fbcon_fb_registered+0x19c/0x1ac
fbcon_fb_registered from register_framebuffer+0x190/0x21c
register_framebuffer from __drm_fb_helper_initial_config_and_unlock+0x350/0x574
__drm_fb_helper_initial_config_and_unlock from exynos_drm_fbdev_client_hotplug+0x6c/0xb0
exynos_drm_fbdev_client_hotplug from drm_client_register+0x58/0x94
drm_client_register from exynos_drm_bind+0x160/0x190
exynos_drm_bind from try_to_bring_up_aggregate_device+0x200/0x2d8
try_to_bring_up_aggregate_device from __component_add+0xb0/0x170
__component_add from mixer_probe+0x74/0xcc
mixer_probe from platform_probe+0x5c/0xb8
platform_probe from really_probe+0xe0/0x3d8
really_probe from __driver_probe_device+0x9c/0x1e4
__driver_probe_device from driver_probe_device+0x30/0xc0
driver_probe_device from __device_attach_driver+0xa8/0x120
__device_attach_driver from bus_for_each_drv+0x80/0xcc
bus_for_each_drv from __device_attach+0xac/0x1fc
__device_attach from bus_probe_device+0x8c/0x90
bus_probe_device from deferred_probe_work_func+0x98/0xe0
deferred_probe_work_func from process_one_work+0x240/0x6d0
process_one_work from worker_thread+0x1a0/0x3f4
worker_thread from kthread+0x104/0x138
kthread from ret_from_fork+0x14/0x28
Exception stack(0xf0895fb0 to 0xf0895ff8)
...
irq event stamp: 82357
hardirqs last enabled at (82363): [<c01a96e8>] vprintk_emit+0x308/0x33c
hardirqs last disabled at (82368): [<c01a969c>] vprintk_emit+0x2bc/0x33c
softirqs last enabled at (81614): [<c0101644>] __do_softirq+0x320/0x500
softirqs last disabled at (81609): [<c012dfe0>] __irq_exit_rcu+0x130/0x184
---[ end trace 0000000000000000 ]---
exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
exynos-drm exynos-drm: [drm] *ERROR* [CRTC:70:crtc-1] commit wait timed out
exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
exynos-drm exynos-drm: [drm] *ERROR* [CONNECTOR:74:HDMI-A-1] commit wait timed out
exynos-drm exynos-drm: [drm] *ERROR* flip_done timed out
exynos-drm exynos-drm: [drm] *ERROR* [PLANE:56:plane-5] commit wait timed out
exynos-mixer 12c10000.mixer: timeout waiting for VSYNC
Cc: stable@vger.kernel.org
Fixes: 13d5b040363c ("drm/exynos: do not return negative values from .get_modes()")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/exynos/exynos_hdmi.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/exynos/exynos_hdmi.c
+++ b/drivers/gpu/drm/exynos/exynos_hdmi.c
@@ -887,11 +887,11 @@ static int hdmi_get_modes(struct drm_con
int ret;
if (!hdata->ddc_adpt)
- return 0;
+ goto no_edid;
edid = drm_get_edid(connector, hdata->ddc_adpt);
if (!edid)
- return 0;
+ goto no_edid;
hdata->dvi_mode = !connector->display_info.is_hdmi;
DRM_DEV_DEBUG_KMS(hdata->dev, "%s : width[%d] x height[%d]\n",
@@ -906,6 +906,9 @@ static int hdmi_get_modes(struct drm_con
kfree(edid);
return ret;
+
+no_edid:
+ return drm_add_modes_noedid(connector, 640, 480);
}
static int hdmi_find_phy_conf(struct hdmi_context *hdata, u32 pixel_clock)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 185/267] mptcp: ensure snd_una is properly initialized on connect
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (183 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 184/267] drm/exynos: hdmi: report safe 640x480 mode as a fallback when no EDID found Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 186/267] mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID Greg Kroah-Hartman
` (93 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Mat Martineau, Christoph Paasch,
Paolo Abeni, Matthieu Baerts (NGI0), Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paolo Abeni <pabeni@redhat.com>
commit 8031b58c3a9b1db3ef68b3bd749fbee2e1e1aaa3 upstream.
This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt
is properly initialized on connect"). It turns out that syzkaller can
trigger the retransmit after fallback and before processing any other
incoming packet - so that snd_una is still left uninitialized.
Address the issue explicitly initializing snd_una together with snd_nxt
and write_seq.
Suggested-by: Mat Martineau <martineau@kernel.org>
Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/mptcp/protocol.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3706,6 +3706,7 @@ static int mptcp_connect(struct sock *sk
WRITE_ONCE(msk->write_seq, subflow->idsn);
WRITE_ONCE(msk->snd_nxt, subflow->idsn);
+ WRITE_ONCE(msk->snd_una, subflow->idsn);
if (likely(!__mptcp_check_fallback(msk)))
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_MPCAPABLEACTIVE);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 186/267] mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (184 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 185/267] mptcp: ensure snd_una is properly initialized on connect Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 187/267] mptcp: pm: update add_addr counters after connect Greg Kroah-Hartman
` (92 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthieu Baerts (NGI0), YonglongLi,
Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: YonglongLi <liyonglong@chinatelecom.cn>
commit 6a09788c1a66e3d8b04b3b3e7618cc817bb60ae9 upstream.
The RmAddr MIB counter is supposed to be incremented once when a valid
RM_ADDR has been received. Before this patch, it could have been
incremented as many times as the number of subflows connected to the
linked address ID, so it could have been 0, 1 or more than 1.
The "RmSubflow" is incremented after a local operation. In this case,
it is normal to tied it with the number of subflows that have been
actually removed.
The "remove invalid addresses" MP Join subtest has been modified to
validate this case. A broadcast IP address is now used instead: the
client will not be able to create a subflow to this address. The
consequence is that when receiving the RM_ADDR with the ID attached to
this broadcast IP address, no subflow linked to this ID will be found.
Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs")
Cc: stable@vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/mptcp/pm_netlink.c | 5 ++++-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 3 ++-
2 files changed, 6 insertions(+), 2 deletions(-)
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -822,10 +822,13 @@ static void mptcp_pm_nl_rm_addr_or_subfl
spin_lock_bh(&msk->pm.lock);
removed = true;
- __MPTCP_INC_STATS(sock_net(sk), rm_type);
+ if (rm_type == MPTCP_MIB_RMSUBFLOW)
+ __MPTCP_INC_STATS(sock_net(sk), rm_type);
}
if (rm_type == MPTCP_MIB_RMSUBFLOW)
__set_bit(rm_id ? rm_id : msk->mpc_endpoint_id, msk->pm.id_avail_bitmap);
+ else if (rm_type == MPTCP_MIB_RMADDR)
+ __MPTCP_INC_STATS(sock_net(sk), rm_type);
if (!removed)
continue;
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -2394,7 +2394,8 @@ remove_tests()
pm_nl_set_limits $ns1 3 3
pm_nl_add_endpoint $ns1 10.0.12.1 flags signal
pm_nl_add_endpoint $ns1 10.0.3.1 flags signal
- pm_nl_add_endpoint $ns1 10.0.14.1 flags signal
+ # broadcast IP: no packet for this address will be received on ns1
+ pm_nl_add_endpoint $ns1 224.0.0.1 flags signal
pm_nl_set_limits $ns2 3 3
addr_nr_ns1=-3 speed=10 \
run_tests $ns1 $ns2 10.0.1.1
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 187/267] mptcp: pm: update add_addr counters after connect
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (185 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 186/267] mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 188/267] clkdev: Update clkdev id usage to allow for longer names Greg Kroah-Hartman
` (91 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Matthieu Baerts (NGI0), YonglongLi,
Mat Martineau, Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: YonglongLi <liyonglong@chinatelecom.cn>
commit 40eec1795cc27b076d49236649a29507c7ed8c2d upstream.
The creation of new subflows can fail for different reasons. If no
subflow have been created using the received ADD_ADDR, the related
counters should not be updated, otherwise they will never be decremented
for events related to this ID later on.
For the moment, the number of accepted ADD_ADDR is only decremented upon
the reception of a related RM_ADDR, and only if the remote address ID is
currently being used by at least one subflow. In other words, if no
subflow can be created with the received address, the counter will not
be decremented. In this case, it is then important not to increment
pm.add_addr_accepted counter, and not to modify pm.accept_addr bit.
Note that this patch does not modify the behaviour in case of failures
later on, e.g. if the MP Join is dropped or rejected.
The "remove invalid addresses" MP Join subtest has been modified to
validate this case. The broadcast IP address is added before the "valid"
address that will be used to successfully create a subflow, and the
limit is decreased by one: without this patch, it was not possible to
create the last subflow, because:
- the broadcast address would have been accepted even if it was not
usable: the creation of a subflow to this address results in an error,
- the limit of 2 accepted ADD_ADDR would have then been reached.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/mptcp/pm_netlink.c | 16 ++++++++++------
tools/testing/selftests/net/mptcp/mptcp_join.sh | 4 ++--
2 files changed, 12 insertions(+), 8 deletions(-)
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -685,6 +685,7 @@ static void mptcp_pm_nl_add_addr_receive
unsigned int add_addr_accept_max;
struct mptcp_addr_info remote;
unsigned int subflows_max;
+ bool sf_created = false;
int i, nr;
add_addr_accept_max = mptcp_pm_get_add_addr_accept_max(msk);
@@ -712,15 +713,18 @@ static void mptcp_pm_nl_add_addr_receive
if (nr == 0)
return;
- msk->pm.add_addr_accepted++;
- if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
- msk->pm.subflows >= subflows_max)
- WRITE_ONCE(msk->pm.accept_addr, false);
-
spin_unlock_bh(&msk->pm.lock);
for (i = 0; i < nr; i++)
- __mptcp_subflow_connect(sk, &addrs[i], &remote);
+ if (__mptcp_subflow_connect(sk, &addrs[i], &remote) == 0)
+ sf_created = true;
spin_lock_bh(&msk->pm.lock);
+
+ if (sf_created) {
+ msk->pm.add_addr_accepted++;
+ if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
+ msk->pm.subflows >= subflows_max)
+ WRITE_ONCE(msk->pm.accept_addr, false);
+ }
}
void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -2393,10 +2393,10 @@ remove_tests()
if reset "remove invalid addresses"; then
pm_nl_set_limits $ns1 3 3
pm_nl_add_endpoint $ns1 10.0.12.1 flags signal
- pm_nl_add_endpoint $ns1 10.0.3.1 flags signal
# broadcast IP: no packet for this address will be received on ns1
pm_nl_add_endpoint $ns1 224.0.0.1 flags signal
- pm_nl_set_limits $ns2 3 3
+ pm_nl_add_endpoint $ns1 10.0.3.1 flags signal
+ pm_nl_set_limits $ns2 2 2
addr_nr_ns1=-3 speed=10 \
run_tests $ns1 $ns2 10.0.1.1
chk_join_nr 1 1 1
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 188/267] clkdev: Update clkdev id usage to allow for longer names
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (186 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 187/267] mptcp: pm: update add_addr counters after connect Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 189/267] irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() Greg Kroah-Hartman
` (90 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Russell King (Oracle),
Michael J. Ruhl, Andy Shevchenko, Stephen Boyd, Guenter Roeck
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael J. Ruhl <michael.j.ruhl@intel.com>
commit 99f4570cfba1e60daafde737cb7e395006d719e6 upstream.
clkdev DEV ID information is limited to an array of 20 bytes
(MAX_DEV_ID). It is possible that the ID could be longer than
that. If so, the lookup will fail because the "real ID" will
not match the copied value.
For instance, generating a device name for the I2C Designware
module using the PCI ID can result in a name of:
i2c_designware.39424
clkdev_create() will store:
i2c_designware.3942
The stored name is one off and will not match correctly during probe.
Increase the size of the ID to allow for a longer name.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://lore.kernel.org/r/20240223202556.2194021-1-michael.j.ruhl@intel.com
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/clk/clkdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/clkdev.c
+++ b/drivers/clk/clkdev.c
@@ -144,7 +144,7 @@ void clkdev_add_table(struct clk_lookup
mutex_unlock(&clocks_mutex);
}
-#define MAX_DEV_ID 20
+#define MAX_DEV_ID 24
#define MAX_CON_ID 16
struct clk_lookup_alloc {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 189/267] irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (187 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 188/267] clkdev: Update clkdev id usage to allow for longer names Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 190/267] x86/kexec: Fix bug with call depth tracking Greg Kroah-Hartman
` (89 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Marc Zyngier, Hagar Hemdan,
Thomas Gleixner
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hagar Hemdan <hagarhem@amazon.com>
commit b97e8a2f7130a4b30d1502003095833d16c028b3 upstream.
its_vlpi_prop_update() calls lpi_write_config() which obtains the
mapping information for a VLPI without lock held. So it could race
with its_vlpi_unmap().
Since all calls from its_irq_set_vcpu_affinity() require the same
lock to be held, hoist the locking there instead of sprinkling the
locking all over the place.
This bug was discovered using Coverity Static Analysis Security Testing
(SAST) by Synopsys, Inc.
[ tglx: Use guard() instead of goto ]
Fixes: 015ec0386ab6 ("irqchip/gic-v3-its: Add VLPI configuration handling")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Hagar Hemdan <hagarhem@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240531162144.28650-1-hagarhem@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/irqchip/irq-gic-v3-its.c | 44 ++++++++++-----------------------------
1 file changed, 12 insertions(+), 32 deletions(-)
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1840,28 +1840,22 @@ static int its_vlpi_map(struct irq_data
{
struct its_device *its_dev = irq_data_get_irq_chip_data(d);
u32 event = its_get_event_id(d);
- int ret = 0;
if (!info->map)
return -EINVAL;
- raw_spin_lock(&its_dev->event_map.vlpi_lock);
-
if (!its_dev->event_map.vm) {
struct its_vlpi_map *maps;
maps = kcalloc(its_dev->event_map.nr_lpis, sizeof(*maps),
GFP_ATOMIC);
- if (!maps) {
- ret = -ENOMEM;
- goto out;
- }
+ if (!maps)
+ return -ENOMEM;
its_dev->event_map.vm = info->map->vm;
its_dev->event_map.vlpi_maps = maps;
} else if (its_dev->event_map.vm != info->map->vm) {
- ret = -EINVAL;
- goto out;
+ return -EINVAL;
}
/* Get our private copy of the mapping information */
@@ -1893,46 +1887,32 @@ static int its_vlpi_map(struct irq_data
its_dev->event_map.nr_vlpis++;
}
-out:
- raw_spin_unlock(&its_dev->event_map.vlpi_lock);
- return ret;
+ return 0;
}
static int its_vlpi_get(struct irq_data *d, struct its_cmd_info *info)
{
struct its_device *its_dev = irq_data_get_irq_chip_data(d);
struct its_vlpi_map *map;
- int ret = 0;
-
- raw_spin_lock(&its_dev->event_map.vlpi_lock);
map = get_vlpi_map(d);
- if (!its_dev->event_map.vm || !map) {
- ret = -EINVAL;
- goto out;
- }
+ if (!its_dev->event_map.vm || !map)
+ return -EINVAL;
/* Copy our mapping information to the incoming request */
*info->map = *map;
-out:
- raw_spin_unlock(&its_dev->event_map.vlpi_lock);
- return ret;
+ return 0;
}
static int its_vlpi_unmap(struct irq_data *d)
{
struct its_device *its_dev = irq_data_get_irq_chip_data(d);
u32 event = its_get_event_id(d);
- int ret = 0;
- raw_spin_lock(&its_dev->event_map.vlpi_lock);
-
- if (!its_dev->event_map.vm || !irqd_is_forwarded_to_vcpu(d)) {
- ret = -EINVAL;
- goto out;
- }
+ if (!its_dev->event_map.vm || !irqd_is_forwarded_to_vcpu(d))
+ return -EINVAL;
/* Drop the virtual mapping */
its_send_discard(its_dev, event);
@@ -1956,9 +1936,7 @@ static int its_vlpi_unmap(struct irq_dat
kfree(its_dev->event_map.vlpi_maps);
}
-out:
- raw_spin_unlock(&its_dev->event_map.vlpi_lock);
- return ret;
+ return 0;
}
static int its_vlpi_prop_update(struct irq_data *d, struct its_cmd_info *info)
@@ -1986,6 +1964,8 @@ static int its_irq_set_vcpu_affinity(str
if (!is_v4(its_dev->its))
return -EINVAL;
+ guard(raw_spinlock_irq)(&its_dev->event_map.vlpi_lock);
+
/* Unmap request? */
if (!info)
return its_vlpi_unmap(d);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 190/267] x86/kexec: Fix bug with call depth tracking
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (188 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 189/267] irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 191/267] x86/amd_nb: Check for invalid SMN reads Greg Kroah-Hartman
` (88 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, David Kaplan, Borislav Petkov (AMD),
Tom Lendacky, stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Kaplan <david.kaplan@amd.com>
commit 93c1800b3799f17375989b0daf76497dd3e80922 upstream.
The call to cc_platform_has() triggers a fault and system crash if call depth
tracking is active because the GS segment has been reset by load_segments() and
GS_BASE is now 0 but call depth tracking uses per-CPU variables to operate.
Call cc_platform_has() earlier in the function when GS is still valid.
[ bp: Massage. ]
Fixes: 5d8213864ade ("x86/retbleed: Add SKL return thunk")
Signed-off-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20240603083036.637-1-bp@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kernel/machine_kexec_64.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -298,8 +298,15 @@ void machine_kexec_cleanup(struct kimage
void machine_kexec(struct kimage *image)
{
unsigned long page_list[PAGES_NR];
- void *control_page;
+ unsigned int host_mem_enc_active;
int save_ftrace_enabled;
+ void *control_page;
+
+ /*
+ * This must be done before load_segments() since if call depth tracking
+ * is used then GS must be valid to make any function calls.
+ */
+ host_mem_enc_active = cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT);
#ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
@@ -361,7 +368,7 @@ void machine_kexec(struct kimage *image)
(unsigned long)page_list,
image->start,
image->preserve_context,
- cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT));
+ host_mem_enc_active);
#ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 191/267] x86/amd_nb: Check for invalid SMN reads
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (189 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 190/267] x86/kexec: Fix bug with call depth tracking Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 192/267] perf/core: Fix missing wakeup when waiting for context reference Greg Kroah-Hartman
` (87 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Yazen Ghannam, Borislav Petkov (AMD)
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yazen Ghannam <yazen.ghannam@amd.com>
commit c625dabbf1c4a8e77e4734014f2fde7aa9071a1f upstream.
AMD Zen-based systems use a System Management Network (SMN) that
provides access to implementation-specific registers.
SMN accesses are done indirectly through an index/data pair in PCI
config space. The PCI config access may fail and return an error code.
This would prevent the "read" value from being updated.
However, the PCI config access may succeed, but the return value may be
invalid. This is in similar fashion to PCI bad reads, i.e. return all
bits set.
Most systems will return 0 for SMN addresses that are not accessible.
This is in line with AMD convention that unavailable registers are
Read-as-Zero/Writes-Ignored.
However, some systems will return a "PCI Error Response" instead. This
value, along with an error code of 0 from the PCI config access, will
confuse callers of the amd_smn_read() function.
Check for this condition, clear the return value, and set a proper error
code.
Fixes: ddfe43cdc0da ("x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230403164244.471141-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kernel/amd_nb.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -209,7 +209,14 @@ out:
int amd_smn_read(u16 node, u32 address, u32 *value)
{
- return __amd_smn_rw(node, address, value, false);
+ int err = __amd_smn_rw(node, address, value, false);
+
+ if (PCI_POSSIBLE_ERROR(*value)) {
+ err = -ENODEV;
+ *value = 0;
+ }
+
+ return err;
}
EXPORT_SYMBOL_GPL(amd_smn_read);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 192/267] perf/core: Fix missing wakeup when waiting for context reference
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (190 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 191/267] x86/amd_nb: Check for invalid SMN reads Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 193/267] perf auxtrace: Fix multiple use of --itrace option Greg Kroah-Hartman
` (86 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Haifeng Xu, Peter Zijlstra (Intel),
Frederic Weisbecker, Mark Rutland
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haifeng Xu <haifeng.xu@shopee.com>
commit 74751ef5c1912ebd3e65c3b65f45587e05ce5d36 upstream.
In our production environment, we found many hung tasks which are
blocked for more than 18 hours. Their call traces are like this:
[346278.191038] __schedule+0x2d8/0x890
[346278.191046] schedule+0x4e/0xb0
[346278.191049] perf_event_free_task+0x220/0x270
[346278.191056] ? init_wait_var_entry+0x50/0x50
[346278.191060] copy_process+0x663/0x18d0
[346278.191068] kernel_clone+0x9d/0x3d0
[346278.191072] __do_sys_clone+0x5d/0x80
[346278.191076] __x64_sys_clone+0x25/0x30
[346278.191079] do_syscall_64+0x5c/0xc0
[346278.191083] ? syscall_exit_to_user_mode+0x27/0x50
[346278.191086] ? do_syscall_64+0x69/0xc0
[346278.191088] ? irqentry_exit_to_user_mode+0x9/0x20
[346278.191092] ? irqentry_exit+0x19/0x30
[346278.191095] ? exc_page_fault+0x89/0x160
[346278.191097] ? asm_exc_page_fault+0x8/0x30
[346278.191102] entry_SYSCALL_64_after_hwframe+0x44/0xae
The task was waiting for the refcount become to 1, but from the vmcore,
we found the refcount has already been 1. It seems that the task didn't
get woken up by perf_event_release_kernel() and got stuck forever. The
below scenario may cause the problem.
Thread A Thread B
... ...
perf_event_free_task perf_event_release_kernel
...
acquire event->child_mutex
...
get_ctx
... release event->child_mutex
acquire ctx->mutex
...
perf_free_event (acquire/release event->child_mutex)
...
release ctx->mutex
wait_var_event
acquire ctx->mutex
acquire event->child_mutex
# move existing events to free_list
release event->child_mutex
release ctx->mutex
put_ctx
... ...
In this case, all events of the ctx have been freed, so we couldn't
find the ctx in free_list and Thread A will miss the wakeup. It's thus
necessary to add a wakeup after dropping the reference.
Fixes: 1cf8dfe8a661 ("perf/core: Fix race between close() and fork()")
Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20240513103948.33570-1-haifeng.xu@shopee.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/events/core.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5353,6 +5353,7 @@ int perf_event_release_kernel(struct per
again:
mutex_lock(&event->child_mutex);
list_for_each_entry(child, &event->child_list, child_list) {
+ void *var = NULL;
/*
* Cannot change, child events are not migrated, see the
@@ -5393,11 +5394,23 @@ again:
* this can't be the last reference.
*/
put_event(event);
+ } else {
+ var = &ctx->refcount;
}
mutex_unlock(&event->child_mutex);
mutex_unlock(&ctx->mutex);
put_ctx(ctx);
+
+ if (var) {
+ /*
+ * If perf_event_free_task() has deleted all events from the
+ * ctx while the child_mutex got released above, make sure to
+ * notify about the preceding put_ctx().
+ */
+ smp_mb(); /* pairs with wait_var_event() */
+ wake_up_var(var);
+ }
goto again;
}
mutex_unlock(&event->child_mutex);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 193/267] perf auxtrace: Fix multiple use of --itrace option
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (191 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 192/267] perf/core: Fix missing wakeup when waiting for context reference Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 194/267] riscv: fix overlap of allocated page and PTR_ERR Greg Kroah-Hartman
` (85 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Adrian Hunter, Andi Kleen,
Ian Rogers, Jiri Olsa, Namhyung Kim, Arnaldo Carvalho de Melo
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adrian Hunter <adrian.hunter@intel.com>
commit bb69c912c4e8005cf1ee6c63782d2fc28838dee2 upstream.
If the --itrace option is used more than once, the options are
combined, but "i" and "y" (sub-)options can be corrupted because
itrace_do_parse_synth_opts() incorrectly overwrites the period type and
period with default values.
For example, with:
--itrace=i0ns --itrace=e
The processing of "--itrace=e", resets the "i" period from 0 nanoseconds
to the default 100 microseconds.
Fix by performing the default setting of period type and period only if
"i" or "y" are present in the currently processed --itrace value.
Fixes: f6986c95af84ff2a ("perf session: Add instruction tracing options")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240315071334.3478-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/perf/util/auxtrace.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1466,6 +1466,7 @@ int itrace_do_parse_synth_opts(struct it
char *endptr;
bool period_type_set = false;
bool period_set = false;
+ bool iy = false;
synth_opts->set = true;
@@ -1484,6 +1485,7 @@ int itrace_do_parse_synth_opts(struct it
switch (*p++) {
case 'i':
case 'y':
+ iy = true;
if (p[-1] == 'y')
synth_opts->cycles = true;
else
@@ -1646,7 +1648,7 @@ int itrace_do_parse_synth_opts(struct it
}
}
out:
- if (synth_opts->instructions || synth_opts->cycles) {
+ if (iy) {
if (!period_type_set)
synth_opts->period_type =
PERF_ITRACE_DEFAULT_PERIOD_TYPE;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 194/267] riscv: fix overlap of allocated page and PTR_ERR
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (192 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 193/267] perf auxtrace: Fix multiple use of --itrace option Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 195/267] tracing/selftests: Fix kprobe event name test for .isra. functions Greg Kroah-Hartman
` (84 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Björn Töpel, Nam Cao,
Björn Töpel, Mike Rapoport (IBM), Palmer Dabbelt
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao <namcao@linutronix.de>
commit 994af1825a2aa286f4903ff64a1c7378b52defe6 upstream.
On riscv32, it is possible for the last page in virtual address space
(0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that
shouldn't happen.
There is already some code to ensure memblock won't allocate the last page.
However, buddy allocator is left unchecked.
Fix this by reserving physical memory that would be mapped at virtual
addresses greater than 0xfffff000.
Reported-by: Björn Töpel <bjorn@kernel.org>
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.to.us
Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240425115201.3044202-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/riscv/mm/init.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -234,18 +234,19 @@ static void __init setup_bootmem(void)
kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
/*
- * memblock allocator is not aware of the fact that last 4K bytes of
- * the addressable memory can not be mapped because of IS_ERR_VALUE
- * macro. Make sure that last 4k bytes are not usable by memblock
- * if end of dram is equal to maximum addressable memory. For 64-bit
- * kernel, this problem can't happen here as the end of the virtual
- * address space is occupied by the kernel mapping then this check must
- * be done as soon as the kernel mapping base address is determined.
+ * Reserve physical address space that would be mapped to virtual
+ * addresses greater than (void *)(-PAGE_SIZE) because:
+ * - This memory would overlap with ERR_PTR
+ * - This memory belongs to high memory, which is not supported
+ *
+ * This is not applicable to 64-bit kernel, because virtual addresses
+ * after (void *)(-PAGE_SIZE) are not linearly mapped: they are
+ * occupied by kernel mapping. Also it is unrealistic for high memory
+ * to exist on 64-bit platforms.
*/
if (!IS_ENABLED(CONFIG_64BIT)) {
- max_mapped_addr = __pa(~(ulong)0);
- if (max_mapped_addr == (phys_ram_end - 1))
- memblock_set_current_limit(max_mapped_addr - 4096);
+ max_mapped_addr = __va_to_pa_nodebug(-PAGE_SIZE);
+ memblock_reserve(max_mapped_addr, (phys_addr_t)-max_mapped_addr);
}
min_low_pfn = PFN_UP(phys_ram_base);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 195/267] tracing/selftests: Fix kprobe event name test for .isra. functions
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (193 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 194/267] riscv: fix overlap of allocated page and PTR_ERR Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 196/267] kheaders: explicitly define file modes for archived headers Greg Kroah-Hartman
` (83 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Steven Rostedt (Google),
Masami Hiramatsu (Google), Shuah Khan
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) <rostedt@goodmis.org>
commit 23a4b108accc29a6125ed14de4a044689ffeda78 upstream.
The kprobe_eventname.tc test checks if a function with .isra. can have a
kprobe attached to it. It loops through the kallsyms file for all the
functions that have the .isra. name, and checks if it exists in the
available_filter_functions file, and if it does, it uses it to attach a
kprobe to it.
The issue is that kprobes can not attach to functions that are listed more
than once in available_filter_functions. With the latest kernel, the
function that is found is: rapl_event_update.isra.0
# grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions
rapl_event_update.isra.0
rapl_event_update.isra.0
It is listed twice. This causes the attached kprobe to it to fail which in
turn fails the test. Instead of just picking the function function that is
found in available_filter_functions, pick the first one that is listed
only once in available_filter_functions.
Cc: stable@vger.kernel.org
Fixes: 604e3548236d ("selftests/ftrace: Select an existing function in kprobe_eventname test")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
@@ -30,7 +30,8 @@ find_dot_func() {
fi
grep " [tT] .*\.isra\..*" /proc/kallsyms | cut -f 3 -d " " | while read f; do
- if grep -s $f available_filter_functions; then
+ cnt=`grep -s $f available_filter_functions | wc -l`;
+ if [ $cnt -eq 1 ]; then
echo $f
break
fi
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 196/267] kheaders: explicitly define file modes for archived headers
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (194 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 195/267] tracing/selftests: Fix kprobe event name test for .isra. functions Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 197/267] null_blk: Print correct max open zones limit in null_init_zoned_dev() Greg Kroah-Hartman
` (82 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Matthias Maennich, Masahiro Yamada
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthias Maennich <maennich@google.com>
commit 3bd27a847a3a4827a948387cc8f0dbc9fa5931d5 upstream.
Build environments might be running with different umask settings
resulting in indeterministic file modes for the files contained in
kheaders.tar.xz. The file itself is served with 444, i.e. world
readable. Archive the files explicitly with 744,a+X to improve
reproducibility across build environments.
--mode=0444 is not suitable as directories need to be executable. Also,
444 makes it hard to delete all the readonly files after extraction.
Cc: stable@vger.kernel.org
Signed-off-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
| 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/gen_kheaders.sh
+++ b/kernel/gen_kheaders.sh
@@ -89,7 +89,7 @@ find $cpio_dir -type f -print0 |
# Create archive and try to normalize metadata for reproducibility.
tar "${KBUILD_BUILD_TIMESTAMP:+--mtime=$KBUILD_BUILD_TIMESTAMP}" \
- --owner=0 --group=0 --sort=name --numeric-owner \
+ --owner=0 --group=0 --sort=name --numeric-owner --mode=u=rw,go=r,a+X \
-I $XZ -cf $tarfile -C $cpio_dir/ . > /dev/null
echo $headers_md5 > kernel/kheaders.md5
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 197/267] null_blk: Print correct max open zones limit in null_init_zoned_dev()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (195 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 196/267] kheaders: explicitly define file modes for archived headers Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 198/267] sock_map: avoid race between sock_map_close and sk_psock_put Greg Kroah-Hartman
` (81 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Damien Le Moal, Niklas Cassel,
Jens Axboe
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal <dlemoal@kernel.org>
commit 233e27b4d21c3e44eb863f03e566d3a22e81a7ae upstream.
When changing the maximum number of open zones, print that number
instead of the total number of zones.
Fixes: dc4d137ee3b7 ("null_blk: add support for max open/active zone limit for zoned devices")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20240528062852.437599-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/block/null_blk/zoned.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/block/null_blk/zoned.c
+++ b/drivers/block/null_blk/zoned.c
@@ -112,7 +112,7 @@ int null_init_zoned_dev(struct nullb_dev
if (dev->zone_max_active && dev->zone_max_open > dev->zone_max_active) {
dev->zone_max_open = dev->zone_max_active;
pr_info("changed the maximum number of open zones to %u\n",
- dev->nr_zones);
+ dev->zone_max_open);
} else if (dev->zone_max_open >= dev->nr_zones - dev->zone_nr_conv) {
dev->zone_max_open = 0;
pr_info("zone_max_open limit disabled, limit >= zone count\n");
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 198/267] sock_map: avoid race between sock_map_close and sk_psock_put
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (196 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 197/267] null_blk: Print correct max open zones limit in null_init_zoned_dev() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 199/267] dma-buf: handle testing kthreads creation failure Greg Kroah-Hartman
` (80 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Paolo Abeni,
syzbot+07a2e4a1a57118ef7355, Thadeu Lima de Souza Cascardo,
Jakub Sitnicki
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
commit 4b4647add7d3c8530493f7247d11e257ee425bf0 upstream.
sk_psock_get will return NULL if the refcount of psock has gone to 0, which
will happen when the last call of sk_psock_put is done. However,
sk_psock_drop may not have finished yet, so the close callback will still
point to sock_map_close despite psock being NULL.
This can be reproduced with a thread deleting an element from the sock map,
while the second one creates a socket, adds it to the map and closes it.
That will trigger the WARN_ON_ONCE:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 7220 at net/core/sock_map.c:1701 sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
Modules linked in:
CPU: 1 PID: 7220 Comm: syz-executor380 Not tainted 6.9.0-syzkaller-07726-g3c999d1ae3c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
RIP: 0010:sock_map_close+0x2a2/0x2d0 net/core/sock_map.c:1701
Code: df e8 92 29 88 f8 48 8b 1b 48 89 d8 48 c1 e8 03 42 80 3c 20 00 74 08 48 89 df e8 79 29 88 f8 4c 8b 23 eb 89 e8 4f 15 23 f8 90 <0f> 0b 90 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 13 26 3d 02
RSP: 0018:ffffc9000441fda8 EFLAGS: 00010293
RAX: ffffffff89731ae1 RBX: ffffffff94b87540 RCX: ffff888029470000
RDX: 0000000000000000 RSI: ffffffff8bcab5c0 RDI: ffffffff8c1faba0
RBP: 0000000000000000 R08: ffffffff92f9b61f R09: 1ffffffff25f36c3
R10: dffffc0000000000 R11: fffffbfff25f36c4 R12: ffffffff89731840
R13: ffff88804b587000 R14: ffff88804b587000 R15: ffffffff89731870
FS: 000055555e080380(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000207d4000 CR4: 0000000000350ef0
Call Trace:
<TASK>
unix_release+0x87/0xc0 net/unix/af_unix.c:1048
__sock_release net/socket.c:659 [inline]
sock_close+0xbe/0x240 net/socket.c:1421
__fput+0x42b/0x8a0 fs/file_table.c:422
__do_sys_close fs/open.c:1556 [inline]
__se_sys_close fs/open.c:1541 [inline]
__x64_sys_close+0x7f/0x110 fs/open.c:1541
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fb37d618070
Code: 00 00 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d4 e8 10 2c 00 00 80 3d 31 f0 07 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
RSP: 002b:00007ffcd4a525d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007fb37d618070
RDX: 0000000000000010 RSI: 00000000200001c0 RDI: 0000000000000004
RBP: 0000000000000000 R08: 0000000100000000 R09: 0000000100000000
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Use sk_psock, which will only check that the pointer is not been set to
NULL yet, which should only happen after the callbacks are restored. If,
then, a reference can still be gotten, we may call sk_psock_stop and cancel
psock->work.
As suggested by Paolo Abeni, reorder the condition so the control flow is
less convoluted.
After that change, the reproducer does not trigger the WARN_ON_ONCE
anymore.
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reported-by: syzbot+07a2e4a1a57118ef7355@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=07a2e4a1a57118ef7355
Fixes: aadb2bb83ff7 ("sock_map: Fix a potential use-after-free in sock_map_close()")
Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself")
Cc: stable@vger.kernel.org
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20240524144702.1178377-1-cascardo@igalia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/core/sock_map.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -1639,19 +1639,23 @@ void sock_map_close(struct sock *sk, lon
lock_sock(sk);
rcu_read_lock();
- psock = sk_psock_get(sk);
- if (unlikely(!psock)) {
- rcu_read_unlock();
- release_sock(sk);
- saved_close = READ_ONCE(sk->sk_prot)->close;
- } else {
+ psock = sk_psock(sk);
+ if (likely(psock)) {
saved_close = psock->saved_close;
sock_map_remove_links(sk, psock);
+ psock = sk_psock_get(sk);
+ if (unlikely(!psock))
+ goto no_psock;
rcu_read_unlock();
sk_psock_stop(psock);
release_sock(sk);
cancel_delayed_work_sync(&psock->work);
sk_psock_put(sk, psock);
+ } else {
+ saved_close = READ_ONCE(sk->sk_prot)->close;
+no_psock:
+ rcu_read_unlock();
+ release_sock(sk);
}
/* Make sure we do not recurse. This is a bug.
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 199/267] dma-buf: handle testing kthreads creation failure
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (197 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 198/267] sock_map: avoid race between sock_map_close and sk_psock_put Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 200/267] vmci: prevent speculation leaks by sanitizing event in event_deliver() Greg Kroah-Hartman
` (79 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Fedor Pchelkin, T.J. Mercier,
Christian König
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin <pchelkin@ispras.ru>
commit 6cb05d89fd62a76a9b74bd16211fb0930e89fea8 upstream.
kthread creation may possibly fail inside race_signal_callback(). In
such a case stop the already started threads, put the already taken
references to them and return with error code.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 2989f6451084 ("dma-buf: Add selftests for dma-fence")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240522181308.841686-1-pchelkin@ispras.ru
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/dma-buf/st-dma-fence.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/dma-buf/st-dma-fence.c
+++ b/drivers/dma-buf/st-dma-fence.c
@@ -540,6 +540,12 @@ static int race_signal_callback(void *ar
t[i].before = pass;
t[i].task = kthread_run(thread_signal_callback, &t[i],
"dma-fence:%d", i);
+ if (IS_ERR(t[i].task)) {
+ ret = PTR_ERR(t[i].task);
+ while (--i >= 0)
+ kthread_stop_put(t[i].task);
+ return ret;
+ }
get_task_struct(t[i].task);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 200/267] vmci: prevent speculation leaks by sanitizing event in event_deliver()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (198 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 199/267] dma-buf: handle testing kthreads creation failure Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 201/267] spmi: hisi-spmi-controller: Do not override device identifier Greg Kroah-Hartman
` (78 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, stable, Hagar Gamal Halim Hemdan
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hagar Gamal Halim Hemdan <hagarhem@amazon.com>
commit 8003f00d895310d409b2bf9ef907c56b42a4e0f4 upstream.
Coverity spotted that event_msg is controlled by user-space,
event_msg->event_data.event is passed to event_deliver() and used
as an index without sanitization.
This change ensures that the event index is sanitized to mitigate any
possibility of speculative information leaks.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Only compile tested, no access to HW.
Fixes: 1d990201f9bb ("VMCI: event handling implementation.")
Cc: stable <stable@kernel.org>
Signed-off-by: Hagar Gamal Halim Hemdan <hagarhem@amazon.com>
Link: https://lore.kernel.org/stable/20231127193533.46174-1-hagarhem%40amazon.com
Link: https://lore.kernel.org/r/20240430085916.4753-1-hagarhem@amazon.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/misc/vmw_vmci/vmci_event.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/misc/vmw_vmci/vmci_event.c
+++ b/drivers/misc/vmw_vmci/vmci_event.c
@@ -9,6 +9,7 @@
#include <linux/vmw_vmci_api.h>
#include <linux/list.h>
#include <linux/module.h>
+#include <linux/nospec.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/rculist.h>
@@ -86,9 +87,12 @@ static void event_deliver(struct vmci_ev
{
struct vmci_subscription *cur;
struct list_head *subscriber_list;
+ u32 sanitized_event, max_vmci_event;
rcu_read_lock();
- subscriber_list = &subscriber_array[event_msg->event_data.event];
+ max_vmci_event = ARRAY_SIZE(subscriber_array);
+ sanitized_event = array_index_nospec(event_msg->event_data.event, max_vmci_event);
+ subscriber_list = &subscriber_array[sanitized_event];
list_for_each_entry_rcu(cur, subscriber_list, node) {
cur->callback(cur->id, &event_msg->event_data,
cur->callback_data);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 201/267] spmi: hisi-spmi-controller: Do not override device identifier
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (199 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 200/267] vmci: prevent speculation leaks by sanitizing event in event_deliver() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 202/267] knfsd: LOOKUP can return an illegal error value Greg Kroah-Hartman
` (77 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Vamshi Gajjela, Stephen Boyd
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vamshi Gajjela <vamshigajjela@google.com>
commit eda4923d78d634482227c0b189d9b7ca18824146 upstream.
'nr' member of struct spmi_controller, which serves as an identifier
for the controller/bus. This value is a dynamic ID assigned in
spmi_controller_alloc, and overriding it from the driver results in an
ida_free error "ida_free called for id=xx which is not allocated".
Signed-off-by: Vamshi Gajjela <vamshigajjela@google.com>
Fixes: 70f59c90c819 ("staging: spmi: add Hikey 970 SPMI controller driver")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240228185116.1269-1-vamshigajjela@google.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240507210809.3479953-5-sboyd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/spmi/hisi-spmi-controller.c | 1 -
1 file changed, 1 deletion(-)
--- a/drivers/spmi/hisi-spmi-controller.c
+++ b/drivers/spmi/hisi-spmi-controller.c
@@ -303,7 +303,6 @@ static int spmi_controller_probe(struct
spin_lock_init(&spmi_controller->lock);
- ctrl->nr = spmi_controller->channel;
ctrl->dev.parent = pdev->dev.parent;
ctrl->dev.of_node = of_node_get(pdev->dev.of_node);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 202/267] knfsd: LOOKUP can return an illegal error value
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (200 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 201/267] spmi: hisi-spmi-controller: Do not override device identifier Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 203/267] fs/proc: fix softlockup in __read_vmcore Greg Kroah-Hartman
` (76 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Trond Myklebust, Chuck Lever
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust <trond.myklebust@hammerspace.com>
commit e221c45da3770962418fb30c27d941bbc70d595a upstream.
The 'NFS error' NFSERR_OPNOTSUPP is not described by any of the official
NFS related RFCs, but appears to have snuck into some older .x files for
NFSv2.
Either way, it is not in RFC1094, RFC1813 or any of the NFSv4 RFCs, so
should not be returned by the knfsd server, and particularly not by the
"LOOKUP" operation.
Instead, let's return NFSERR_STALE, which is more appropriate if the
filesystem encodes the filehandle as FILEID_INVALID.
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/nfsd/nfsfh.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -572,7 +572,7 @@ fh_compose(struct svc_fh *fhp, struct sv
_fh_update(fhp, exp, dentry);
if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) {
fh_put(fhp);
- return nfserr_opnotsupp;
+ return nfserr_stale;
}
return 0;
@@ -598,7 +598,7 @@ fh_update(struct svc_fh *fhp)
_fh_update(fhp, fhp->fh_export, dentry);
if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID)
- return nfserr_opnotsupp;
+ return nfserr_stale;
return 0;
out_bad:
printk(KERN_ERR "fh_update: fh not verified!\n");
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 203/267] fs/proc: fix softlockup in __read_vmcore
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (201 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 202/267] knfsd: LOOKUP can return an illegal error value Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 204/267] ocfs2: use coarse time for new created files Greg Kroah-Hartman
` (75 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Rik van Riel, Baoquan He, Dave Young,
Vivek Goyal, Andrew Morton
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rik van Riel <riel@surriel.com>
commit 5cbcb62dddf5346077feb82b7b0c9254222d3445 upstream.
While taking a kernel core dump with makedumpfile on a larger system,
softlockup messages often appear.
While softlockup warnings can be harmless, they can also interfere with
things like RCU freeing memory, which can be problematic when the kdump
kexec image is configured with as little memory as possible.
Avoid the softlockup, and give things like work items and RCU a chance to
do their thing during __read_vmcore by adding a cond_resched.
Link: https://lkml.kernel.org/r/20240507091858.36ff767f@imladris.surriel.com
Signed-off-by: Rik van Riel <riel@surriel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/proc/vmcore.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -383,6 +383,8 @@ static ssize_t __read_vmcore(struct iov_
/* leave now if filled buffer already */
if (!iov_iter_count(iter))
return acc;
+
+ cond_resched();
}
list_for_each_entry(m, &vmcore_list, list) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 204/267] ocfs2: use coarse time for new created files
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (202 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 203/267] fs/proc: fix softlockup in __read_vmcore Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 205/267] ocfs2: fix races between hole punching and AIO+DIO Greg Kroah-Hartman
` (74 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Su Yue, Joseph Qi, Mark Fasheh,
Joel Becker, Junxiao Bi, Changwei Ge, Gang He, Jun Piao,
Andrew Morton
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Yue <glass.su@suse.com>
commit b8cb324277ee16f3eca3055b96fce4735a5a41c6 upstream.
The default atime related mount option is '-o realtime' which means file
atime should be updated if atime <= ctime or atime <= mtime. atime should
be updated in the following scenario, but it is not:
==========================================================
$ rm /mnt/testfile;
$ echo test > /mnt/testfile
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
$ sleep 5
$ cat /mnt/testfile > /dev/null
$ stat -c "%X %Y %Z" /mnt/testfile
1711881646 1711881646 1711881646
==========================================================
And the reason the atime in the test is not updated is that ocfs2 calls
ktime_get_real_ts64() in __ocfs2_mknod_locked during file creation. Then
inode_set_ctime_current() is called in inode_set_ctime_current() calls
ktime_get_coarse_real_ts64() to get current time.
ktime_get_real_ts64() is more accurate than ktime_get_coarse_real_ts64().
In my test box, I saw ctime set by ktime_get_coarse_real_ts64() is less
than ktime_get_real_ts64() even ctime is set later. The ctime of the new
inode is smaller than atime.
The call trace is like:
ocfs2_create
ocfs2_mknod
__ocfs2_mknod_locked
....
ktime_get_real_ts64 <------- set atime,ctime,mtime, more accurate
ocfs2_populate_inode
...
ocfs2_init_acl
ocfs2_acl_set_mode
inode_set_ctime_current
current_time
ktime_get_coarse_real_ts64 <-------less accurate
ocfs2_file_read_iter
ocfs2_inode_lock_atime
ocfs2_should_update_atime
atime <= ctime ? <-------- false, ctime < atime due to accuracy
So here call ktime_get_coarse_real_ts64 to set inode time coarser while
creating new files. It may lower the accuracy of file times. But it's
not a big deal since we already use coarse time in other places like
ocfs2_update_inode_atime and inode_set_ctime_current.
Link: https://lkml.kernel.org/r/20240408082041.20925-5-glass.su@suse.com
Fixes: c62c38f6b91b ("ocfs2: replace CURRENT_TIME macro")
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/ocfs2/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -566,7 +566,7 @@ static int __ocfs2_mknod_locked(struct i
fe->i_last_eb_blk = 0;
strcpy(fe->i_signature, OCFS2_INODE_SIGNATURE);
fe->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
- ktime_get_real_ts64(&ts);
+ ktime_get_coarse_real_ts64(&ts);
fe->i_atime = fe->i_ctime = fe->i_mtime =
cpu_to_le64(ts.tv_sec);
fe->i_mtime_nsec = fe->i_ctime_nsec = fe->i_atime_nsec =
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 205/267] ocfs2: fix races between hole punching and AIO+DIO
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (203 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 204/267] ocfs2: use coarse time for new created files Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 206/267] PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id Greg Kroah-Hartman
` (73 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Su Yue, Joseph Qi, Changwei Ge,
Gang He, Joel Becker, Jun Piao, Junxiao Bi, Mark Fasheh,
Andrew Morton
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Yue <glass.su@suse.com>
commit 952b023f06a24b2ad6ba67304c4c84d45bea2f18 upstream.
After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block",
fstests/generic/300 become from always failed to sometimes failed:
========================================================================
[ 473.293420 ] run fstests generic/300
[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================
In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten
extents to a list. extents are also inserted into extent tree in
ocfs2_write_begin_nolock. Then another thread call fallocate to puch a
hole at one of the unwritten extent. The extent at cpos was removed by
ocfs2_remove_extent(). At end io worker thread, ocfs2_search_extent_list
found there is no such extent at the cpos.
T1 T2 T3
inode lock
...
insert extents
...
inode unlock
ocfs2_fallocate
__ocfs2_change_file_space
inode lock
lock ip_alloc_sem
ocfs2_remove_inode_range inode
ocfs2_remove_btree_range
ocfs2_remove_extent
^---remove the extent at cpos 78723
...
unlock ip_alloc_sem
inode unlock
ocfs2_dio_end_io
ocfs2_dio_end_io_write
lock ip_alloc_sem
ocfs2_mark_extent_written
ocfs2_change_extent_flag
ocfs2_search_extent_list
^---failed to find extent
...
unlock ip_alloc_sem
In most filesystems, fallocate is not compatible with racing with AIO+DIO,
so fix it by adding to wait for all dio before fallocate/punch_hole like
ext4.
Link: https://lkml.kernel.org/r/20240408082041.20925-3-glass.su@suse.com
Fixes: b25801038da5 ("ocfs2: Support xfs style space reservation ioctls")
Signed-off-by: Su Yue <glass.su@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/ocfs2/file.c | 2 ++
1 file changed, 2 insertions(+)
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1934,6 +1934,8 @@ static int __ocfs2_change_file_space(str
inode_lock(inode);
+ /* Wait all existing dio workers, newcomers will block on i_rwsem */
+ inode_dio_wait(inode);
/*
* This prevents concurrent writes on other nodes
*/
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 206/267] PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (204 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 205/267] ocfs2: fix races between hole punching and AIO+DIO Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 207/267] dmaengine: axi-dmac: fix possible race in remove() Greg Kroah-Hartman
` (72 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Rick Wertenbroek,
Krzysztof Wilczyński, Bjorn Helgaas, Damien Le Moal
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rick Wertenbroek <rick.wertenbroek@gmail.com>
commit 2dba285caba53f309d6060fca911b43d63f41697 upstream.
Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem
Vendor ID are u16 variables and are written to a u32 register of the
controller. The Subsystem Vendor ID was always 0 because the u16 value
was masked incorrectly with GENMASK(31,16) resulting in all lower 16
bits being set to 0 prior to the shift.
Remove both masks as they are unnecessary and set the register correctly
i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the
Subsystem Vendor ID.
This is documented in the RK3399 TRM section 17.6.7.1.17
[kwilczynski: removed unnecesary newline]
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@gmail.com
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/pci/controller/pcie-rockchip-ep.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -98,10 +98,8 @@ static int rockchip_pcie_ep_write_header
/* All functions share the same vendor ID with function 0 */
if (fn == 0) {
- u32 vid_regs = (hdr->vendorid & GENMASK(15, 0)) |
- (hdr->subsys_vendor_id & GENMASK(31, 16)) << 16;
-
- rockchip_pcie_write(rockchip, vid_regs,
+ rockchip_pcie_write(rockchip,
+ hdr->vendorid | hdr->subsys_vendor_id << 16,
PCIE_CORE_CONFIG_VENDOR);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 207/267] dmaengine: axi-dmac: fix possible race in remove()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (205 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 206/267] PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:55 ` [PATCH 6.6 208/267] remoteproc: k3-r5: Wait for core0 power-up before powering up core1 Greg Kroah-Hartman
` (71 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, stable, Nuno Sa, Vinod Koul
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nuno Sa <nuno.sa@analog.com>
commit 1bc31444209c8efae98cb78818131950d9a6f4d6 upstream.
We need to first free the IRQ before calling of_dma_controller_free().
Otherwise we could get an interrupt and schedule a tasklet while
removing the DMA controller.
Fixes: 0e3b67b348b8 ("dmaengine: Add support for the Analog Devices AXI-DMAC DMA controller")
Cc: stable@kernel.org
Signed-off-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20240328-axi-dmac-devm-probe-v3-1-523c0176df70@analog.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/dma/dma-axi-dmac.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/dma/dma-axi-dmac.c
+++ b/drivers/dma/dma-axi-dmac.c
@@ -1033,8 +1033,8 @@ static int axi_dmac_remove(struct platfo
{
struct axi_dmac *dmac = platform_get_drvdata(pdev);
- of_dma_controller_free(pdev->dev.of_node);
free_irq(dmac->irq, dmac);
+ of_dma_controller_free(pdev->dev.of_node);
tasklet_kill(&dmac->chan.vchan.task);
dma_async_device_unregister(&dmac->dma_dev);
clk_disable_unprepare(dmac->clk);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 208/267] remoteproc: k3-r5: Wait for core0 power-up before powering up core1
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (206 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 207/267] dmaengine: axi-dmac: fix possible race in remove() Greg Kroah-Hartman
@ 2024-06-19 12:55 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 209/267] remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs Greg Kroah-Hartman
` (70 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:55 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Apurva Nandan, Beleswar Padhi,
Mathieu Poirier
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Apurva Nandan <a-nandan@ti.com>
commit 61f6f68447aba08aeaa97593af3a7d85a114891f upstream.
PSC controller has a limitation that it can only power-up the second core
when the first core is in ON state. Power-state for core0 should be equal
to or higher than core1, else the kernel is seen hanging during rproc
loading.
Make the powering up of cores sequential, by waiting for the current core
to power-up before proceeding to the next core, with a timeout of 2sec.
Add a wait queue event in k3_r5_cluster_rproc_init call, that will wait
for the current core to be released from reset before proceeding with the
next core.
Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem")
Signed-off-by: Apurva Nandan <a-nandan@ti.com>
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240430105307.1190615-2-b-padhi@ti.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/remoteproc/ti_k3_r5_remoteproc.c | 33 +++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -103,12 +103,14 @@ struct k3_r5_soc_data {
* @dev: cached device pointer
* @mode: Mode to configure the Cluster - Split or LockStep
* @cores: list of R5 cores within the cluster
+ * @core_transition: wait queue to sync core state changes
* @soc_data: SoC-specific feature data for a R5FSS
*/
struct k3_r5_cluster {
struct device *dev;
enum cluster_mode mode;
struct list_head cores;
+ wait_queue_head_t core_transition;
const struct k3_r5_soc_data *soc_data;
};
@@ -128,6 +130,7 @@ struct k3_r5_cluster {
* @atcm_enable: flag to control ATCM enablement
* @btcm_enable: flag to control BTCM enablement
* @loczrama: flag to dictate which TCM is at device address 0x0
+ * @released_from_reset: flag to signal when core is out of reset
*/
struct k3_r5_core {
struct list_head elem;
@@ -144,6 +147,7 @@ struct k3_r5_core {
u32 atcm_enable;
u32 btcm_enable;
u32 loczrama;
+ bool released_from_reset;
};
/**
@@ -460,6 +464,8 @@ static int k3_r5_rproc_prepare(struct rp
ret);
return ret;
}
+ core->released_from_reset = true;
+ wake_up_interruptible(&cluster->core_transition);
/*
* Newer IP revisions like on J7200 SoCs support h/w auto-initialization
@@ -1140,6 +1146,12 @@ static int k3_r5_rproc_configure_mode(st
return ret;
}
+ /*
+ * Skip the waiting mechanism for sequential power-on of cores if the
+ * core has already been booted by another entity.
+ */
+ core->released_from_reset = c_state;
+
ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl,
&stat);
if (ret < 0) {
@@ -1280,6 +1292,26 @@ init_rmem:
cluster->mode == CLUSTER_MODE_SINGLECPU ||
cluster->mode == CLUSTER_MODE_SINGLECORE)
break;
+
+ /*
+ * R5 cores require to be powered on sequentially, core0
+ * should be in higher power state than core1 in a cluster
+ * So, wait for current core to power up before proceeding
+ * to next core and put timeout of 2sec for each core.
+ *
+ * This waiting mechanism is necessary because
+ * rproc_auto_boot_callback() for core1 can be called before
+ * core0 due to thread execution order.
+ */
+ ret = wait_event_interruptible_timeout(cluster->core_transition,
+ core->released_from_reset,
+ msecs_to_jiffies(2000));
+ if (ret <= 0) {
+ dev_err(dev,
+ "Timed out waiting for %s core to power up!\n",
+ rproc->name);
+ return ret;
+ }
}
return 0;
@@ -1709,6 +1741,7 @@ static int k3_r5_probe(struct platform_d
cluster->dev = dev;
cluster->soc_data = data;
INIT_LIST_HEAD(&cluster->cores);
+ init_waitqueue_head(&cluster->core_transition);
ret = of_property_read_u32(np, "ti,cluster-mode", &cluster->mode);
if (ret < 0 && ret != -EINVAL) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 209/267] remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (207 preceding siblings ...)
2024-06-19 12:55 ` [PATCH 6.6 208/267] remoteproc: k3-r5: Wait for core0 power-up before powering up core1 Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 210/267] iio: adc: axi-adc: make sure AXI clock is enabled Greg Kroah-Hartman
` (69 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Beleswar Padhi, Mathieu Poirier
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Beleswar Padhi <b-padhi@ti.com>
commit 3c8a9066d584f5010b6f4ba03bf6b19d28973d52 upstream.
PSC controller has a limitation that it can only power-up the second
core when the first core is in ON state. Power-state for core0 should be
equal to or higher than core1.
Therefore, prevent core1 from powering up before core0 during the start
process from sysfs. Similarly, prevent core0 from shutting down before
core1 has been shut down from sysfs.
Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem")
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240430105307.1190615-3-b-padhi@ti.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/remoteproc/ti_k3_r5_remoteproc.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -548,7 +548,7 @@ static int k3_r5_rproc_start(struct rpro
struct k3_r5_rproc *kproc = rproc->priv;
struct k3_r5_cluster *cluster = kproc->cluster;
struct device *dev = kproc->dev;
- struct k3_r5_core *core;
+ struct k3_r5_core *core0, *core;
u32 boot_addr;
int ret;
@@ -574,6 +574,15 @@ static int k3_r5_rproc_start(struct rpro
goto unroll_core_run;
}
} else {
+ /* do not allow core 1 to start before core 0 */
+ core0 = list_first_entry(&cluster->cores, struct k3_r5_core,
+ elem);
+ if (core != core0 && core0->rproc->state == RPROC_OFFLINE) {
+ dev_err(dev, "%s: can not start core 1 before core 0\n",
+ __func__);
+ return -EPERM;
+ }
+
ret = k3_r5_core_run(core);
if (ret)
goto put_mbox;
@@ -619,7 +628,8 @@ static int k3_r5_rproc_stop(struct rproc
{
struct k3_r5_rproc *kproc = rproc->priv;
struct k3_r5_cluster *cluster = kproc->cluster;
- struct k3_r5_core *core = kproc->core;
+ struct device *dev = kproc->dev;
+ struct k3_r5_core *core1, *core = kproc->core;
int ret;
/* halt all applicable cores */
@@ -632,6 +642,15 @@ static int k3_r5_rproc_stop(struct rproc
}
}
} else {
+ /* do not allow core 0 to stop before core 1 */
+ core1 = list_last_entry(&cluster->cores, struct k3_r5_core,
+ elem);
+ if (core != core1 && core1->rproc->state != RPROC_OFFLINE) {
+ dev_err(dev, "%s: can not stop core 0 before core 1\n",
+ __func__);
+ return -EPERM;
+ }
+
ret = k3_r5_core_halt(core);
if (ret)
goto out;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 210/267] iio: adc: axi-adc: make sure AXI clock is enabled
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (208 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 209/267] remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 211/267] iio: invensense: fix interrupt timestamp alignment Greg Kroah-Hartman
` (68 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Nuno Sa, Stable, Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nuno Sa <nuno.sa@analog.com>
commit 80721776c5af6f6dce7d84ba8df063957aa425a2 upstream.
We can only access the IP core registers if the bus clock is enabled. As
such we need to get and enable it and not rely on anyone else to do it.
Note this clock is a very fundamental one that is typically enabled
pretty early during boot. Independently of that, we should really rely on
it to be enabled.
Fixes: ef04070692a2 ("iio: adc: adi-axi-adc: add support for AXI ADC IP core")
Signed-off-by: Nuno Sa <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20240426-ad9467-new-features-v2-4-6361fc3ba1cc@analog.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/adc/adi-axi-adc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/iio/adc/adi-axi-adc.c
+++ b/drivers/iio/adc/adi-axi-adc.c
@@ -175,6 +175,7 @@ static int adi_axi_adc_probe(struct plat
struct adi_axi_adc_state *st;
void __iomem *base;
unsigned int ver;
+ struct clk *clk;
int ret;
st = devm_kzalloc(&pdev->dev, sizeof(*st), GFP_KERNEL);
@@ -195,6 +196,10 @@ static int adi_axi_adc_probe(struct plat
if (!expected_ver)
return -ENODEV;
+ clk = devm_clk_get_enabled(&pdev->dev, NULL);
+ if (IS_ERR(clk))
+ return PTR_ERR(clk);
+
/*
* Force disable the core. Up to the frontend to enable us. And we can
* still read/write registers...
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 211/267] iio: invensense: fix interrupt timestamp alignment
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (209 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 210/267] iio: adc: axi-adc: make sure AXI clock is enabled Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 212/267] riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context Greg Kroah-Hartman
` (67 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jean-Baptiste Maneyrol,
Jonathan Cameron
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
commit 0340dc4c82590d8735c58cf904a8aa1173273ab5 upstream.
Restrict interrupt timestamp alignment for not overflowing max/min
period thresholds.
Fixes: 0ecc363ccea7 ("iio: make invensense timestamp module generic")
Cc: stable@vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@tdk.com>
Link: https://lore.kernel.org/r/20240426135814.141837-1-inv.git-commit@tdk.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/iio/common/inv_sensors/inv_sensors_timestamp.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
+++ b/drivers/iio/common/inv_sensors/inv_sensors_timestamp.c
@@ -105,6 +105,9 @@ static bool inv_update_chip_period(struc
static void inv_align_timestamp_it(struct inv_sensors_timestamp *ts)
{
+ const int64_t period_min = ts->min_period * ts->mult;
+ const int64_t period_max = ts->max_period * ts->mult;
+ int64_t add_max, sub_max;
int64_t delta, jitter;
int64_t adjust;
@@ -112,11 +115,13 @@ static void inv_align_timestamp_it(struc
delta = ts->it.lo - ts->timestamp;
/* adjust timestamp while respecting jitter */
+ add_max = period_max - (int64_t)ts->period;
+ sub_max = period_min - (int64_t)ts->period;
jitter = INV_SENSORS_TIMESTAMP_JITTER((int64_t)ts->period, ts->chip.jitter);
if (delta > jitter)
- adjust = jitter;
+ adjust = add_max;
else if (delta < -jitter)
- adjust = -jitter;
+ adjust = sub_max;
else
adjust = 0;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 212/267] riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (210 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 211/267] iio: invensense: fix interrupt timestamp alignment Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 213/267] rtla/timerlat: Simplify "no value" printing on top Greg Kroah-Hartman
` (66 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Nam Cao, Alexandre Ghiti,
Palmer Dabbelt
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao <namcao@linutronix.de>
commit fb1cf0878328fe75d47f0aed0a65b30126fcefc4 upstream.
__kernel_map_pages() is a debug function which clears the valid bit in page
table entry for deallocated pages to detect illegal memory accesses to
freed pages.
This function set/clear the valid bit using __set_memory(). __set_memory()
acquires init_mm's semaphore, and this operation may sleep. This is
problematic, because __kernel_map_pages() can be called in atomic context,
and thus is illegal to sleep. An example warning that this causes:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c
Rewrite this function with apply_to_existing_page_range(). It is fine to
not have any locking, because __kernel_map_pages() works with pages being
allocated/deallocated and those pages are not changed by anyone else in the
meantime.
Fixes: 5fde3db5eb02 ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/riscv/mm/pageattr.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -387,17 +387,33 @@ int set_direct_map_default_noflush(struc
}
#ifdef CONFIG_DEBUG_PAGEALLOC
+static int debug_pagealloc_set_page(pte_t *pte, unsigned long addr, void *data)
+{
+ int enable = *(int *)data;
+
+ unsigned long val = pte_val(ptep_get(pte));
+
+ if (enable)
+ val |= _PAGE_PRESENT;
+ else
+ val &= ~_PAGE_PRESENT;
+
+ set_pte(pte, __pte(val));
+
+ return 0;
+}
+
void __kernel_map_pages(struct page *page, int numpages, int enable)
{
if (!debug_pagealloc_enabled())
return;
- if (enable)
- __set_memory((unsigned long)page_address(page), numpages,
- __pgprot(_PAGE_PRESENT), __pgprot(0));
- else
- __set_memory((unsigned long)page_address(page), numpages,
- __pgprot(0), __pgprot(_PAGE_PRESENT));
+ unsigned long start = (unsigned long)page_address(page);
+ unsigned long size = PAGE_SIZE * numpages;
+
+ apply_to_existing_page_range(&init_mm, start, size, debug_pagealloc_set_page, &enable);
+
+ flush_tlb_kernel_range(start, start + size);
}
#endif
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 213/267] rtla/timerlat: Simplify "no value" printing on top
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (211 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 212/267] riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 214/267] rtla/auto-analysis: Replace \t with spaces Greg Kroah-Hartman
` (65 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jonathan Corbet, Juri Lelli,
Daniel Bristot de Oliveira
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira <bristot@kernel.org>
commit 5f0769331a965675cdfec97c09f3f6e875d7c246 upstream.
Instead of printing three times the same output, print it only once,
reducing lines and being sure that all no values have the same length.
It also fixes an extra '\n' when running the with kernel threads, like
here:
=============== %< ==============
Timer Latency
0 00:00:01 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
2 #0 | - - - - | 161 161 161 161
3 #0 | - - - - | 161 161 161 161
8 #1 | 54 54 54 54 | - - - -'\n'
---------------|----------------------------------------|---------------------------------------
ALL #1 e0 | 54 54 54 | 161 161 161
=============== %< ==============
This '\n' should have been removed with the user-space support that
added another '\n' if not running with kernel threads.
Link: https://lkml.kernel.org/r/0a4d8085e7cd706733a5dc10a81ca38b82bd4992.1713968967.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Fixes: cdca4f4e5e8e ("rtla/timerlat_top: Add timerlat user-space support")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/tracing/rtla/src/timerlat_top.c | 17 +++++------------
1 file changed, 5 insertions(+), 12 deletions(-)
--- a/tools/tracing/rtla/src/timerlat_top.c
+++ b/tools/tracing/rtla/src/timerlat_top.c
@@ -211,6 +211,8 @@ static void timerlat_top_header(struct o
trace_seq_printf(s, "\n");
}
+static const char *no_value = " -";
+
/*
* timerlat_top_print - prints the output of a given CPU
*/
@@ -238,10 +240,7 @@ static void timerlat_top_print(struct os
trace_seq_printf(s, "%3d #%-9d |", cpu, cpu_data->irq_count);
if (!cpu_data->irq_count) {
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - |");
+ trace_seq_printf(s, "%s %s %s %s |", no_value, no_value, no_value, no_value);
} else {
trace_seq_printf(s, "%9llu ", cpu_data->cur_irq / params->output_divisor);
trace_seq_printf(s, "%9llu ", cpu_data->min_irq / params->output_divisor);
@@ -250,10 +249,7 @@ static void timerlat_top_print(struct os
}
if (!cpu_data->thread_count) {
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " -\n");
+ trace_seq_printf(s, "%s %s %s %s", no_value, no_value, no_value, no_value);
} else {
trace_seq_printf(s, "%9llu ", cpu_data->cur_thread / divisor);
trace_seq_printf(s, "%9llu ", cpu_data->min_thread / divisor);
@@ -270,10 +266,7 @@ static void timerlat_top_print(struct os
trace_seq_printf(s, " |");
if (!cpu_data->user_count) {
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " - ");
- trace_seq_printf(s, " -\n");
+ trace_seq_printf(s, "%s %s %s %s\n", no_value, no_value, no_value, no_value);
} else {
trace_seq_printf(s, "%9llu ", cpu_data->cur_user / divisor);
trace_seq_printf(s, "%9llu ", cpu_data->min_user / divisor);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 214/267] rtla/auto-analysis: Replace \t with spaces
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (212 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 213/267] rtla/timerlat: Simplify "no value" printing on top Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 215/267] drm/i915/gt: Disarm breadcrumbs if engines are already idle Greg Kroah-Hartman
` (64 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jonathan Corbet, Juri Lelli,
Daniel Bristot de Oliveira
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Bristot de Oliveira <bristot@kernel.org>
commit a40e5e4dd0207485dee75e2b8e860d5853bcc5f7 upstream.
When copying timerlat auto-analysis from a terminal to some web pages or
chats, the \t are being replaced with a single ' ' or ' ', breaking
the output.
For example:
## CPU 3 hit stop tracing, analyzing it ##
IRQ handler delay: 1.30 us (0.11 %)
IRQ latency: 1.90 us
Timerlat IRQ duration: 3.00 us (0.24 %)
Blocking thread: 1223.16 us (99.00 %)
insync:4048 1223.16 us
IRQ interference 4.93 us (0.40 %)
local_timer:236 4.93 us
------------------------------------------------------------------------
Thread latency: 1235.47 us (100%)
Replace \t with spaces to avoid this problem.
Link: https://lkml.kernel.org/r/ec7ed2b2809c22ab0dfc8eb7c805ab9cddc4254a.1713968967.git.bristot@kernel.org
Cc: stable@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Juri Lelli <juri.lelli@redhat.com>
Fixes: 27e348b221f6 ("rtla/timerlat: Add auto-analysis core")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/tracing/rtla/src/timerlat_aa.c | 111 ++++++++++++++++++++---------------
1 file changed, 64 insertions(+), 47 deletions(-)
--- a/tools/tracing/rtla/src/timerlat_aa.c
+++ b/tools/tracing/rtla/src/timerlat_aa.c
@@ -16,6 +16,9 @@ enum timelat_state {
TIMERLAT_WAITING_THREAD,
};
+/* Used to fill spaces in the output */
+static const char *spaces = " ";
+
#define MAX_COMM 24
/*
@@ -274,14 +277,17 @@ static int timerlat_aa_nmi_handler(struc
taa_data->prev_irq_timstamp = start;
trace_seq_reset(taa_data->prev_irqs_seq);
- trace_seq_printf(taa_data->prev_irqs_seq, "\t%24s \t\t\t%9.2f us\n",
- "nmi", ns_to_usf(duration));
+ trace_seq_printf(taa_data->prev_irqs_seq, " %24s %.*s %9.2f us\n",
+ "nmi",
+ 24, spaces,
+ ns_to_usf(duration));
return 0;
}
taa_data->thread_nmi_sum += duration;
- trace_seq_printf(taa_data->nmi_seq, " %24s \t\t\t%9.2f us\n",
- "nmi", ns_to_usf(duration));
+ trace_seq_printf(taa_data->nmi_seq, " %24s %.*s %9.2f us\n",
+ "nmi",
+ 24, spaces, ns_to_usf(duration));
return 0;
}
@@ -323,8 +329,10 @@ static int timerlat_aa_irq_handler(struc
taa_data->prev_irq_timstamp = start;
trace_seq_reset(taa_data->prev_irqs_seq);
- trace_seq_printf(taa_data->prev_irqs_seq, "\t%24s:%-3llu \t\t%9.2f us\n",
- desc, vector, ns_to_usf(duration));
+ trace_seq_printf(taa_data->prev_irqs_seq, " %24s:%-3llu %.*s %9.2f us\n",
+ desc, vector,
+ 15, spaces,
+ ns_to_usf(duration));
return 0;
}
@@ -372,8 +380,10 @@ static int timerlat_aa_irq_handler(struc
* IRQ interference.
*/
taa_data->thread_irq_sum += duration;
- trace_seq_printf(taa_data->irqs_seq, " %24s:%-3llu \t %9.2f us\n",
- desc, vector, ns_to_usf(duration));
+ trace_seq_printf(taa_data->irqs_seq, " %24s:%-3llu %.*s %9.2f us\n",
+ desc, vector,
+ 24, spaces,
+ ns_to_usf(duration));
return 0;
}
@@ -408,8 +418,10 @@ static int timerlat_aa_softirq_handler(s
taa_data->thread_softirq_sum += duration;
- trace_seq_printf(taa_data->softirqs_seq, "\t%24s:%-3llu \t %9.2f us\n",
- softirq_name[vector], vector, ns_to_usf(duration));
+ trace_seq_printf(taa_data->softirqs_seq, " %24s:%-3llu %.*s %9.2f us\n",
+ softirq_name[vector], vector,
+ 24, spaces,
+ ns_to_usf(duration));
return 0;
}
@@ -452,8 +464,10 @@ static int timerlat_aa_thread_handler(st
} else {
taa_data->thread_thread_sum += duration;
- trace_seq_printf(taa_data->threads_seq, "\t%24s:%-3llu \t\t%9.2f us\n",
- comm, pid, ns_to_usf(duration));
+ trace_seq_printf(taa_data->threads_seq, " %24s:%-12llu %.*s %9.2f us\n",
+ comm, pid,
+ 15, spaces,
+ ns_to_usf(duration));
}
return 0;
@@ -482,7 +496,8 @@ static int timerlat_aa_stack_handler(str
function = tep_find_function(taa_ctx->tool->trace.tep, caller[i]);
if (!function)
break;
- trace_seq_printf(taa_data->stack_seq, "\t\t-> %s\n", function);
+ trace_seq_printf(taa_data->stack_seq, " %.*s -> %s\n",
+ 14, spaces, function);
}
}
return 0;
@@ -568,23 +583,24 @@ static void timerlat_thread_analysis(str
exp_irq_ts = taa_data->timer_irq_start_time - taa_data->timer_irq_start_delay;
if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) {
if (taa_data->prev_irq_timstamp < taa_data->timer_irq_start_time)
- printf(" Previous IRQ interference: \t\t up to %9.2f us\n",
- ns_to_usf(taa_data->prev_irq_duration));
+ printf(" Previous IRQ interference: %.*s up to %9.2f us\n",
+ 16, spaces,
+ ns_to_usf(taa_data->prev_irq_duration));
}
/*
* The delay that the IRQ suffered before starting.
*/
- printf(" IRQ handler delay: %16s %9.2f us (%.2f %%)\n",
- (ns_to_usf(taa_data->timer_exit_from_idle) > 10) ? "(exit from idle)" : "",
- ns_to_usf(taa_data->timer_irq_start_delay),
- ns_to_per(total, taa_data->timer_irq_start_delay));
+ printf(" IRQ handler delay: %.*s %16s %9.2f us (%.2f %%)\n", 16, spaces,
+ (ns_to_usf(taa_data->timer_exit_from_idle) > 10) ? "(exit from idle)" : "",
+ ns_to_usf(taa_data->timer_irq_start_delay),
+ ns_to_per(total, taa_data->timer_irq_start_delay));
/*
* Timerlat IRQ.
*/
- printf(" IRQ latency: \t\t\t\t %9.2f us\n",
- ns_to_usf(taa_data->tlat_irq_latency));
+ printf(" IRQ latency: %.*s %9.2f us\n", 40, spaces,
+ ns_to_usf(taa_data->tlat_irq_latency));
if (irq) {
/*
@@ -595,15 +611,16 @@ static void timerlat_thread_analysis(str
* so it will be displayed, it is the key.
*/
printf(" Blocking thread:\n");
- printf(" %24s:%-9llu\n",
- taa_data->run_thread_comm, taa_data->run_thread_pid);
+ printf(" %.*s %24s:%-9llu\n", 6, spaces, taa_data->run_thread_comm,
+ taa_data->run_thread_pid);
} else {
/*
* The duration of the IRQ handler that handled the timerlat IRQ.
*/
- printf(" Timerlat IRQ duration: \t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->timer_irq_duration),
- ns_to_per(total, taa_data->timer_irq_duration));
+ printf(" Timerlat IRQ duration: %.*s %9.2f us (%.2f %%)\n",
+ 30, spaces,
+ ns_to_usf(taa_data->timer_irq_duration),
+ ns_to_per(total, taa_data->timer_irq_duration));
/*
* The amount of time that the current thread postponed the scheduler.
@@ -611,13 +628,13 @@ static void timerlat_thread_analysis(str
* Recalling that it is net from NMI/IRQ/Softirq interference, so there
* is no need to compute values here.
*/
- printf(" Blocking thread: \t\t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->thread_blocking_duration),
- ns_to_per(total, taa_data->thread_blocking_duration));
-
- printf(" %24s:%-9llu %9.2f us\n",
- taa_data->run_thread_comm, taa_data->run_thread_pid,
- ns_to_usf(taa_data->thread_blocking_duration));
+ printf(" Blocking thread: %.*s %9.2f us (%.2f %%)\n", 36, spaces,
+ ns_to_usf(taa_data->thread_blocking_duration),
+ ns_to_per(total, taa_data->thread_blocking_duration));
+
+ printf(" %.*s %24s:%-9llu %.*s %9.2f us\n", 6, spaces,
+ taa_data->run_thread_comm, taa_data->run_thread_pid,
+ 12, spaces, ns_to_usf(taa_data->thread_blocking_duration));
}
/*
@@ -629,9 +646,9 @@ static void timerlat_thread_analysis(str
* NMIs can happen during the IRQ, so they are always possible.
*/
if (taa_data->thread_nmi_sum)
- printf(" NMI interference \t\t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->thread_nmi_sum),
- ns_to_per(total, taa_data->thread_nmi_sum));
+ printf(" NMI interference %.*s %9.2f us (%.2f %%)\n", 36, spaces,
+ ns_to_usf(taa_data->thread_nmi_sum),
+ ns_to_per(total, taa_data->thread_nmi_sum));
/*
* If it is an IRQ latency, the other factors can be skipped.
@@ -643,9 +660,9 @@ static void timerlat_thread_analysis(str
* Prints the interference caused by IRQs to the thread latency.
*/
if (taa_data->thread_irq_sum) {
- printf(" IRQ interference \t\t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->thread_irq_sum),
- ns_to_per(total, taa_data->thread_irq_sum));
+ printf(" IRQ interference %.*s %9.2f us (%.2f %%)\n", 36, spaces,
+ ns_to_usf(taa_data->thread_irq_sum),
+ ns_to_per(total, taa_data->thread_irq_sum));
trace_seq_do_printf(taa_data->irqs_seq);
}
@@ -654,9 +671,9 @@ static void timerlat_thread_analysis(str
* Prints the interference caused by Softirqs to the thread latency.
*/
if (taa_data->thread_softirq_sum) {
- printf(" Softirq interference \t\t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->thread_softirq_sum),
- ns_to_per(total, taa_data->thread_softirq_sum));
+ printf(" Softirq interference %.*s %9.2f us (%.2f %%)\n", 32, spaces,
+ ns_to_usf(taa_data->thread_softirq_sum),
+ ns_to_per(total, taa_data->thread_softirq_sum));
trace_seq_do_printf(taa_data->softirqs_seq);
}
@@ -670,9 +687,9 @@ static void timerlat_thread_analysis(str
* timer handling latency.
*/
if (taa_data->thread_thread_sum) {
- printf(" Thread interference \t\t\t %9.2f us (%.2f %%)\n",
- ns_to_usf(taa_data->thread_thread_sum),
- ns_to_per(total, taa_data->thread_thread_sum));
+ printf(" Thread interference %.*s %9.2f us (%.2f %%)\n", 33, spaces,
+ ns_to_usf(taa_data->thread_thread_sum),
+ ns_to_per(total, taa_data->thread_thread_sum));
trace_seq_do_printf(taa_data->threads_seq);
}
@@ -682,8 +699,8 @@ static void timerlat_thread_analysis(str
*/
print_total:
printf("------------------------------------------------------------------------\n");
- printf(" %s latency: \t\t\t %9.2f us (100%%)\n", irq ? "IRQ" : "Thread",
- ns_to_usf(total));
+ printf(" %s latency: %.*s %9.2f us (100%%)\n", irq ? " IRQ" : "Thread",
+ 37, spaces, ns_to_usf(total));
}
static int timerlat_auto_analysis_collect_trace(struct timerlat_aa_context *taa_ctx)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 215/267] drm/i915/gt: Disarm breadcrumbs if engines are already idle
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (213 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 214/267] rtla/auto-analysis: Replace \t with spaces Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 216/267] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE) Greg Kroah-Hartman
` (63 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Chris Wilson, Andrzej Hajda,
Janusz Krzysztofik, Nirmoy Das, Andi Shyti, Jani Nikula
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Wilson <chris@chris-wilson.co.uk>
commit 70cb9188ffc75e643debf292fcddff36c9dbd4ae upstream.
The breadcrumbs use a GT wakeref for guarding the interrupt, but are
disarmed during release of the engine wakeref. This leaves a hole where
we may attach a breadcrumb just as the engine is parking (after it has
parked its breadcrumbs), execute the irq worker with some signalers still
attached, but never be woken again.
That issue manifests itself in CI with IGT runner timeouts while tests
are waiting indefinitely for release of all GT wakerefs.
<6> [209.151778] i915: Running live_engine_pm_selftests/live_engine_busy_stats
<7> [209.231628] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_5
<7> [209.231816] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_4
<7> [209.231944] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_3
<7> [209.232056] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling PW_2
<7> [209.232166] i915 0000:00:02.0: [drm:intel_power_well_disable [i915]] disabling DC_off
<7> [209.232270] i915 0000:00:02.0: [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [209.232368] i915 0000:00:02.0: [drm:gen9_set_dc_state.part.0 [i915]] Setting DC state from 00 to 02
<4> [299.356116] [IGT] Inactivity timeout exceeded. Killing the current test with SIGQUIT.
...
<6> [299.356526] sysrq: Show State
...
<6> [299.373964] task:i915_selftest state:D stack:11784 pid:5578 tgid:5578 ppid:873 flags:0x00004002
<6> [299.373967] Call Trace:
<6> [299.373968] <TASK>
<6> [299.373970] __schedule+0x3bb/0xda0
<6> [299.373974] schedule+0x41/0x110
<6> [299.373976] intel_wakeref_wait_for_idle+0x82/0x100 [i915]
<6> [299.374083] ? __pfx_var_wake_function+0x10/0x10
<6> [299.374087] live_engine_busy_stats+0x9b/0x500 [i915]
<6> [299.374173] __i915_subtests+0xbe/0x240 [i915]
<6> [299.374277] ? __pfx___intel_gt_live_setup+0x10/0x10 [i915]
<6> [299.374369] ? __pfx___intel_gt_live_teardown+0x10/0x10 [i915]
<6> [299.374456] intel_engine_live_selftests+0x1c/0x30 [i915]
<6> [299.374547] __run_selftests+0xbb/0x190 [i915]
<6> [299.374635] i915_live_selftests+0x4b/0x90 [i915]
<6> [299.374717] i915_pci_probe+0x10d/0x210 [i915]
At the end of the interrupt worker, if there are no more engines awake,
disarm the breadcrumb and go to sleep.
Fixes: 9d5612ca165a ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10026
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240423165505.465734-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit fbad43eccae5cb14594195c20113369aabaa22b5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -258,8 +258,13 @@ static void signal_irq_work(struct irq_w
i915_request_put(rq);
}
+ /* Lazy irq enabling after HW submission */
if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers))
intel_breadcrumbs_arm_irq(b);
+
+ /* And confirm that we still want irqs enabled before we yield */
+ if (READ_ONCE(b->irq_armed) && !atomic_read(&b->active))
+ intel_breadcrumbs_disarm_irq(b);
}
struct intel_breadcrumbs *
@@ -310,13 +315,7 @@ void __intel_breadcrumbs_park(struct int
return;
/* Kick the work once more to drain the signalers, and disarm the irq */
- irq_work_sync(&b->irq_work);
- while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) {
- local_irq_disable();
- signal_irq_work(&b->irq_work);
- local_irq_enable();
- cond_resched();
- }
+ irq_work_queue(&b->irq_work);
}
void intel_breadcrumbs_free(struct kref *kref)
@@ -399,7 +398,7 @@ static void insert_breadcrumb(struct i91
* the request as it may have completed and raised the interrupt as
* we were attaching it into the lists.
*/
- if (!b->irq_armed || __i915_request_is_complete(rq))
+ if (!READ_ONCE(b->irq_armed) || __i915_request_is_complete(rq))
irq_work_queue(&b->irq_work);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 216/267] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (214 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 215/267] drm/i915/gt: Disarm breadcrumbs if engines are already idle Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 217/267] drm/i915/dpt: Make DPT object unshrinkable Greg Kroah-Hartman
` (62 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Noralf Trønnes, Eric Anholt,
Rob Herring, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
David Airlie, Daniel Vetter, dri-devel, Wachowski, Karol,
Jacek Lawrynowicz, Daniel Vetter
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wachowski, Karol <karol.wachowski@intel.com>
commit 39bc27bd688066a63e56f7f64ad34fae03fbe3b8 upstream.
Lack of check for copy-on-write (COW) mapping in drm_gem_shmem_mmap
allows users to call mmap with PROT_WRITE and MAP_PRIVATE flag
causing a kernel panic due to BUG_ON in vmf_insert_pfn_prot:
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
Return -EINVAL early if COW mapping is detected.
This bug affects all drm drivers using default shmem helpers.
It can be reproduced by this simple example:
void *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
ptr[0] = 0;
Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Eric Anholt <eric@anholt.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.2+
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240520100514.925681-1-jacek.lawrynowicz@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -610,6 +610,9 @@ int drm_gem_shmem_mmap(struct drm_gem_sh
return ret;
}
+ if (is_cow_mapping(vma->vm_flags))
+ return -EINVAL;
+
dma_resv_lock(shmem->base.resv, NULL);
ret = drm_gem_shmem_get_pages(shmem);
dma_resv_unlock(shmem->base.resv);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 217/267] drm/i915/dpt: Make DPT object unshrinkable
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (215 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 216/267] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE) Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 218/267] drm/i915: Fix audio component initialization Greg Kroah-Hartman
` (61 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Shawn Lee, Vidya Srinivas,
Ville Syrjälä, Jani Nikula
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vidya Srinivas <vidya.srinivas@intel.com>
commit 43e2b37e2ab660c3565d4cff27922bc70e79c3f1 upstream.
In some scenarios, the DPT object gets shrunk but
the actual framebuffer did not and thus its still
there on the DPT's vm->bound_list. Then it tries to
rewrite the PTEs via a stale CPU mapping. This causes panic.
Cc: stable@vger.kernel.org
Reported-by: Shawn Lee <shawn.c.lee@intel.com>
Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt")
Signed-off-by: Vidya Srinivas <vidya.srinivas@intel.com>
[vsyrjala: Add TODO comment]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240520165634.1162470-1-vidya.srinivas@intel.com
(cherry picked from commit 51064d471c53dcc8eddd2333c3f1c1d9131ba36c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -285,7 +285,9 @@ bool i915_gem_object_has_iomem(const str
static inline bool
i915_gem_object_is_shrinkable(const struct drm_i915_gem_object *obj)
{
- return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE);
+ /* TODO: make DPT shrinkable when it has no bound vmas */
+ return i915_gem_object_type_has(obj, I915_GEM_OBJECT_IS_SHRINKABLE) &&
+ !obj->is_dpt;
}
static inline bool
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 218/267] drm/i915: Fix audio component initialization
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (216 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 217/267] drm/i915/dpt: Make DPT object unshrinkable Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 219/267] intel_th: pci: Add Granite Rapids support Greg Kroah-Hartman
` (60 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Imre Deak, Jani Nikula
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Imre Deak <imre.deak@intel.com>
commit 75800e2e4203ea83bbc9d4f63ad97ea582244a08 upstream.
After registering the audio component in i915_audio_component_init()
the audio driver may call i915_audio_component_get_power() via the
component ops. This could program AUD_FREQ_CNTRL with an uninitialized
value if the latter function is called before display.audio.freq_cntrl
gets initialized. The get_power() function also does a modeset which in
the above case happens too early before the initialization step and
triggers the
"Reject display access from task"
error message added by the Fixes: commit below.
Fix the above issue by registering the audio component only after the
initialization step.
Fixes: 87c1694533c9 ("drm/i915: save AUD_FREQ_CNTRL state at audio domain suspend")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10291
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240521143022.3784539-1-imre.deak@intel.com
(cherry picked from commit fdd0b80172758ce284f19fa8a26d90c61e4371d2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/gpu/drm/i915/display/intel_audio.c | 32 +++++++++++++-------
drivers/gpu/drm/i915/display/intel_audio.h | 1
drivers/gpu/drm/i915/display/intel_display_driver.c | 2 +
3 files changed, 24 insertions(+), 11 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_audio.c
+++ b/drivers/gpu/drm/i915/display/intel_audio.c
@@ -1251,17 +1251,6 @@ static const struct component_ops i915_a
static void i915_audio_component_init(struct drm_i915_private *i915)
{
u32 aud_freq, aud_freq_init;
- int ret;
-
- ret = component_add_typed(i915->drm.dev,
- &i915_audio_component_bind_ops,
- I915_COMPONENT_AUDIO);
- if (ret < 0) {
- drm_err(&i915->drm,
- "failed to add audio component (%d)\n", ret);
- /* continue with reduced functionality */
- return;
- }
if (DISPLAY_VER(i915) >= 9) {
aud_freq_init = intel_de_read(i915, AUD_FREQ_CNTRL);
@@ -1284,6 +1273,21 @@ static void i915_audio_component_init(st
/* init with current cdclk */
intel_audio_cdclk_change_post(i915);
+}
+
+static void i915_audio_component_register(struct drm_i915_private *i915)
+{
+ int ret;
+
+ ret = component_add_typed(i915->drm.dev,
+ &i915_audio_component_bind_ops,
+ I915_COMPONENT_AUDIO);
+ if (ret < 0) {
+ drm_err(&i915->drm,
+ "failed to add audio component (%d)\n", ret);
+ /* continue with reduced functionality */
+ return;
+ }
i915->display.audio.component_registered = true;
}
@@ -1316,6 +1320,12 @@ void intel_audio_init(struct drm_i915_pr
i915_audio_component_init(i915);
}
+void intel_audio_register(struct drm_i915_private *i915)
+{
+ if (!i915->display.audio.lpe.platdev)
+ i915_audio_component_register(i915);
+}
+
/**
* intel_audio_deinit() - deinitialize the audio driver
* @i915: the i915 drm device private data
--- a/drivers/gpu/drm/i915/display/intel_audio.h
+++ b/drivers/gpu/drm/i915/display/intel_audio.h
@@ -28,6 +28,7 @@ void intel_audio_codec_get_config(struct
void intel_audio_cdclk_change_pre(struct drm_i915_private *dev_priv);
void intel_audio_cdclk_change_post(struct drm_i915_private *dev_priv);
void intel_audio_init(struct drm_i915_private *dev_priv);
+void intel_audio_register(struct drm_i915_private *i915);
void intel_audio_deinit(struct drm_i915_private *dev_priv);
void intel_audio_sdp_split_update(struct intel_encoder *encoder,
const struct intel_crtc_state *crtc_state);
--- a/drivers/gpu/drm/i915/display/intel_display_driver.c
+++ b/drivers/gpu/drm/i915/display/intel_display_driver.c
@@ -386,6 +386,8 @@ void intel_display_driver_register(struc
intel_audio_init(i915);
+ intel_audio_register(i915);
+
intel_display_debugfs_register(i915);
/*
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 219/267] intel_th: pci: Add Granite Rapids support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (217 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 218/267] drm/i915: Fix audio component initialization Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 220/267] intel_th: pci: Add Granite Rapids SOC support Greg Kroah-Hartman
` (59 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexander Shishkin, Andy Shevchenko,
stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
commit e44937889bdf4ecd1f0c25762b7226406b9b7a69 upstream.
Add support for the Trace Hub in Granite Rapids.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20240429130119.1518073-11-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -305,6 +305,11 @@ static const struct pci_device_id intel_
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Granite Rapids */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x0963),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Alder Lake CPU */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
.driver_data = (kernel_ulong_t)&intel_th_2x,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 220/267] intel_th: pci: Add Granite Rapids SOC support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (218 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 219/267] intel_th: pci: Add Granite Rapids support Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 221/267] intel_th: pci: Add Sapphire " Greg Kroah-Hartman
` (58 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexander Shishkin, Andy Shevchenko,
stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
commit 854afe461b009801a171b3a49c5f75ea43e4c04c upstream.
Add support for the Trace Hub in Granite Rapids SOC.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20240429130119.1518073-12-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -310,6 +310,11 @@ static const struct pci_device_id intel_
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Granite Rapids SOC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3256),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Alder Lake CPU */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
.driver_data = (kernel_ulong_t)&intel_th_2x,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 221/267] intel_th: pci: Add Sapphire Rapids SOC support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (219 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 220/267] intel_th: pci: Add Granite Rapids SOC support Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 222/267] intel_th: pci: Add Meteor Lake-S support Greg Kroah-Hartman
` (57 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexander Shishkin, Andy Shevchenko,
stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
commit 2e1da7efabe05cb0cf0b358883b2bc89080ed0eb upstream.
Add support for the Trace Hub in Sapphire Rapids SOC.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20240429130119.1518073-13-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -315,6 +315,11 @@ static const struct pci_device_id intel_
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Sapphire Rapids SOC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x3456),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Alder Lake CPU */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
.driver_data = (kernel_ulong_t)&intel_th_2x,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 222/267] intel_th: pci: Add Meteor Lake-S support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (220 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 221/267] intel_th: pci: Add Sapphire " Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 223/267] intel_th: pci: Add Lunar Lake support Greg Kroah-Hartman
` (56 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexander Shishkin, Andy Shevchenko,
stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
commit c4a30def564d75e84718b059d1a62cc79b137cf9 upstream.
Add support for the Trace Hub in Meteor Lake-S.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20240429130119.1518073-14-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -295,6 +295,11 @@ static const struct pci_device_id intel_
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Meteor Lake-S */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7f26),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Raptor Lake-S */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x7a26),
.driver_data = (kernel_ulong_t)&intel_th_2x,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 223/267] intel_th: pci: Add Lunar Lake support
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (221 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 222/267] intel_th: pci: Add Meteor Lake-S support Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 224/267] pmdomain: ti-sci: Fix duplicate PD referrals Greg Kroah-Hartman
` (55 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Alexander Shishkin, Andy Shevchenko,
stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
commit f866b65322bfbc8fcca13c25f49e1a5c5a93ae4d upstream.
Add support for the Trace Hub in Lunar Lake.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/r/20240429130119.1518073-16-alexander.shishkin@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -325,6 +325,11 @@ static const struct pci_device_id intel_
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Lunar Lake */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa824),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Alder Lake CPU */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x466f),
.driver_data = (kernel_ulong_t)&intel_th_2x,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 224/267] pmdomain: ti-sci: Fix duplicate PD referrals
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (222 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 223/267] intel_th: pci: Add Lunar Lake support Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 225/267] btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info Greg Kroah-Hartman
` (54 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Tomi Valkeinen, Ulf Hansson
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
commit 670c900f69645db394efb38934b3344d8804171a upstream.
When the dts file has multiple referrers to a single PD (e.g.
simple-framebuffer and dss nodes both point to the DSS power-domain) the
ti-sci driver will create two power domains, both with the same ID, and
that will cause problems as one of the power domains will hide the other
one.
Fix this checking if a PD with the ID has already been created, and only
create a PD for new IDs.
Fixes: efa5c01cd7ee ("soc: ti: ti_sci_pm_domains: switch to use multiple genpds instead of one")
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240415-ti-sci-pd-v1-1-a0e56b8ad897@ideasonboard.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/pmdomain/ti/ti_sci_pm_domains.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
--- a/drivers/pmdomain/ti/ti_sci_pm_domains.c
+++ b/drivers/pmdomain/ti/ti_sci_pm_domains.c
@@ -114,6 +114,18 @@ static const struct of_device_id ti_sci_
};
MODULE_DEVICE_TABLE(of, ti_sci_pm_domain_matches);
+static bool ti_sci_pm_idx_exists(struct ti_sci_genpd_provider *pd_provider, u32 idx)
+{
+ struct ti_sci_pm_domain *pd;
+
+ list_for_each_entry(pd, &pd_provider->pd_list, node) {
+ if (pd->idx == idx)
+ return true;
+ }
+
+ return false;
+}
+
static int ti_sci_pm_domain_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -149,8 +161,14 @@ static int ti_sci_pm_domain_probe(struct
break;
if (args.args_count >= 1 && args.np == dev->of_node) {
- if (args.args[0] > max_id)
+ if (args.args[0] > max_id) {
max_id = args.args[0];
+ } else {
+ if (ti_sci_pm_idx_exists(pd_provider, args.args[0])) {
+ index++;
+ continue;
+ }
+ }
pd = devm_kzalloc(dev, sizeof(*pd), GFP_KERNEL);
if (!pd)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 225/267] btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (223 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 224/267] pmdomain: ti-sci: Fix duplicate PD referrals Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 226/267] btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info Greg Kroah-Hartman
` (53 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn,
Christoph Hellwig, David Sterba
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig <hch@lst.de>
commit 15c12fcc50a1b12a747f8b6ec05cdb18c537a4d1 upstream.
Add a new zone_info structure to hold per-zone information in
btrfs_load_block_group_zone_info and prepare for breaking out helpers
from it.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/zoned.c | 84 ++++++++++++++++++++++++-------------------------------
1 file changed, 37 insertions(+), 47 deletions(-)
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1282,6 +1282,12 @@ out:
return ret;
}
+struct zone_info {
+ u64 physical;
+ u64 capacity;
+ u64 alloc_offset;
+};
+
int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
@@ -1291,12 +1297,10 @@ int btrfs_load_block_group_zone_info(str
struct btrfs_device *device;
u64 logical = cache->start;
u64 length = cache->length;
+ struct zone_info *zone_info = NULL;
int ret;
int i;
unsigned int nofs_flag;
- u64 *alloc_offsets = NULL;
- u64 *caps = NULL;
- u64 *physical = NULL;
unsigned long *active = NULL;
u64 last_alloc = 0;
u32 num_sequential = 0, num_conventional = 0;
@@ -1328,20 +1332,8 @@ int btrfs_load_block_group_zone_info(str
goto out;
}
- alloc_offsets = kcalloc(map->num_stripes, sizeof(*alloc_offsets), GFP_NOFS);
- if (!alloc_offsets) {
- ret = -ENOMEM;
- goto out;
- }
-
- caps = kcalloc(map->num_stripes, sizeof(*caps), GFP_NOFS);
- if (!caps) {
- ret = -ENOMEM;
- goto out;
- }
-
- physical = kcalloc(map->num_stripes, sizeof(*physical), GFP_NOFS);
- if (!physical) {
+ zone_info = kcalloc(map->num_stripes, sizeof(*zone_info), GFP_NOFS);
+ if (!zone_info) {
ret = -ENOMEM;
goto out;
}
@@ -1353,20 +1345,21 @@ int btrfs_load_block_group_zone_info(str
}
for (i = 0; i < map->num_stripes; i++) {
+ struct zone_info *info = &zone_info[i];
bool is_sequential;
struct blk_zone zone;
struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
int dev_replace_is_ongoing = 0;
device = map->stripes[i].dev;
- physical[i] = map->stripes[i].physical;
+ info->physical = map->stripes[i].physical;
if (device->bdev == NULL) {
- alloc_offsets[i] = WP_MISSING_DEV;
+ info->alloc_offset = WP_MISSING_DEV;
continue;
}
- is_sequential = btrfs_dev_is_sequential(device, physical[i]);
+ is_sequential = btrfs_dev_is_sequential(device, info->physical);
if (is_sequential)
num_sequential++;
else
@@ -1380,7 +1373,7 @@ int btrfs_load_block_group_zone_info(str
__set_bit(i, active);
if (!is_sequential) {
- alloc_offsets[i] = WP_CONVENTIONAL;
+ info->alloc_offset = WP_CONVENTIONAL;
continue;
}
@@ -1388,25 +1381,25 @@ int btrfs_load_block_group_zone_info(str
* This zone will be used for allocation, so mark this zone
* non-empty.
*/
- btrfs_dev_clear_zone_empty(device, physical[i]);
+ btrfs_dev_clear_zone_empty(device, info->physical);
down_read(&dev_replace->rwsem);
dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL)
- btrfs_dev_clear_zone_empty(dev_replace->tgtdev, physical[i]);
+ btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical);
up_read(&dev_replace->rwsem);
/*
* The group is mapped to a sequential zone. Get the zone write
* pointer to determine the allocation offset within the zone.
*/
- WARN_ON(!IS_ALIGNED(physical[i], fs_info->zone_size));
+ WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size));
nofs_flag = memalloc_nofs_save();
- ret = btrfs_get_dev_zone(device, physical[i], &zone);
+ ret = btrfs_get_dev_zone(device, info->physical, &zone);
memalloc_nofs_restore(nofs_flag);
if (ret == -EIO || ret == -EOPNOTSUPP) {
ret = 0;
- alloc_offsets[i] = WP_MISSING_DEV;
+ info->alloc_offset = WP_MISSING_DEV;
continue;
} else if (ret) {
goto out;
@@ -1421,27 +1414,26 @@ int btrfs_load_block_group_zone_info(str
goto out;
}
- caps[i] = (zone.capacity << SECTOR_SHIFT);
+ info->capacity = (zone.capacity << SECTOR_SHIFT);
switch (zone.cond) {
case BLK_ZONE_COND_OFFLINE:
case BLK_ZONE_COND_READONLY:
btrfs_err(fs_info,
"zoned: offline/readonly zone %llu on device %s (devid %llu)",
- physical[i] >> device->zone_info->zone_size_shift,
+ info->physical >> device->zone_info->zone_size_shift,
rcu_str_deref(device->name), device->devid);
- alloc_offsets[i] = WP_MISSING_DEV;
+ info->alloc_offset = WP_MISSING_DEV;
break;
case BLK_ZONE_COND_EMPTY:
- alloc_offsets[i] = 0;
+ info->alloc_offset = 0;
break;
case BLK_ZONE_COND_FULL:
- alloc_offsets[i] = caps[i];
+ info->alloc_offset = info->capacity;
break;
default:
/* Partially used zone */
- alloc_offsets[i] =
- ((zone.wp - zone.start) << SECTOR_SHIFT);
+ info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT);
__set_bit(i, active);
break;
}
@@ -1468,15 +1460,15 @@ int btrfs_load_block_group_zone_info(str
switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
case 0: /* single */
- if (alloc_offsets[0] == WP_MISSING_DEV) {
+ if (zone_info[0].alloc_offset == WP_MISSING_DEV) {
btrfs_err(fs_info,
"zoned: cannot recover write pointer for zone %llu",
- physical[0]);
+ zone_info[0].physical);
ret = -EIO;
goto out;
}
- cache->alloc_offset = alloc_offsets[0];
- cache->zone_capacity = caps[0];
+ cache->alloc_offset = zone_info[0].alloc_offset;
+ cache->zone_capacity = zone_info[0].capacity;
if (test_bit(0, active))
set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags);
break;
@@ -1486,21 +1478,21 @@ int btrfs_load_block_group_zone_info(str
ret = -EINVAL;
goto out;
}
- if (alloc_offsets[0] == WP_MISSING_DEV) {
+ if (zone_info[0].alloc_offset == WP_MISSING_DEV) {
btrfs_err(fs_info,
"zoned: cannot recover write pointer for zone %llu",
- physical[0]);
+ zone_info[0].physical);
ret = -EIO;
goto out;
}
- if (alloc_offsets[1] == WP_MISSING_DEV) {
+ if (zone_info[1].alloc_offset == WP_MISSING_DEV) {
btrfs_err(fs_info,
"zoned: cannot recover write pointer for zone %llu",
- physical[1]);
+ zone_info[1].physical);
ret = -EIO;
goto out;
}
- if (alloc_offsets[0] != alloc_offsets[1]) {
+ if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) {
btrfs_err(fs_info,
"zoned: write pointer offset mismatch of zones in DUP profile");
ret = -EIO;
@@ -1516,8 +1508,8 @@ int btrfs_load_block_group_zone_info(str
set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE,
&cache->runtime_flags);
}
- cache->alloc_offset = alloc_offsets[0];
- cache->zone_capacity = min(caps[0], caps[1]);
+ cache->alloc_offset = zone_info[0].alloc_offset;
+ cache->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity);
break;
case BTRFS_BLOCK_GROUP_RAID1:
case BTRFS_BLOCK_GROUP_RAID0:
@@ -1570,9 +1562,7 @@ out:
cache->physical_map = NULL;
}
bitmap_free(active);
- kfree(physical);
- kfree(caps);
- kfree(alloc_offsets);
+ kfree(zone_info);
free_extent_map(em);
return ret;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 226/267] btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (224 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 225/267] btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 227/267] btrfs: zoned: factor out single bg handling " Greg Kroah-Hartman
` (52 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn,
Christoph Hellwig, David Sterba
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig <hch@lst.de>
commit 09a46725cc84165af452d978a3532d6b97a28796 upstream.
Split out a helper for the body of the per-zone loop in
btrfs_load_block_group_zone_info to make the function easier to read and
modify.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/zoned.c | 184 +++++++++++++++++++++++++++----------------------------
1 file changed, 92 insertions(+), 92 deletions(-)
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1288,19 +1288,103 @@ struct zone_info {
u64 alloc_offset;
};
+static int btrfs_load_zone_info(struct btrfs_fs_info *fs_info, int zone_idx,
+ struct zone_info *info, unsigned long *active,
+ struct map_lookup *map)
+{
+ struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
+ struct btrfs_device *device = map->stripes[zone_idx].dev;
+ int dev_replace_is_ongoing = 0;
+ unsigned int nofs_flag;
+ struct blk_zone zone;
+ int ret;
+
+ info->physical = map->stripes[zone_idx].physical;
+
+ if (!device->bdev) {
+ info->alloc_offset = WP_MISSING_DEV;
+ return 0;
+ }
+
+ /* Consider a zone as active if we can allow any number of active zones. */
+ if (!device->zone_info->max_active_zones)
+ __set_bit(zone_idx, active);
+
+ if (!btrfs_dev_is_sequential(device, info->physical)) {
+ info->alloc_offset = WP_CONVENTIONAL;
+ return 0;
+ }
+
+ /* This zone will be used for allocation, so mark this zone non-empty. */
+ btrfs_dev_clear_zone_empty(device, info->physical);
+
+ down_read(&dev_replace->rwsem);
+ dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
+ if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL)
+ btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical);
+ up_read(&dev_replace->rwsem);
+
+ /*
+ * The group is mapped to a sequential zone. Get the zone write pointer
+ * to determine the allocation offset within the zone.
+ */
+ WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size));
+ nofs_flag = memalloc_nofs_save();
+ ret = btrfs_get_dev_zone(device, info->physical, &zone);
+ memalloc_nofs_restore(nofs_flag);
+ if (ret) {
+ if (ret != -EIO && ret != -EOPNOTSUPP)
+ return ret;
+ info->alloc_offset = WP_MISSING_DEV;
+ return 0;
+ }
+
+ if (zone.type == BLK_ZONE_TYPE_CONVENTIONAL) {
+ btrfs_err_in_rcu(fs_info,
+ "zoned: unexpected conventional zone %llu on device %s (devid %llu)",
+ zone.start << SECTOR_SHIFT, rcu_str_deref(device->name),
+ device->devid);
+ return -EIO;
+ }
+
+ info->capacity = (zone.capacity << SECTOR_SHIFT);
+
+ switch (zone.cond) {
+ case BLK_ZONE_COND_OFFLINE:
+ case BLK_ZONE_COND_READONLY:
+ btrfs_err(fs_info,
+ "zoned: offline/readonly zone %llu on device %s (devid %llu)",
+ (info->physical >> device->zone_info->zone_size_shift),
+ rcu_str_deref(device->name), device->devid);
+ info->alloc_offset = WP_MISSING_DEV;
+ break;
+ case BLK_ZONE_COND_EMPTY:
+ info->alloc_offset = 0;
+ break;
+ case BLK_ZONE_COND_FULL:
+ info->alloc_offset = info->capacity;
+ break;
+ default:
+ /* Partially used zone. */
+ info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT);
+ __set_bit(zone_idx, active);
+ break;
+ }
+
+ return 0;
+}
+
int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
struct extent_map_tree *em_tree = &fs_info->mapping_tree;
struct extent_map *em;
struct map_lookup *map;
- struct btrfs_device *device;
u64 logical = cache->start;
u64 length = cache->length;
struct zone_info *zone_info = NULL;
int ret;
int i;
- unsigned int nofs_flag;
unsigned long *active = NULL;
u64 last_alloc = 0;
u32 num_sequential = 0, num_conventional = 0;
@@ -1345,98 +1429,14 @@ int btrfs_load_block_group_zone_info(str
}
for (i = 0; i < map->num_stripes; i++) {
- struct zone_info *info = &zone_info[i];
- bool is_sequential;
- struct blk_zone zone;
- struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
- int dev_replace_is_ongoing = 0;
-
- device = map->stripes[i].dev;
- info->physical = map->stripes[i].physical;
-
- if (device->bdev == NULL) {
- info->alloc_offset = WP_MISSING_DEV;
- continue;
- }
-
- is_sequential = btrfs_dev_is_sequential(device, info->physical);
- if (is_sequential)
- num_sequential++;
- else
- num_conventional++;
-
- /*
- * Consider a zone as active if we can allow any number of
- * active zones.
- */
- if (!device->zone_info->max_active_zones)
- __set_bit(i, active);
-
- if (!is_sequential) {
- info->alloc_offset = WP_CONVENTIONAL;
- continue;
- }
-
- /*
- * This zone will be used for allocation, so mark this zone
- * non-empty.
- */
- btrfs_dev_clear_zone_empty(device, info->physical);
-
- down_read(&dev_replace->rwsem);
- dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
- if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL)
- btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical);
- up_read(&dev_replace->rwsem);
-
- /*
- * The group is mapped to a sequential zone. Get the zone write
- * pointer to determine the allocation offset within the zone.
- */
- WARN_ON(!IS_ALIGNED(info->physical, fs_info->zone_size));
- nofs_flag = memalloc_nofs_save();
- ret = btrfs_get_dev_zone(device, info->physical, &zone);
- memalloc_nofs_restore(nofs_flag);
- if (ret == -EIO || ret == -EOPNOTSUPP) {
- ret = 0;
- info->alloc_offset = WP_MISSING_DEV;
- continue;
- } else if (ret) {
+ ret = btrfs_load_zone_info(fs_info, i, &zone_info[i], active, map);
+ if (ret)
goto out;
- }
-
- if (zone.type == BLK_ZONE_TYPE_CONVENTIONAL) {
- btrfs_err_in_rcu(fs_info,
- "zoned: unexpected conventional zone %llu on device %s (devid %llu)",
- zone.start << SECTOR_SHIFT,
- rcu_str_deref(device->name), device->devid);
- ret = -EIO;
- goto out;
- }
-
- info->capacity = (zone.capacity << SECTOR_SHIFT);
- switch (zone.cond) {
- case BLK_ZONE_COND_OFFLINE:
- case BLK_ZONE_COND_READONLY:
- btrfs_err(fs_info,
- "zoned: offline/readonly zone %llu on device %s (devid %llu)",
- info->physical >> device->zone_info->zone_size_shift,
- rcu_str_deref(device->name), device->devid);
- info->alloc_offset = WP_MISSING_DEV;
- break;
- case BLK_ZONE_COND_EMPTY:
- info->alloc_offset = 0;
- break;
- case BLK_ZONE_COND_FULL:
- info->alloc_offset = info->capacity;
- break;
- default:
- /* Partially used zone */
- info->alloc_offset = ((zone.wp - zone.start) << SECTOR_SHIFT);
- __set_bit(i, active);
- break;
- }
+ if (zone_info[i].alloc_offset == WP_CONVENTIONAL)
+ num_conventional++;
+ else
+ num_sequential++;
}
if (num_sequential > 0)
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 227/267] btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (225 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 226/267] btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 228/267] btrfs: zoned: factor out DUP " Greg Kroah-Hartman
` (51 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn,
Christoph Hellwig, David Sterba
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig <hch@lst.de>
commit 9e0e3e74dc6928a0956f4e27e24d473c65887e96 upstream.
Split the code handling a type single block group from
btrfs_load_block_group_zone_info to make the code more readable.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/zoned.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1374,6 +1374,24 @@ static int btrfs_load_zone_info(struct b
return 0;
}
+static int btrfs_load_block_group_single(struct btrfs_block_group *bg,
+ struct zone_info *info,
+ unsigned long *active)
+{
+ if (info->alloc_offset == WP_MISSING_DEV) {
+ btrfs_err(bg->fs_info,
+ "zoned: cannot recover write pointer for zone %llu",
+ info->physical);
+ return -EIO;
+ }
+
+ bg->alloc_offset = info->alloc_offset;
+ bg->zone_capacity = info->capacity;
+ if (test_bit(0, active))
+ set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &bg->runtime_flags);
+ return 0;
+}
+
int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
@@ -1460,17 +1478,7 @@ int btrfs_load_block_group_zone_info(str
switch (map->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) {
case 0: /* single */
- if (zone_info[0].alloc_offset == WP_MISSING_DEV) {
- btrfs_err(fs_info,
- "zoned: cannot recover write pointer for zone %llu",
- zone_info[0].physical);
- ret = -EIO;
- goto out;
- }
- cache->alloc_offset = zone_info[0].alloc_offset;
- cache->zone_capacity = zone_info[0].capacity;
- if (test_bit(0, active))
- set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags);
+ ret = btrfs_load_block_group_single(cache, &zone_info[0], active);
break;
case BTRFS_BLOCK_GROUP_DUP:
if (map->type & BTRFS_BLOCK_GROUP_DATA) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 228/267] btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (226 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 227/267] btrfs: zoned: factor out single bg handling " Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 229/267] btrfs: zoned: fix use-after-free due to race with dev replace Greg Kroah-Hartman
` (50 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn,
Christoph Hellwig, David Sterba
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoph Hellwig <hch@lst.de>
commit 87463f7e0250d471fac41e7c9c45ae21d83b5f85 upstream.
Split the code handling a type DUP block group from
btrfs_load_block_group_zone_info to make the code more readable.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/zoned.c | 79 +++++++++++++++++++++++++++++--------------------------
1 file changed, 42 insertions(+), 37 deletions(-)
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1392,6 +1392,47 @@ static int btrfs_load_block_group_single
return 0;
}
+static int btrfs_load_block_group_dup(struct btrfs_block_group *bg,
+ struct map_lookup *map,
+ struct zone_info *zone_info,
+ unsigned long *active)
+{
+ if (map->type & BTRFS_BLOCK_GROUP_DATA) {
+ btrfs_err(bg->fs_info,
+ "zoned: profile DUP not yet supported on data bg");
+ return -EINVAL;
+ }
+
+ if (zone_info[0].alloc_offset == WP_MISSING_DEV) {
+ btrfs_err(bg->fs_info,
+ "zoned: cannot recover write pointer for zone %llu",
+ zone_info[0].physical);
+ return -EIO;
+ }
+ if (zone_info[1].alloc_offset == WP_MISSING_DEV) {
+ btrfs_err(bg->fs_info,
+ "zoned: cannot recover write pointer for zone %llu",
+ zone_info[1].physical);
+ return -EIO;
+ }
+ if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) {
+ btrfs_err(bg->fs_info,
+ "zoned: write pointer offset mismatch of zones in DUP profile");
+ return -EIO;
+ }
+
+ if (test_bit(0, active) != test_bit(1, active)) {
+ if (!btrfs_zone_activate(bg))
+ return -EIO;
+ } else if (test_bit(0, active)) {
+ set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &bg->runtime_flags);
+ }
+
+ bg->alloc_offset = zone_info[0].alloc_offset;
+ bg->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity);
+ return 0;
+}
+
int btrfs_load_block_group_zone_info(struct btrfs_block_group *cache, bool new)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
@@ -1481,43 +1522,7 @@ int btrfs_load_block_group_zone_info(str
ret = btrfs_load_block_group_single(cache, &zone_info[0], active);
break;
case BTRFS_BLOCK_GROUP_DUP:
- if (map->type & BTRFS_BLOCK_GROUP_DATA) {
- btrfs_err(fs_info, "zoned: profile DUP not yet supported on data bg");
- ret = -EINVAL;
- goto out;
- }
- if (zone_info[0].alloc_offset == WP_MISSING_DEV) {
- btrfs_err(fs_info,
- "zoned: cannot recover write pointer for zone %llu",
- zone_info[0].physical);
- ret = -EIO;
- goto out;
- }
- if (zone_info[1].alloc_offset == WP_MISSING_DEV) {
- btrfs_err(fs_info,
- "zoned: cannot recover write pointer for zone %llu",
- zone_info[1].physical);
- ret = -EIO;
- goto out;
- }
- if (zone_info[0].alloc_offset != zone_info[1].alloc_offset) {
- btrfs_err(fs_info,
- "zoned: write pointer offset mismatch of zones in DUP profile");
- ret = -EIO;
- goto out;
- }
- if (test_bit(0, active) != test_bit(1, active)) {
- if (!btrfs_zone_activate(cache)) {
- ret = -EIO;
- goto out;
- }
- } else {
- if (test_bit(0, active))
- set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE,
- &cache->runtime_flags);
- }
- cache->alloc_offset = zone_info[0].alloc_offset;
- cache->zone_capacity = min(zone_info[0].capacity, zone_info[1].capacity);
+ ret = btrfs_load_block_group_dup(cache, map, zone_info, active);
break;
case BTRFS_BLOCK_GROUP_RAID1:
case BTRFS_BLOCK_GROUP_RAID0:
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 229/267] btrfs: zoned: fix use-after-free due to race with dev replace
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (227 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 228/267] btrfs: zoned: factor out DUP " Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 230/267] xfs: fix imprecise logic in xchk_btree_check_block_owner Greg Kroah-Hartman
` (49 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Johannes Thumshirn, Filipe Manana,
David Sterba
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana <fdmanana@suse.com>
commit 0090d6e1b210551e63cf43958dc7a1ec942cdde9 upstream.
While loading a zone's info during creation of a block group, we can race
with a device replace operation and then trigger a use-after-free on the
device that was just replaced (source device of the replace operation).
This happens because at btrfs_load_zone_info() we extract a device from
the chunk map into a local variable and then use the device while not
under the protection of the device replace rwsem. So if there's a device
replace operation happening when we extract the device and that device
is the source of the replace operation, we will trigger a use-after-free
if before we finish using the device the replace operation finishes and
frees the device.
Fix this by enlarging the critical section under the protection of the
device replace rwsem so that all uses of the device are done inside the
critical section.
CC: stable@vger.kernel.org # 6.1.x: 15c12fcc50a1: btrfs: zoned: introduce a zone_info struct in btrfs_load_block_group_zone_info
CC: stable@vger.kernel.org # 6.1.x: 09a46725cc84: btrfs: zoned: factor out per-zone logic from btrfs_load_block_group_zone_info
CC: stable@vger.kernel.org # 6.1.x: 9e0e3e74dc69: btrfs: zoned: factor out single bg handling from btrfs_load_block_group_zone_info
CC: stable@vger.kernel.org # 6.1.x: 87463f7e0250: btrfs: zoned: factor out DUP bg handling from btrfs_load_block_group_zone_info
CC: stable@vger.kernel.org # 6.1.x
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/btrfs/zoned.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -1293,7 +1293,7 @@ static int btrfs_load_zone_info(struct b
struct map_lookup *map)
{
struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace;
- struct btrfs_device *device = map->stripes[zone_idx].dev;
+ struct btrfs_device *device;
int dev_replace_is_ongoing = 0;
unsigned int nofs_flag;
struct blk_zone zone;
@@ -1301,7 +1301,11 @@ static int btrfs_load_zone_info(struct b
info->physical = map->stripes[zone_idx].physical;
+ down_read(&dev_replace->rwsem);
+ device = map->stripes[zone_idx].dev;
+
if (!device->bdev) {
+ up_read(&dev_replace->rwsem);
info->alloc_offset = WP_MISSING_DEV;
return 0;
}
@@ -1311,6 +1315,7 @@ static int btrfs_load_zone_info(struct b
__set_bit(zone_idx, active);
if (!btrfs_dev_is_sequential(device, info->physical)) {
+ up_read(&dev_replace->rwsem);
info->alloc_offset = WP_CONVENTIONAL;
return 0;
}
@@ -1318,11 +1323,9 @@ static int btrfs_load_zone_info(struct b
/* This zone will be used for allocation, so mark this zone non-empty. */
btrfs_dev_clear_zone_empty(device, info->physical);
- down_read(&dev_replace->rwsem);
dev_replace_is_ongoing = btrfs_dev_replace_is_ongoing(dev_replace);
if (dev_replace_is_ongoing && dev_replace->tgtdev != NULL)
btrfs_dev_clear_zone_empty(dev_replace->tgtdev, info->physical);
- up_read(&dev_replace->rwsem);
/*
* The group is mapped to a sequential zone. Get the zone write pointer
@@ -1333,6 +1336,7 @@ static int btrfs_load_zone_info(struct b
ret = btrfs_get_dev_zone(device, info->physical, &zone);
memalloc_nofs_restore(nofs_flag);
if (ret) {
+ up_read(&dev_replace->rwsem);
if (ret != -EIO && ret != -EOPNOTSUPP)
return ret;
info->alloc_offset = WP_MISSING_DEV;
@@ -1344,6 +1348,7 @@ static int btrfs_load_zone_info(struct b
"zoned: unexpected conventional zone %llu on device %s (devid %llu)",
zone.start << SECTOR_SHIFT, rcu_str_deref(device->name),
device->devid);
+ up_read(&dev_replace->rwsem);
return -EIO;
}
@@ -1371,6 +1376,8 @@ static int btrfs_load_zone_info(struct b
break;
}
+ up_read(&dev_replace->rwsem);
+
return 0;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 230/267] xfs: fix imprecise logic in xchk_btree_check_block_owner
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (228 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 229/267] btrfs: zoned: fix use-after-free due to race with dev replace Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 231/267] xfs: fix scrub stats file permissions Greg Kroah-Hartman
` (48 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Darrick J. Wong,
Christoph Hellwig, Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" <djwong@kernel.org>
commit c0afba9a8363f17d4efed22a8764df33389aebe8 upstream.
A reviewer was confused by the init_sa logic in this function. Upon
checking the logic, I discovered that the code is imprecise. What we
want to do here is check that there is an ownership record in the rmap
btree for the AG that contains a btree block.
For an inode-rooted btree (e.g. the bmbt) the per-AG btree cursors have
not been initialized because inode btrees can span multiple AGs.
Therefore, we must initialize the per-AG btree cursors in sc->sa before
proceeding. That is what init_sa controls, and hence the logic should
be gated on XFS_BTREE_ROOT_IN_INODE, not XFS_BTREE_LONG_PTRS.
In practice, ROOT_IN_INODE and LONG_PTRS are coincident so this hasn't
mattered. However, we're about to refactor both of those flags into
separate btree_ops fields so we want this the logic to make sense
afterwards.
Fixes: 858333dcf021a ("xfs: check btree block ownership with bnobt/rmapbt when scrubbing btree")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/scrub/btree.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/fs/xfs/scrub/btree.c
+++ b/fs/xfs/scrub/btree.c
@@ -385,7 +385,12 @@ xchk_btree_check_block_owner(
agno = xfs_daddr_to_agno(bs->cur->bc_mp, daddr);
agbno = xfs_daddr_to_agbno(bs->cur->bc_mp, daddr);
- init_sa = bs->cur->bc_flags & XFS_BTREE_LONG_PTRS;
+ /*
+ * If the btree being examined is not itself a per-AG btree, initialize
+ * sc->sa so that we can check for the presence of an ownership record
+ * in the rmap btree for the AG containing the block.
+ */
+ init_sa = bs->cur->bc_flags & XFS_BTREE_ROOT_IN_INODE;
if (init_sa) {
error = xchk_ag_init_existing(bs->sc, agno, &bs->sc->sa);
if (!xchk_btree_xref_process_error(bs->sc, bs->cur,
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 231/267] xfs: fix scrub stats file permissions
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (229 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 230/267] xfs: fix imprecise logic in xchk_btree_check_block_owner Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 232/267] xfs: fix SEEK_HOLE/DATA for regions with active COW extents Greg Kroah-Hartman
` (47 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Darrick J. Wong,
Christoph Hellwig, Chandan Babu R, Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Darrick J. Wong" <djwong@kernel.org>
commit e610e856b938a1fc86e7ee83ad2f39716082bca7 upstream.
When the kernel is in lockdown mode, debugfs will only show files that
are world-readable and cannot be written, mmaped, or used with ioctl.
That more or less describes the scrub stats file, except that the
permissions are wrong -- they should be 0444, not 0644. You can't write
the stats file, so the 0200 makes no sense.
Meanwhile, the clear_stats file is only writable, but it got mode 0400
instead of 0200, which would make more sense.
Fix both files so that they make sense.
Fixes: d7a74cad8f451 ("xfs: track usage statistics of online fsck")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/scrub/stats.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/xfs/scrub/stats.c
+++ b/fs/xfs/scrub/stats.c
@@ -329,9 +329,9 @@ xchk_stats_register(
if (!cs->cs_debugfs)
return;
- debugfs_create_file("stats", 0644, cs->cs_debugfs, cs,
+ debugfs_create_file("stats", 0444, cs->cs_debugfs, cs,
&scrub_stats_fops);
- debugfs_create_file("clear_stats", 0400, cs->cs_debugfs, cs,
+ debugfs_create_file("clear_stats", 0200, cs->cs_debugfs, cs,
&clear_scrub_stats_fops);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 232/267] xfs: fix SEEK_HOLE/DATA for regions with active COW extents
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (230 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 231/267] xfs: fix scrub stats file permissions Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 233/267] xfs: shrink failure needs to hold AGI buffer Greg Kroah-Hartman
` (46 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Dave Chinner,
Darrick J. Wong, Christoph Hellwig, Chandan Babu R,
Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner <dchinner@redhat.com>
commit 4b2f459d86252619448455013f581836c8b1b7da upstream.
A data corruption problem was reported by CoreOS image builders
when using reflink based disk image copies and then converting
them to qcow2 images. The converted images failed the conversion
verification step, and it was isolated down to the fact that
qemu-img uses SEEK_HOLE/SEEK_DATA to find the data it is supposed to
copy.
The reproducer allowed me to isolate the issue down to a region of
the file that had overlapping data and COW fork extents, and the
problem was that the COW fork extent was being reported in it's
entirity by xfs_seek_iomap_begin() and so skipping over the real
data fork extents in that range.
This was somewhat hidden by the fact that 'xfs_bmap -vvp' reported
all the extents correctly, and reading the file completely (i.e. not
using seek to skip holes) would map the file correctly and all the
correct data extents are read. Hence the problem is isolated to just
the xfs_seek_iomap_begin() implementation.
Instrumentation with trace_printk made the problem obvious: we are
passing the wrong length to xfs_trim_extent() in
xfs_seek_iomap_begin(). We are passing the end_fsb, not the
maximum length of the extent we want to trim the map too. Hence the
COW extent map never gets trimmed to the start of the next data fork
extent, and so the seek code treats the entire COW fork extent as
unwritten and skips entirely over the data fork extents in that
range.
Link: https://github.com/coreos/coreos-assembler/issues/3728
Fixes: 60271ab79d40 ("xfs: fix SEEK_DATA for speculative COW fork preallocation")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/xfs_iomap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1323,7 +1323,7 @@ xfs_seek_iomap_begin(
if (cow_fsb != NULLFILEOFF && cow_fsb <= offset_fsb) {
if (data_fsb < cow_fsb + cmap.br_blockcount)
end_fsb = min(end_fsb, data_fsb);
- xfs_trim_extent(&cmap, offset_fsb, end_fsb);
+ xfs_trim_extent(&cmap, offset_fsb, end_fsb - offset_fsb);
seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
error = xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
IOMAP_F_SHARED, seq);
@@ -1348,7 +1348,7 @@ xfs_seek_iomap_begin(
imap.br_state = XFS_EXT_NORM;
done:
seq = xfs_iomap_inode_sequence(ip, 0);
- xfs_trim_extent(&imap, offset_fsb, end_fsb);
+ xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
error = xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
out_unlock:
xfs_iunlock(ip, lockmode);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 233/267] xfs: shrink failure needs to hold AGI buffer
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (231 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 232/267] xfs: fix SEEK_HOLE/DATA for regions with active COW extents Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 234/267] xfs: ensure submit buffers on LSN boundaries in error handlers Greg Kroah-Hartman
` (45 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Chandan Babu R,
Dave Chinner, Gao Xiang, Christoph Hellwig, Catherine Hoang,
Darrick J. Wong
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner <dchinner@redhat.com>
commit 75bcffbb9e7563259b7aed0fa77459d6a3a35627 upstream.
Chandan reported a AGI/AGF lock order hang on xfs/168 during recent
testing. The cause of the problem was the task running xfs_growfs
to shrink the filesystem. A failure occurred trying to remove the
free space from the btrees that the shrink would make disappear,
and that meant it ran the error handling for a partial failure.
This error path involves restoring the per-ag block reservations,
and that requires calculating the amount of space needed to be
reserved for the free inode btree. The growfs operation hung here:
[18679.536829] down+0x71/0xa0
[18679.537657] xfs_buf_lock+0xa4/0x290 [xfs]
[18679.538731] xfs_buf_find_lock+0xf7/0x4d0 [xfs]
[18679.539920] xfs_buf_lookup.constprop.0+0x289/0x500 [xfs]
[18679.542628] xfs_buf_get_map+0x2b3/0xe40 [xfs]
[18679.547076] xfs_buf_read_map+0xbb/0x900 [xfs]
[18679.562616] xfs_trans_read_buf_map+0x449/0xb10 [xfs]
[18679.569778] xfs_read_agi+0x1cd/0x500 [xfs]
[18679.573126] xfs_ialloc_read_agi+0xc2/0x5b0 [xfs]
[18679.578708] xfs_finobt_calc_reserves+0xe7/0x4d0 [xfs]
[18679.582480] xfs_ag_resv_init+0x2c5/0x490 [xfs]
[18679.586023] xfs_ag_shrink_space+0x736/0xd30 [xfs]
[18679.590730] xfs_growfs_data_private.isra.0+0x55e/0x990 [xfs]
[18679.599764] xfs_growfs_data+0x2f1/0x410 [xfs]
[18679.602212] xfs_file_ioctl+0xd1e/0x1370 [xfs]
trying to get the AGI lock. The AGI lock was held by a fstress task
trying to do an inode allocation, and it was waiting on the AGF
lock to allocate a new inode chunk on disk. Hence deadlock.
The fix for this is for the growfs code to hold the AGI over the
transaction roll it does in the error path. It already holds the AGF
locked across this, and that is what causes the lock order inversion
in the xfs_ag_resv_init() call.
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Fixes: 46141dc891f7 ("xfs: introduce xfs_ag_shrink_space()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/libxfs/xfs_ag.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
--- a/fs/xfs/libxfs/xfs_ag.c
+++ b/fs/xfs/libxfs/xfs_ag.c
@@ -979,14 +979,23 @@ xfs_ag_shrink_space(
if (error) {
/*
- * if extent allocation fails, need to roll the transaction to
+ * If extent allocation fails, need to roll the transaction to
* ensure that the AGFL fixup has been committed anyway.
+ *
+ * We need to hold the AGF across the roll to ensure nothing can
+ * access the AG for allocation until the shrink is fully
+ * cleaned up. And due to the resetting of the AG block
+ * reservation space needing to lock the AGI, we also have to
+ * hold that so we don't get AGI/AGF lock order inversions in
+ * the error handling path.
*/
xfs_trans_bhold(*tpp, agfbp);
+ xfs_trans_bhold(*tpp, agibp);
err2 = xfs_trans_roll(tpp);
if (err2)
return err2;
xfs_trans_bjoin(*tpp, agfbp);
+ xfs_trans_bjoin(*tpp, agibp);
goto resv_init_out;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 234/267] xfs: ensure submit buffers on LSN boundaries in error handlers
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (232 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 233/267] xfs: shrink failure needs to hold AGI buffer Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 235/267] xfs: allow sunit mount option to repair bad primary sb stripe values Greg Kroah-Hartman
` (44 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Long Li, Darrick J. Wong,
Chandan Babu R, Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Long Li <leo.lilong@huawei.com>
commit e4c3b72a6ea93ed9c1815c74312eee9305638852 upstream.
While performing the IO fault injection test, I caught the following data
corruption report:
XFS (dm-0): Internal error ltbno + ltlen > bno at line 1957 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_ag_extent+0x79c/0x1130
CPU: 3 PID: 33 Comm: kworker/3:0 Not tainted 6.5.0-rc7-next-20230825-00001-g7f8666926889 #214
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
Workqueue: xfs-inodegc/dm-0 xfs_inodegc_worker
Call Trace:
<TASK>
dump_stack_lvl+0x50/0x70
xfs_corruption_error+0x134/0x150
xfs_free_ag_extent+0x7d3/0x1130
__xfs_free_extent+0x201/0x3c0
xfs_trans_free_extent+0x29b/0xa10
xfs_extent_free_finish_item+0x2a/0xb0
xfs_defer_finish_noroll+0x8d1/0x1b40
xfs_defer_finish+0x21/0x200
xfs_itruncate_extents_flags+0x1cb/0x650
xfs_free_eofblocks+0x18f/0x250
xfs_inactive+0x485/0x570
xfs_inodegc_worker+0x207/0x530
process_scheduled_works+0x24a/0xe10
worker_thread+0x5ac/0xc60
kthread+0x2cd/0x3c0
ret_from_fork+0x4a/0x80
ret_from_fork_asm+0x11/0x20
</TASK>
XFS (dm-0): Corruption detected. Unmount and run xfs_repair
After analyzing the disk image, it was found that the corruption was
triggered by the fact that extent was recorded in both inode datafork
and AGF btree blocks. After a long time of reproduction and analysis,
we found that the reason of free sapce btree corruption was that the
AGF btree was not recovered correctly.
Consider the following situation, Checkpoint A and Checkpoint B are in
the same record and share the same start LSN1, buf items of same object
(AGF btree block) is included in both Checkpoint A and Checkpoint B. If
the buf item in Checkpoint A has been recovered and updates metadata LSN
permanently, then the buf item in Checkpoint B cannot be recovered,
because log recovery skips items with a metadata LSN >= the current LSN
of the recovery item. If there is still an inode item in Checkpoint B
that records the Extent X, the Extent X will be recorded in both inode
datafork and AGF btree block after Checkpoint B is recovered. Such
transaction can be seen when allocing enxtent for inode bmap, it record
both the addition of extent to the inode extent list and the removing
extent from the AGF.
|------------Record (LSN1)------------------|---Record (LSN2)---|
|-------Checkpoint A----------|----------Checkpoint B-----------|
| Buf Item(Extent X) | Buf Item / Inode item(Extent X) |
| Extent X is freed | Extent X is allocated |
After commit 12818d24db8a ("xfs: rework log recovery to submit buffers
on LSN boundaries") was introduced, we submit buffers on lsn boundaries
during log recovery. The above problem can be avoided under normal paths,
but it's not guaranteed under abnormal paths. Consider the following
process, if an error was encountered after recover buf item in Checkpoint
A and before recover buf item in Checkpoint B, buffers that have been
added to the buffer_list will still be submitted, this violates the
submits rule on lsn boundaries. So buf item in Checkpoint B cannot be
recovered on the next mount due to current lsn of transaction equal to
metadata lsn on disk. The detailed process of the problem is as follows.
First Mount:
xlog_do_recovery_pass
error = xlog_recover_process
xlog_recover_process_data
xlog_recover_process_ophdr
xlog_recovery_process_trans
...
/* recover buf item in Checkpoint A */
xlog_recover_buf_commit_pass2
xlog_recover_do_reg_buffer
/* add buffer of agf btree block to buffer_list */
xfs_buf_delwri_queue(bp, buffer_list)
...
==> Encounter read IO error and return
/* submit buffers regardless of error */
if (!list_empty(&buffer_list))
xfs_buf_delwri_submit(&buffer_list);
<buf items of agf btree block in Checkpoint A recovery success>
Second Mount:
xlog_do_recovery_pass
error = xlog_recover_process
xlog_recover_process_data
xlog_recover_process_ophdr
xlog_recovery_process_trans
...
/* recover buf item in Checkpoint B */
xlog_recover_buf_commit_pass2
/* buffer of agf btree block wouldn't added to
buffer_list due to lsn equal to current_lsn */
if (XFS_LSN_CMP(lsn, current_lsn) >= 0)
goto out_release
<buf items of agf btree block in Checkpoint B wouldn't recovery>
In order to make sure that submits buffers on lsn boundaries in the
abnormal paths, we need to check error status before submit buffers that
have been added from the last record processed. If error status exist,
buffers in the bufffer_list should not be writen to disk.
Canceling the buffers in the buffer_list directly isn't correct, unlike
any other place where write list was canceled, these buffers has been
initialized by xfs_buf_item_init() during recovery and held by buf item,
buf items will not be released in xfs_buf_delwri_cancel(), it's not easy
to solve.
If the filesystem has been shut down, then delwri list submission will
error out all buffers on the list via IO submission/completion and do
all the correct cleanup automatically. So shutting down the filesystem
could prevents buffers in the bufffer_list from being written to disk.
Fixes: 50d5c8d8e938 ("xfs: check LSN ordering for v5 superblocks during recovery")
Signed-off-by: Long Li <leo.lilong@huawei.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/xfs_log_recover.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3203,11 +3203,28 @@ xlog_do_recovery_pass(
kmem_free(hbp);
/*
- * Submit buffers that have been added from the last record processed,
- * regardless of error status.
+ * Submit buffers that have been dirtied by the last record recovered.
*/
- if (!list_empty(&buffer_list))
+ if (!list_empty(&buffer_list)) {
+ if (error) {
+ /*
+ * If there has been an item recovery error then we
+ * cannot allow partial checkpoint writeback to
+ * occur. We might have multiple checkpoints with the
+ * same start LSN in this buffer list, and partial
+ * writeback of a checkpoint in this situation can
+ * prevent future recovery of all the changes in the
+ * checkpoints at this start LSN.
+ *
+ * Note: Shutting down the filesystem will result in the
+ * delwri submission marking all the buffers stale,
+ * completing them and cleaning up _XBF_LOGRECOVERY
+ * state without doing any IO.
+ */
+ xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
+ }
error2 = xfs_buf_delwri_submit(&buffer_list);
+ }
if (error && first_bad)
*first_bad = rhead_blk;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 235/267] xfs: allow sunit mount option to repair bad primary sb stripe values
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (233 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 234/267] xfs: ensure submit buffers on LSN boundaries in error handlers Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 236/267] xfs: dont use current->journal_info Greg Kroah-Hartman
` (43 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Dave Chinner,
Christoph Hellwig, Darrick J. Wong, Chandan Babu R,
Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner <dchinner@redhat.com>
commit 15922f5dbf51dad334cde888ce6835d377678dc9 upstream.
If a filesystem has a busted stripe alignment configuration on disk
(e.g. because broken RAID firmware told mkfs that swidth was smaller
than sunit), then the filesystem will refuse to mount due to the
stripe validation failing. This failure is triggering during distro
upgrades from old kernels lacking this check to newer kernels with
this check, and currently the only way to fix it is with offline
xfs_db surgery.
This runtime validity checking occurs when we read the superblock
for the first time and causes the mount to fail immediately. This
prevents the rewrite of stripe unit/width via
mount options that occurs later in the mount process. Hence there is
no way to recover this situation without resorting to offline xfs_db
rewrite of the values.
However, we parse the mount options long before we read the
superblock, and we know if the mount has been asked to re-write the
stripe alignment configuration when we are reading the superblock
and verifying it for the first time. Hence we can conditionally
ignore stripe verification failures if the mount options specified
will correct the issue.
We validate that the new stripe unit/width are valid before we
overwrite the superblock values, so we can ignore the invalid config
at verification and fail the mount later if the new values are not
valid. This, at least, gives users the chance of correcting the
issue after a kernel upgrade without having to resort to xfs-db
hacks.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/libxfs/xfs_sb.c | 40 +++++++++++++++++++++++++++++++---------
fs/xfs/libxfs/xfs_sb.h | 5 +++--
2 files changed, 34 insertions(+), 11 deletions(-)
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -530,7 +530,8 @@ xfs_validate_sb_common(
}
if (!xfs_validate_stripe_geometry(mp, XFS_FSB_TO_B(mp, sbp->sb_unit),
- XFS_FSB_TO_B(mp, sbp->sb_width), 0, false))
+ XFS_FSB_TO_B(mp, sbp->sb_width), 0,
+ xfs_buf_daddr(bp) == XFS_SB_DADDR, false))
return -EFSCORRUPTED;
/*
@@ -1319,8 +1320,10 @@ xfs_sb_get_secondary(
}
/*
- * sunit, swidth, sectorsize(optional with 0) should be all in bytes,
- * so users won't be confused by values in error messages.
+ * sunit, swidth, sectorsize(optional with 0) should be all in bytes, so users
+ * won't be confused by values in error messages. This function returns false
+ * if the stripe geometry is invalid and the caller is unable to repair the
+ * stripe configuration later in the mount process.
*/
bool
xfs_validate_stripe_geometry(
@@ -1328,20 +1331,21 @@ xfs_validate_stripe_geometry(
__s64 sunit,
__s64 swidth,
int sectorsize,
+ bool may_repair,
bool silent)
{
if (swidth > INT_MAX) {
if (!silent)
xfs_notice(mp,
"stripe width (%lld) is too large", swidth);
- return false;
+ goto check_override;
}
if (sunit > swidth) {
if (!silent)
xfs_notice(mp,
"stripe unit (%lld) is larger than the stripe width (%lld)", sunit, swidth);
- return false;
+ goto check_override;
}
if (sectorsize && (int)sunit % sectorsize) {
@@ -1349,21 +1353,21 @@ xfs_validate_stripe_geometry(
xfs_notice(mp,
"stripe unit (%lld) must be a multiple of the sector size (%d)",
sunit, sectorsize);
- return false;
+ goto check_override;
}
if (sunit && !swidth) {
if (!silent)
xfs_notice(mp,
"invalid stripe unit (%lld) and stripe width of 0", sunit);
- return false;
+ goto check_override;
}
if (!sunit && swidth) {
if (!silent)
xfs_notice(mp,
"invalid stripe width (%lld) and stripe unit of 0", swidth);
- return false;
+ goto check_override;
}
if (sunit && (int)swidth % (int)sunit) {
@@ -1371,9 +1375,27 @@ xfs_validate_stripe_geometry(
xfs_notice(mp,
"stripe width (%lld) must be a multiple of the stripe unit (%lld)",
swidth, sunit);
- return false;
+ goto check_override;
}
return true;
+
+check_override:
+ if (!may_repair)
+ return false;
+ /*
+ * During mount, mp->m_dalign will not be set unless the sunit mount
+ * option was set. If it was set, ignore the bad stripe alignment values
+ * and allow the validation and overwrite later in the mount process to
+ * attempt to overwrite the bad stripe alignment values with the values
+ * supplied by mount options.
+ */
+ if (!mp->m_dalign)
+ return false;
+ if (!silent)
+ xfs_notice(mp,
+"Will try to correct with specified mount options sunit (%d) and swidth (%d)",
+ BBTOB(mp->m_dalign), BBTOB(mp->m_swidth));
+ return true;
}
/*
--- a/fs/xfs/libxfs/xfs_sb.h
+++ b/fs/xfs/libxfs/xfs_sb.h
@@ -35,8 +35,9 @@ extern int xfs_sb_get_secondary(struct x
struct xfs_trans *tp, xfs_agnumber_t agno,
struct xfs_buf **bpp);
-extern bool xfs_validate_stripe_geometry(struct xfs_mount *mp,
- __s64 sunit, __s64 swidth, int sectorsize, bool silent);
+bool xfs_validate_stripe_geometry(struct xfs_mount *mp,
+ __s64 sunit, __s64 swidth, int sectorsize, bool may_repair,
+ bool silent);
uint8_t xfs_compute_rextslog(xfs_rtbxlen_t rtextents);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 236/267] xfs: dont use current->journal_info
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (234 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 235/267] xfs: allow sunit mount option to repair bad primary sb stripe values Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 237/267] xfs: allow cross-linking special files without project quota Greg Kroah-Hartman
` (42 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs,
syzbot+cdee56dbcdf0096ef605, Dave Chinner, Darrick J. Wong,
Mark Tinguely, Christoph Hellwig, Chandan Babu R, Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dave Chinner <dchinner@redhat.com>
commit f2e812c1522dab847912309b00abcc762dd696da upstream.
syzbot reported an ext4 panic during a page fault where found a
journal handle when it didn't expect to find one. The structure
it tripped over had a value of 'TRAN' in the first entry in the
structure, and that indicates it tripped over a struct xfs_trans
instead of a jbd2 handle.
The reason for this is that the page fault was taken during a
copy-out to a user buffer from an xfs bulkstat operation. XFS uses
an "empty" transaction context for bulkstat to do automated metadata
buffer cleanup, and so the transaction context is valid across the
copyout of the bulkstat info into the user buffer.
We are using empty transaction contexts like this in XFS to reduce
the risk of failing to release objects we reference during the
operation, especially during error handling. Hence we really need to
ensure that we can take page faults from these contexts without
leaving landmines for the code processing the page fault to trip
over.
However, this same behaviour could happen from any other filesystem
that triggers a page fault or any other exception that is handled
on-stack from within a task context that has current->journal_info
set. Having a page fault from some other filesystem bounce into XFS
where we have to run a transaction isn't a bug at all, but the usage
of current->journal_info means that this could result corruption of
the outer task's journal_info structure.
The problem is purely that we now have two different contexts that
now think they own current->journal_info. IOWs, no filesystem can
allow page faults or on-stack exceptions while current->journal_info
is set by the filesystem because the exception processing might use
current->journal_info itself.
If we end up with nested XFS transactions whilst holding an empty
transaction, then it isn't an issue as the outer transaction does
not hold a log reservation. If we ignore the current->journal_info
usage, then the only problem that might occur is a deadlock if the
exception tries to take the same locks the upper context holds.
That, however, is not a problem that setting current->journal_info
would solve, so it's largely an irrelevant concern here.
IOWs, we really only use current->journal_info for a warning check
in xfs_vm_writepages() to ensure we aren't doing writeback from a
transaction context. Writeback might need to do allocation, so it
can need to run transactions itself. Hence it's a debug check to
warn us that we've done something silly, and largely it is not all
that useful.
So let's just remove all the use of current->journal_info in XFS and
get rid of all the potential issues from nested contexts where
current->journal_info might get misused by another filesystem
context.
Reported-by: syzbot+cdee56dbcdf0096ef605@syzkaller.appspotmail.com
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Mark Tinguely <mark.tinguely@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/scrub/common.c | 4 +---
fs/xfs/xfs_aops.c | 7 -------
fs/xfs/xfs_icache.c | 8 +++++---
fs/xfs/xfs_trans.h | 9 +--------
4 files changed, 7 insertions(+), 21 deletions(-)
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -978,9 +978,7 @@ xchk_irele(
struct xfs_scrub *sc,
struct xfs_inode *ip)
{
- if (current->journal_info != NULL) {
- ASSERT(current->journal_info == sc->tp);
-
+ if (sc->tp) {
/*
* If we are in a transaction, we /cannot/ drop the inode
* ourselves, because the VFS will trigger writeback, which
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -502,13 +502,6 @@ xfs_vm_writepages(
{
struct xfs_writepage_ctx wpc = { };
- /*
- * Writing back data in a transaction context can result in recursive
- * transactions. This is bad, so issue a warning and get out of here.
- */
- if (WARN_ON_ONCE(current->journal_info))
- return 0;
-
xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
}
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -2031,8 +2031,10 @@ xfs_inodegc_want_queue_work(
* - Memory shrinkers queued the inactivation worker and it hasn't finished.
* - The queue depth exceeds the maximum allowable percpu backlog.
*
- * Note: If the current thread is running a transaction, we don't ever want to
- * wait for other transactions because that could introduce a deadlock.
+ * Note: If we are in a NOFS context here (e.g. current thread is running a
+ * transaction) the we don't want to block here as inodegc progress may require
+ * filesystem resources we hold to make progress and that could result in a
+ * deadlock. Hence we skip out of here if we are in a scoped NOFS context.
*/
static inline bool
xfs_inodegc_want_flush_work(
@@ -2040,7 +2042,7 @@ xfs_inodegc_want_flush_work(
unsigned int items,
unsigned int shrinker_hits)
{
- if (current->journal_info)
+ if (current->flags & PF_MEMALLOC_NOFS)
return false;
if (shrinker_hits > 0)
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -277,19 +277,14 @@ static inline void
xfs_trans_set_context(
struct xfs_trans *tp)
{
- ASSERT(current->journal_info == NULL);
tp->t_pflags = memalloc_nofs_save();
- current->journal_info = tp;
}
static inline void
xfs_trans_clear_context(
struct xfs_trans *tp)
{
- if (current->journal_info == tp) {
- memalloc_nofs_restore(tp->t_pflags);
- current->journal_info = NULL;
- }
+ memalloc_nofs_restore(tp->t_pflags);
}
static inline void
@@ -297,10 +292,8 @@ xfs_trans_switch_context(
struct xfs_trans *old_tp,
struct xfs_trans *new_tp)
{
- ASSERT(current->journal_info == old_tp);
new_tp->t_pflags = old_tp->t_pflags;
old_tp->t_pflags = 0;
- current->journal_info = new_tp;
}
#endif /* __XFS_TRANS_H__ */
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 237/267] xfs: allow cross-linking special files without project quota
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (235 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 236/267] xfs: dont use current->journal_info Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 238/267] swiotlb: Enforce page alignment in swiotlb_alloc() Greg Kroah-Hartman
` (41 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, linux-xfs, Andrey Albershteyn,
Darrick J. Wong, Chandan Babu R, Catherine Hoang
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrey Albershteyn <aalbersh@redhat.com>
commit e23d7e82b707d1d0a627e334fb46370e4f772c11 upstream.
There's an issue that if special files is created before quota
project is enabled, then it's not possible to link this file. This
works fine for normal files. This happens because xfs_quota skips
special files (no ioctls to set necessary flags). The check for
having the same project ID for source and destination then fails as
source file doesn't have any ID.
mkfs.xfs -f /dev/sda
mount -o prjquota /dev/sda /mnt/test
mkdir /mnt/test/foo
mkfifo /mnt/test/foo/fifo1
xfs_quota -xc "project -sp /mnt/test/foo 9" /mnt/test
> Setting up project 9 (path /mnt/test/foo)...
> xfs_quota: skipping special file /mnt/test/foo/fifo1
> Processed 1 (/etc/projects and cmdline) paths for project 9 with recursion depth infinite (-1).
ln /mnt/test/foo/fifo1 /mnt/test/foo/fifo1_link
> ln: failed to create hard link '/mnt/test/testdir/fifo1_link' => '/mnt/test/testdir/fifo1': Invalid cross-device link
mkfifo /mnt/test/foo/fifo2
ln /mnt/test/foo/fifo2 /mnt/test/foo/fifo2_link
Fix this by allowing linking of special files to the project quota
if special files doesn't have any ID set (ID = 0).
Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/xfs/xfs_inode.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1239,8 +1239,19 @@ xfs_link(
*/
if (unlikely((tdp->i_diflags & XFS_DIFLAG_PROJINHERIT) &&
tdp->i_projid != sip->i_projid)) {
- error = -EXDEV;
- goto error_return;
+ /*
+ * Project quota setup skips special files which can
+ * leave inodes in a PROJINHERIT directory without a
+ * project ID set. We need to allow links to be made
+ * to these "project-less" inodes because userspace
+ * expects them to succeed after project ID setup,
+ * but everything else should be rejected.
+ */
+ if (!special_file(VFS_I(sip)->i_mode) ||
+ sip->i_projid != 0) {
+ error = -EXDEV;
+ goto error_return;
+ }
}
if (!resblks) {
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 238/267] swiotlb: Enforce page alignment in swiotlb_alloc()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (236 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 237/267] xfs: allow cross-linking special files without project quota Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 239/267] swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE Greg Kroah-Hartman
` (40 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches,
will@kernel.org, mhklinux@outlook.com, petr.tesarik1@huawei-partners.com, nicolinc@nvidia.com, hch@lst.de, Fabio Estevam,
Will Deacon, Michael Kelley, Petr Tesarik, Nicolin Chen,
Christoph Hellwig, Fabio Estevam
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon <will@kernel.org>
commit 823353b7cf0ea9dfb09f5181d5fb2825d727200b upstream.
When allocating pages from a restricted DMA pool in swiotlb_alloc(),
the buffer address is blindly converted to a 'struct page *' that is
returned to the caller. In the unlikely event of an allocation bug,
page-unaligned addresses are not detected and slots can silently be
double-allocated.
Add a simple check of the buffer alignment in swiotlb_alloc() to make
debugging a little easier if something has gone wonky.
Cc: stable@vger.kernel.org # v6.6+
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/dma/swiotlb.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -1627,6 +1627,12 @@ struct page *swiotlb_alloc(struct device
return NULL;
tlb_addr = slot_addr(pool->start, index);
+ if (unlikely(!PAGE_ALIGNED(tlb_addr))) {
+ dev_WARN_ONCE(dev, 1, "Cannot allocate pages from non page-aligned swiotlb addr 0x%pa.\n",
+ &tlb_addr);
+ swiotlb_release_slots(dev, tlb_addr);
+ return NULL;
+ }
return pfn_to_page(PFN_DOWN(tlb_addr));
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 239/267] swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (237 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 238/267] swiotlb: Enforce page alignment in swiotlb_alloc() Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 240/267] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary Greg Kroah-Hartman
` (39 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Michael Kelley, Will Deacon,
Robin Murphy, Petr Tesarik, Nicolin Chen, Christoph Hellwig,
Fabio Estevam
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon <will@kernel.org>
commit 14cebf689a78e8a1c041138af221ef6eac6bc7da upstream.
For swiotlb allocations >= PAGE_SIZE, the slab search historically
adjusted the stride to avoid checking unaligned slots. This had the
side-effect of aligning large mapping requests to PAGE_SIZE, but that
was broken by 0eee5ae10256 ("swiotlb: fix slot alignment checks").
Since this alignment could be relied upon drivers, reinstate PAGE_SIZE
alignment for swiotlb mappings >= PAGE_SIZE.
Cc: stable@vger.kernel.org # v6.6+
Reported-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/dma/swiotlb.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -993,6 +993,17 @@ static int swiotlb_area_find_slots(struc
BUG_ON(area_index >= pool->nareas);
/*
+ * Historically, swiotlb allocations >= PAGE_SIZE were guaranteed to be
+ * page-aligned in the absence of any other alignment requirements.
+ * 'alloc_align_mask' was later introduced to specify the alignment
+ * explicitly, however this is passed as zero for streaming mappings
+ * and so we preserve the old behaviour there in case any drivers are
+ * relying on it.
+ */
+ if (!alloc_align_mask && !iotlb_align_mask && alloc_size >= PAGE_SIZE)
+ alloc_align_mask = PAGE_SIZE - 1;
+
+ /*
* Ensure that the allocation is at least slot-aligned and update
* 'iotlb_align_mask' to ignore bits that will be preserved when
* offsetting into the allocation.
@@ -1006,13 +1017,6 @@ static int swiotlb_area_find_slots(struc
*/
stride = get_max_slots(max(alloc_align_mask, iotlb_align_mask));
- /*
- * For allocations of PAGE_SIZE or larger only look for page aligned
- * allocations.
- */
- if (alloc_size >= PAGE_SIZE)
- stride = umax(stride, PAGE_SHIFT - IO_TLB_SHIFT + 1);
-
spin_lock_irqsave(&area->lock, flags);
if (unlikely(nslots > pool->area_nslabs - area->used))
goto not_found;
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 240/267] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (238 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 239/267] swiotlb: Reinstate page-alignment for mappings >= PAGE_SIZE Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 241/267] nilfs2: fix potential kernel bug due to lack of writeback flag waiting Greg Kroah-Hartman
` (38 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches,
will@kernel.org, mhklinux@outlook.com, petr.tesarik1@huawei-partners.com, nicolinc@nvidia.com, hch@lst.de, Fabio Estevam,
Petr Tesarik, Michael Kelley, Christoph Hellwig, Fabio Estevam
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
commit af133562d5aff41fcdbe51f1a504ae04788b5fc0 upstream.
Allow a buffer pre-padding of up to alloc_align_mask, even if it requires
allocating additional IO TLB slots.
If the allocation alignment is bigger than IO_TLB_SIZE and min_align_mask
covers any non-zero bits in the original address between IO_TLB_SIZE and
alloc_align_mask, these bits are not preserved in the swiotlb buffer
address.
To fix this case, increase the allocation size and use a larger offset
within the allocated buffer. As a result, extra padding slots may be
allocated before the mapping start address.
Leave orig_addr in these padding slots initialized to INVALID_PHYS_ADDR.
These slots do not correspond to any CPU buffer, so attempts to sync the
data should be ignored.
The padding slots should be automatically released when the buffer is
unmapped. However, swiotlb_tbl_unmap_single() takes only the address of the
DMA buffer slot, not the first padding slot. Save the number of padding
slots in struct io_tlb_slot and use it to adjust the slot index in
swiotlb_release_slots(), so all allocated slots are properly freed.
Cc: stable@vger.kernel.org # v6.6+
Fixes: 2fd4fa5d3fb5 ("swiotlb: Fix alignment checks when both allocation and DMA masks are present")
Link: https://lore.kernel.org/linux-iommu/20240311210507.217daf8b@meshulam.tesarici.cz/
Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/dma/swiotlb.c | 59 +++++++++++++++++++++++++++++++++++++++------------
1 file changed, 46 insertions(+), 13 deletions(-)
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -69,11 +69,14 @@
* @alloc_size: Size of the allocated buffer.
* @list: The free list describing the number of free entries available
* from each index.
+ * @pad_slots: Number of preceding padding slots. Valid only in the first
+ * allocated non-padding slot.
*/
struct io_tlb_slot {
phys_addr_t orig_addr;
size_t alloc_size;
- unsigned int list;
+ unsigned short list;
+ unsigned short pad_slots;
};
static bool swiotlb_force_bounce;
@@ -287,6 +290,7 @@ static void swiotlb_init_io_tlb_pool(str
mem->nslabs - i);
mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
mem->slots[i].alloc_size = 0;
+ mem->slots[i].pad_slots = 0;
}
memset(vaddr, 0, bytes);
@@ -821,12 +825,30 @@ void swiotlb_dev_init(struct device *dev
#endif
}
-/*
- * Return the offset into a iotlb slot required to keep the device happy.
+/**
+ * swiotlb_align_offset() - Get required offset into an IO TLB allocation.
+ * @dev: Owning device.
+ * @align_mask: Allocation alignment mask.
+ * @addr: DMA address.
+ *
+ * Return the minimum offset from the start of an IO TLB allocation which is
+ * required for a given buffer address and allocation alignment to keep the
+ * device happy.
+ *
+ * First, the address bits covered by min_align_mask must be identical in the
+ * original address and the bounce buffer address. High bits are preserved by
+ * choosing a suitable IO TLB slot, but bits below IO_TLB_SHIFT require extra
+ * padding bytes before the bounce buffer.
+ *
+ * Second, @align_mask specifies which bits of the first allocated slot must
+ * be zero. This may require allocating additional padding slots, and then the
+ * offset (in bytes) from the first such padding slot is returned.
*/
-static unsigned int swiotlb_align_offset(struct device *dev, u64 addr)
+static unsigned int swiotlb_align_offset(struct device *dev,
+ unsigned int align_mask, u64 addr)
{
- return addr & dma_get_min_align_mask(dev) & (IO_TLB_SIZE - 1);
+ return addr & dma_get_min_align_mask(dev) &
+ (align_mask | (IO_TLB_SIZE - 1));
}
/*
@@ -847,7 +869,7 @@ static void swiotlb_bounce(struct device
return;
tlb_offset = tlb_addr & (IO_TLB_SIZE - 1);
- orig_addr_offset = swiotlb_align_offset(dev, orig_addr);
+ orig_addr_offset = swiotlb_align_offset(dev, 0, orig_addr);
if (tlb_offset < orig_addr_offset) {
dev_WARN_ONCE(dev, 1,
"Access before mapping start detected. orig offset %u, requested offset %u.\n",
@@ -983,7 +1005,7 @@ static int swiotlb_area_find_slots(struc
unsigned long max_slots = get_max_slots(boundary_mask);
unsigned int iotlb_align_mask = dma_get_min_align_mask(dev);
unsigned int nslots = nr_slots(alloc_size), stride;
- unsigned int offset = swiotlb_align_offset(dev, orig_addr);
+ unsigned int offset = swiotlb_align_offset(dev, 0, orig_addr);
unsigned int index, slots_checked, count = 0, i;
unsigned long flags;
unsigned int slot_base;
@@ -1282,11 +1304,12 @@ phys_addr_t swiotlb_tbl_map_single(struc
unsigned long attrs)
{
struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
- unsigned int offset = swiotlb_align_offset(dev, orig_addr);
+ unsigned int offset;
struct io_tlb_pool *pool;
unsigned int i;
int index;
phys_addr_t tlb_addr;
+ unsigned short pad_slots;
if (!mem || !mem->nslabs) {
dev_warn_ratelimited(dev,
@@ -1303,6 +1326,7 @@ phys_addr_t swiotlb_tbl_map_single(struc
return (phys_addr_t)DMA_MAPPING_ERROR;
}
+ offset = swiotlb_align_offset(dev, alloc_align_mask, orig_addr);
index = swiotlb_find_slots(dev, orig_addr,
alloc_size + offset, alloc_align_mask, &pool);
if (index == -1) {
@@ -1318,6 +1342,10 @@ phys_addr_t swiotlb_tbl_map_single(struc
* This is needed when we sync the memory. Then we sync the buffer if
* needed.
*/
+ pad_slots = offset >> IO_TLB_SHIFT;
+ offset &= (IO_TLB_SIZE - 1);
+ index += pad_slots;
+ pool->slots[index].pad_slots = pad_slots;
for (i = 0; i < nr_slots(alloc_size + offset); i++)
pool->slots[index + i].orig_addr = slot_addr(orig_addr, i);
tlb_addr = slot_addr(pool->start, index) + offset;
@@ -1336,13 +1364,17 @@ static void swiotlb_release_slots(struct
{
struct io_tlb_pool *mem = swiotlb_find_pool(dev, tlb_addr);
unsigned long flags;
- unsigned int offset = swiotlb_align_offset(dev, tlb_addr);
- int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
- int nslots = nr_slots(mem->slots[index].alloc_size + offset);
- int aindex = index / mem->area_nslabs;
- struct io_tlb_area *area = &mem->areas[aindex];
+ unsigned int offset = swiotlb_align_offset(dev, 0, tlb_addr);
+ int index, nslots, aindex;
+ struct io_tlb_area *area;
int count, i;
+ index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
+ index -= mem->slots[index].pad_slots;
+ nslots = nr_slots(mem->slots[index].alloc_size + offset);
+ aindex = index / mem->area_nslabs;
+ area = &mem->areas[aindex];
+
/*
* Return the buffer to the free list by setting the corresponding
* entries to indicate the number of contiguous entries available.
@@ -1365,6 +1397,7 @@ static void swiotlb_release_slots(struct
mem->slots[i].list = ++count;
mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
mem->slots[i].alloc_size = 0;
+ mem->slots[i].pad_slots = 0;
}
/*
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 241/267] nilfs2: fix potential kernel bug due to lack of writeback flag waiting
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (239 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 240/267] swiotlb: extend buffer pre-padding to alloc_align_mask if necessary Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 242/267] tick/nohz_full: Dont abuse smp_call_function_single() in tick_setup_device() Greg Kroah-Hartman
` (37 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Ryusuke Konishi, Andrew Morton
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi <konishi.ryusuke@gmail.com>
commit a4ca369ca221bb7e06c725792ac107f0e48e82e7 upstream.
Destructive writes to a block device on which nilfs2 is mounted can cause
a kernel bug in the folio/page writeback start routine or writeback end
routine (__folio_start_writeback in the log below):
kernel BUG at mm/page-writeback.c:3070!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
...
RIP: 0010:__folio_start_writeback+0xbaa/0x10e0
Code: 25 ff 0f 00 00 0f 84 18 01 00 00 e8 40 ca c6 ff e9 17 f6 ff ff
e8 36 ca c6 ff 4c 89 f7 48 c7 c6 80 c0 12 84 e8 e7 b3 0f 00 90 <0f>
0b e8 1f ca c6 ff 4c 89 f7 48 c7 c6 a0 c6 12 84 e8 d0 b3 0f 00
...
Call Trace:
<TASK>
nilfs_segctor_do_construct+0x4654/0x69d0 [nilfs2]
nilfs_segctor_construct+0x181/0x6b0 [nilfs2]
nilfs_segctor_thread+0x548/0x11c0 [nilfs2]
kthread+0x2f0/0x390
ret_from_fork+0x4b/0x80
ret_from_fork_asm+0x1a/0x30
</TASK>
This is because when the log writer starts a writeback for segment summary
blocks or a super root block that use the backing device's page cache, it
does not wait for the ongoing folio/page writeback, resulting in an
inconsistent writeback state.
Fix this issue by waiting for ongoing writebacks when putting
folios/pages on the backing device into writeback state.
Link: https://lkml.kernel.org/r/20240530141556.4411-1-konishi.ryusuke@gmail.com
Fixes: 9ff05123e3bf ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/nilfs2/segment.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -1694,6 +1694,7 @@ static void nilfs_segctor_prepare_write(
if (bh->b_page != bd_page) {
if (bd_page) {
lock_page(bd_page);
+ wait_on_page_writeback(bd_page);
clear_page_dirty_for_io(bd_page);
set_page_writeback(bd_page);
unlock_page(bd_page);
@@ -1707,6 +1708,7 @@ static void nilfs_segctor_prepare_write(
if (bh == segbuf->sb_super_root) {
if (bh->b_page != bd_page) {
lock_page(bd_page);
+ wait_on_page_writeback(bd_page);
clear_page_dirty_for_io(bd_page);
set_page_writeback(bd_page);
unlock_page(bd_page);
@@ -1723,6 +1725,7 @@ static void nilfs_segctor_prepare_write(
}
if (bd_page) {
lock_page(bd_page);
+ wait_on_page_writeback(bd_page);
clear_page_dirty_for_io(bd_page);
set_page_writeback(bd_page);
unlock_page(bd_page);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 242/267] tick/nohz_full: Dont abuse smp_call_function_single() in tick_setup_device()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (240 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 241/267] nilfs2: fix potential kernel bug due to lack of writeback flag waiting Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 243/267] mm/huge_memory: dont unpoison huge_zero_folio Greg Kroah-Hartman
` (36 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Oleg Nesterov, Thomas Gleixner
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov <oleg@redhat.com>
commit 07c54cc5988f19c9642fd463c2dbdac7fc52f777 upstream.
After the recent commit 5097cbcb38e6 ("sched/isolation: Prevent boot crash
when the boot CPU is nohz_full") the kernel no longer crashes, but there is
another problem.
In this case tick_setup_device() calls tick_take_do_timer_from_boot() to
update tick_do_timer_cpu and this triggers the WARN_ON_ONCE(irqs_disabled)
in smp_call_function_single().
Kill tick_take_do_timer_from_boot() and just use WRITE_ONCE(), the new
comment explains why this is safe (thanks Thomas!).
Fixes: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240528122019.GA28794@redhat.com
Link: https://lore.kernel.org/all/20240522151742.GA10400@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/time/tick-common.c | 42 ++++++++++++++----------------------------
1 file changed, 14 insertions(+), 28 deletions(-)
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -179,26 +179,6 @@ void tick_setup_periodic(struct clock_ev
}
}
-#ifdef CONFIG_NO_HZ_FULL
-static void giveup_do_timer(void *info)
-{
- int cpu = *(unsigned int *)info;
-
- WARN_ON(tick_do_timer_cpu != smp_processor_id());
-
- tick_do_timer_cpu = cpu;
-}
-
-static void tick_take_do_timer_from_boot(void)
-{
- int cpu = smp_processor_id();
- int from = tick_do_timer_boot_cpu;
-
- if (from >= 0 && from != cpu)
- smp_call_function_single(from, giveup_do_timer, &cpu, 1);
-}
-#endif
-
/*
* Setup the tick device
*/
@@ -222,19 +202,25 @@ static void tick_setup_device(struct tic
tick_next_period = ktime_get();
#ifdef CONFIG_NO_HZ_FULL
/*
- * The boot CPU may be nohz_full, in which case set
- * tick_do_timer_boot_cpu so the first housekeeping
- * secondary that comes up will take do_timer from
- * us.
+ * The boot CPU may be nohz_full, in which case the
+ * first housekeeping secondary will take do_timer()
+ * from it.
*/
if (tick_nohz_full_cpu(cpu))
tick_do_timer_boot_cpu = cpu;
- } else if (tick_do_timer_boot_cpu != -1 &&
- !tick_nohz_full_cpu(cpu)) {
- tick_take_do_timer_from_boot();
+ } else if (tick_do_timer_boot_cpu != -1 && !tick_nohz_full_cpu(cpu)) {
tick_do_timer_boot_cpu = -1;
- WARN_ON(tick_do_timer_cpu != cpu);
+ /*
+ * The boot CPU will stay in periodic (NOHZ disabled)
+ * mode until clocksource_done_booting() called after
+ * smp_init() selects a high resolution clocksource and
+ * timekeeping_notify() kicks the NOHZ stuff alive.
+ *
+ * So this WRITE_ONCE can only race with the READ_ONCE
+ * check in tick_periodic() but this race is harmless.
+ */
+ WRITE_ONCE(tick_do_timer_cpu, cpu);
#endif
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 243/267] mm/huge_memory: dont unpoison huge_zero_folio
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (241 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 242/267] tick/nohz_full: Dont abuse smp_call_function_single() in tick_setup_device() Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 244/267] serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level Greg Kroah-Hartman
` (35 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Miaohe Lin, David Hildenbrand,
Yang Shi, Oscar Salvador, Anshuman Khandual, Naoya Horiguchi,
Xu Yu, Andrew Morton
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaohe Lin <linmiaohe@huawei.com>
commit fe6f86f4b40855a130a19aa589f9ba7f650423f4 upstream.
When I did memory failure tests recently, below panic occurs:
kernel BUG at include/linux/mm.h:1135!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 9 PID: 137 Comm: kswapd1 Not tainted 6.9.0-rc4-00491-gd5ce28f156fe-dirty #14
RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
Call Trace:
<TASK>
do_shrink_slab+0x14f/0x6a0
shrink_slab+0xca/0x8c0
shrink_node+0x2d0/0x7d0
balance_pgdat+0x33a/0x720
kswapd+0x1f3/0x410
kthread+0xd5/0x100
ret_from_fork+0x2f/0x50
ret_from_fork_asm+0x1a/0x30
</TASK>
Modules linked in: mce_inject hwpoison_inject
---[ end trace 0000000000000000 ]---
RIP: 0010:shrink_huge_zero_page_scan+0x168/0x1a0
RSP: 0018:ffff9933c6c57bd0 EFLAGS: 00000246
RAX: 000000000000003e RBX: 0000000000000000 RCX: ffff88f61fc5c9c8
RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff88f61fc5c9c0
RBP: ffffcd7c446b0000 R08: ffffffff9a9405f0 R09: 0000000000005492
R10: 00000000000030ea R11: ffffffff9a9405f0 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88e703c4ac00
FS: 0000000000000000(0000) GS:ffff88f61fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f4da6e9878 CR3: 0000000c71048000 CR4: 00000000000006f0
The root cause is that HWPoison flag will be set for huge_zero_folio
without increasing the folio refcnt. But then unpoison_memory() will
decrease the folio refcnt unexpectedly as it appears like a successfully
hwpoisoned folio leading to VM_BUG_ON_PAGE(page_ref_count(page) == 0) when
releasing huge_zero_folio.
Skip unpoisoning huge_zero_folio in unpoison_memory() to fix this issue.
We're not prepared to unpoison huge_zero_folio yet.
Link: https://lkml.kernel.org/r/20240516122608.22610-1-linmiaohe@huawei.com
Fixes: 478d134e9506 ("mm/huge_memory: do not overkill when splitting huge_zero_page")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Xu Yu <xuyu@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/memory-failure.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2535,6 +2535,13 @@ int unpoison_memory(unsigned long pfn)
goto unlock_mutex;
}
+ if (is_huge_zero_page(&folio->page)) {
+ unpoison_pr_info("Unpoison: huge zero page is not supported %#lx\n",
+ pfn, &unpoison_rs);
+ ret = -EOPNOTSUPP;
+ goto unlock_mutex;
+ }
+
if (!PageHWPoison(p)) {
unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
pfn, &unpoison_rs);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 244/267] serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (242 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 243/267] mm/huge_memory: dont unpoison huge_zero_folio Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 245/267] Revert "fork: defer linking file vma until vma is fully initialized" Greg Kroah-Hartman
` (34 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Doug Brown, stable
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Doug Brown <doug@schmorgal.com>
commit 5208e7ced520a813b4f4774451fbac4e517e78b2 upstream.
The FIFO is 64 bytes, but the FCR is configured to fire the TX interrupt
when the FIFO is half empty (bit 3 = 0). Thus, we should only write 32
bytes when a TX interrupt occurs.
This fixes a problem observed on the PXA168 that dropped a bunch of TX
bytes during large transmissions.
Fixes: ab28f51c77cd ("serial: rewrite pxa2xx-uart to use 8250_core")
Signed-off-by: Doug Brown <doug@schmorgal.com>
Link: https://lore.kernel.org/r/20240519191929.122202-1-doug@schmorgal.com
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/tty/serial/8250/8250_pxa.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/tty/serial/8250/8250_pxa.c
+++ b/drivers/tty/serial/8250/8250_pxa.c
@@ -124,6 +124,7 @@ static int serial_pxa_probe(struct platf
uart.port.regshift = 2;
uart.port.irq = irq;
uart.port.fifosize = 64;
+ uart.tx_loadsz = 32;
uart.port.flags = UPF_IOREMAP | UPF_SKIP_TEST | UPF_FIXED_TYPE;
uart.port.dev = &pdev->dev;
uart.port.uartclk = clk_get_rate(data->clk);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 245/267] Revert "fork: defer linking file vma until vma is fully initialized"
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (243 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 244/267] serial: 8250_pxa: Configure tx_loadsz to match FIFO IRQ level Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 246/267] selftests/net: add lib.sh Greg Kroah-Hartman
` (33 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Leah Rumancik, Sam James
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sam James <sam@gentoo.org>
This reverts commit cec11fa2eb512ebe3a459c185f4aca1d44059bbf which is commit
35e351780fa9d8240dd6f7e4f245f9ea37e96c19 upstream.
The backport is incomplete and causes xfstests failures. The consequences
of the incomplete backport seem worse than the original issue, so pick
the lesser evil and revert until a full backport is ready.
Link: https://lore.kernel.org/stable/20240604004751.3883227-1-leah.rumancik@gmail.com/
Reported-by: Leah Rumancik <leah.rumancik@gmail.com>
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/fork.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -727,15 +727,6 @@ static __latent_entropy int dup_mmap(str
} else if (anon_vma_fork(tmp, mpnt))
goto fail_nomem_anon_vma_fork;
vm_flags_clear(tmp, VM_LOCKED_MASK);
- /*
- * Copy/update hugetlb private vma information.
- */
- if (is_vm_hugetlb_page(tmp))
- hugetlb_dup_vma_private(tmp);
-
- if (tmp->vm_ops && tmp->vm_ops->open)
- tmp->vm_ops->open(tmp);
-
file = tmp->vm_file;
if (file) {
struct address_space *mapping = file->f_mapping;
@@ -752,6 +743,12 @@ static __latent_entropy int dup_mmap(str
i_mmap_unlock_write(mapping);
}
+ /*
+ * Copy/update hugetlb private vma information.
+ */
+ if (is_vm_hugetlb_page(tmp))
+ hugetlb_dup_vma_private(tmp);
+
/* Link the vma into the MT */
if (vma_iter_bulk_store(&vmi, tmp))
goto fail_nomem_vmi_store;
@@ -760,6 +757,9 @@ static __latent_entropy int dup_mmap(str
if (!(tmp->vm_flags & VM_WIPEONFORK))
retval = copy_page_range(tmp, mpnt);
+ if (tmp->vm_ops && tmp->vm_ops->open)
+ tmp->vm_ops->open(tmp);
+
if (retval)
goto loop_out;
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 246/267] selftests/net: add lib.sh
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (244 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 245/267] Revert "fork: defer linking file vma until vma is fully initialized" Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 247/267] selftests/net: add variable NS_LIST for lib.sh Greg Kroah-Hartman
` (32 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Petr Machata, Hangbin Liu,
Paolo Abeni, Po-Hsu Lin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu <liuhangbin@gmail.com>
commit 25ae948b447881bf689d459cd5bd4629d9c04b20 upstream.
Add a lib.sh for net selftests. This file can be used to define commonly
used variables and functions. Some commonly used functions can be moved
from forwarding/lib.sh to this lib file. e.g. busywait().
Add function setup_ns() for user to create unique namespaces with given
prefix name.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
[PHLin: add lib.sh to TEST_FILES directly as we already have upstream
commit 06efafd8 landed in 6.6.y]
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/Makefile | 2
tools/testing/selftests/net/forwarding/lib.sh | 27 --------
tools/testing/selftests/net/lib.sh | 85 ++++++++++++++++++++++++++
3 files changed, 87 insertions(+), 27 deletions(-)
create mode 100644 tools/testing/selftests/net/lib.sh
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -92,7 +92,7 @@ TEST_PROGS += test_vxlan_nolocalbypass.s
TEST_PROGS += test_bridge_backup_port.sh
TEST_FILES := settings
-TEST_FILES += in_netns.sh net_helper.sh setup_loopback.sh setup_veth.sh
+TEST_FILES += in_netns.sh lib.sh net_helper.sh setup_loopback.sh setup_veth.sh
include ../lib.mk
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -4,9 +4,6 @@
##############################################################################
# Defines
-# Kselftest framework requirement - SKIP code is 4.
-ksft_skip=4
-
# Can be overridden by the configuration file.
PING=${PING:=ping}
PING6=${PING6:=ping6}
@@ -41,6 +38,7 @@ if [[ -f $relative_path/forwarding.confi
source "$relative_path/forwarding.config"
fi
+source ../lib.sh
##############################################################################
# Sanity checks
@@ -395,29 +393,6 @@ log_info()
echo "INFO: $msg"
}
-busywait()
-{
- local timeout=$1; shift
-
- local start_time="$(date -u +%s%3N)"
- while true
- do
- local out
- out=$("$@")
- local ret=$?
- if ((!ret)); then
- echo -n "$out"
- return 0
- fi
-
- local current_time="$(date -u +%s%3N)"
- if ((current_time - start_time > timeout)); then
- echo -n "$out"
- return 1
- fi
- done
-}
-
not()
{
"$@"
--- /dev/null
+++ b/tools/testing/selftests/net/lib.sh
@@ -0,0 +1,85 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+##############################################################################
+# Defines
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+##############################################################################
+# Helpers
+busywait()
+{
+ local timeout=$1; shift
+
+ local start_time="$(date -u +%s%3N)"
+ while true
+ do
+ local out
+ out=$("$@")
+ local ret=$?
+ if ((!ret)); then
+ echo -n "$out"
+ return 0
+ fi
+
+ local current_time="$(date -u +%s%3N)"
+ if ((current_time - start_time > timeout)); then
+ echo -n "$out"
+ return 1
+ fi
+ done
+}
+
+cleanup_ns()
+{
+ local ns=""
+ local errexit=0
+ local ret=0
+
+ # disable errexit temporary
+ if [[ $- =~ "e" ]]; then
+ errexit=1
+ set +e
+ fi
+
+ for ns in "$@"; do
+ ip netns delete "${ns}" &> /dev/null
+ if ! busywait 2 ip netns list \| grep -vq "^$ns$" &> /dev/null; then
+ echo "Warn: Failed to remove namespace $ns"
+ ret=1
+ fi
+ done
+
+ [ $errexit -eq 1 ] && set -e
+ return $ret
+}
+
+# setup netns with given names as prefix. e.g
+# setup_ns local remote
+setup_ns()
+{
+ local ns=""
+ local ns_name=""
+ local ns_list=""
+ for ns_name in "$@"; do
+ # Some test may setup/remove same netns multi times
+ if unset ${ns_name} 2> /dev/null; then
+ ns="${ns_name,,}-$(mktemp -u XXXXXX)"
+ eval readonly ${ns_name}="$ns"
+ else
+ eval ns='$'${ns_name}
+ cleanup_ns "$ns"
+
+ fi
+
+ if ! ip netns add "$ns"; then
+ echo "Failed to create namespace $ns_name"
+ cleanup_ns "$ns_list"
+ return $ksft_skip
+ fi
+ ip -n "$ns" link set lo up
+ ns_list="$ns_list $ns"
+ done
+}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 247/267] selftests/net: add variable NS_LIST for lib.sh
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (245 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 246/267] selftests/net: add lib.sh Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 248/267] selftests: forwarding: Avoid failures to source net/lib.sh Greg Kroah-Hartman
` (31 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Hangbin Liu, Jakub Kicinski,
Po-Hsu Lin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu <liuhangbin@gmail.com>
commit b6925b4ed57cccf42ca0fb46c7446f0859e7ad4b upstream.
Add a global variable NS_LIST to store all the namespaces that setup_ns
created, so the caller could call cleanup_all_ns() instead of remember
all the netns names when using cleanup_ns().
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20231213060856.4030084-2-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/lib.sh | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -6,6 +6,8 @@
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
+# namespace list created by setup_ns
+NS_LIST=""
##############################################################################
# Helpers
@@ -56,6 +58,11 @@ cleanup_ns()
return $ret
}
+cleanup_all_ns()
+{
+ cleanup_ns $NS_LIST
+}
+
# setup netns with given names as prefix. e.g
# setup_ns local remote
setup_ns()
@@ -82,4 +89,5 @@ setup_ns()
ip -n "$ns" link set lo up
ns_list="$ns_list $ns"
done
+ NS_LIST="$NS_LIST $ns_list"
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 248/267] selftests: forwarding: Avoid failures to source net/lib.sh
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (246 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 247/267] selftests/net: add variable NS_LIST for lib.sh Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 249/267] remoteproc: k3-r5: Jump to error handling labels in start/stop errors Greg Kroah-Hartman
` (30 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Ido Schimmel, Petr Machata,
Benjamin Poirier, Hangbin Liu, Jakub Kicinski, Po-Hsu Lin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Poirier <bpoirier@nvidia.com>
commit 2114e83381d3289a88378850f43069e79f848083 upstream.
The expression "source ../lib.sh" added to net/forwarding/lib.sh in commit
25ae948b4478 ("selftests/net: add lib.sh") does not work for tests outside
net/forwarding which source net/forwarding/lib.sh (1). It also does not
work in some cases where only a subset of tests are exported (2).
Avoid the problems mentioned above by replacing the faulty expression with
a copy of the content from net/lib.sh which is used by files under
net/forwarding.
A more thorough solution which avoids duplicating content between
net/lib.sh and net/forwarding/lib.sh has been posted here:
https://lore.kernel.org/netdev/20231222135836.992841-1-bpoirier@nvidia.com/
The approach in the current patch is a stopgap solution to avoid submitting
large changes at the eleventh hour of this development cycle.
Example of problem 1)
tools/testing/selftests/drivers/net/bonding$ ./dev_addr_lists.sh
./net_forwarding_lib.sh: line 41: ../lib.sh: No such file or directory
TEST: bonding cleanup mode active-backup [ OK ]
TEST: bonding cleanup mode 802.3ad [ OK ]
TEST: bonding LACPDU multicast address to slave (from bond down) [ OK ]
TEST: bonding LACPDU multicast address to slave (from bond up) [ OK ]
An error message is printed but since the test does not use functions from
net/lib.sh, the test results are not affected.
Example of problem 2)
tools/testing/selftests$ make install TARGETS="net/forwarding"
tools/testing/selftests$ cd kselftest_install/net/forwarding/
tools/testing/selftests/kselftest_install/net/forwarding$ ./pedit_ip.sh veth{0..3}
lib.sh: line 41: ../lib.sh: No such file or directory
TEST: ping [ OK ]
TEST: ping6 [ OK ]
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip src set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip src set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip dst set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip dst set 198.51.100.1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip6 src set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip6 src set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth1 ingress pedit ip6 dst set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
./pedit_ip.sh: line 135: busywait: command not found
TEST: dev veth2 egress pedit ip6 dst set 2001:db8:2::1 [FAIL]
Expected to get 10 packets, but got .
In this case, the test results are affected.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Suggested-by: Ido Schimmel <idosch@nvidia.com>
Suggested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240104141109.100672-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/forwarding/lib.sh | 27 +++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -38,7 +38,32 @@ if [[ -f $relative_path/forwarding.confi
source "$relative_path/forwarding.config"
fi
-source ../lib.sh
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+busywait()
+{
+ local timeout=$1; shift
+
+ local start_time="$(date -u +%s%3N)"
+ while true
+ do
+ local out
+ out=$("$@")
+ local ret=$?
+ if ((!ret)); then
+ echo -n "$out"
+ return 0
+ fi
+
+ local current_time="$(date -u +%s%3N)"
+ if ((current_time - start_time > timeout)); then
+ echo -n "$out"
+ return 1
+ fi
+ done
+}
+
##############################################################################
# Sanity checks
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 249/267] remoteproc: k3-r5: Jump to error handling labels in start/stop errors
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (247 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 248/267] selftests: forwarding: Avoid failures to source net/lib.sh Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 250/267] cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode Greg Kroah-Hartman
` (29 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Beleswar Padhi, Mathieu Poirier
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Beleswar Padhi <b-padhi@ti.com>
commit 1dc7242f6ee0c99852cb90676d7fe201cf5de422 upstream.
In case of errors during core start operation from sysfs, the driver
directly returns with the -EPERM error code. Fix this to ensure that
mailbox channels are freed on error before returning by jumping to the
'put_mbox' error handling label. Similarly, jump to the 'out' error
handling label to return with required -EPERM error code during the
core stop operation from sysfs.
Fixes: 3c8a9066d584 ("remoteproc: k3-r5: Do not allow core1 to power up before core0 via sysfs")
Signed-off-by: Beleswar Padhi <b-padhi@ti.com>
Link: https://lore.kernel.org/r/20240506141849.1735679-1-b-padhi@ti.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/remoteproc/ti_k3_r5_remoteproc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -580,7 +580,8 @@ static int k3_r5_rproc_start(struct rpro
if (core != core0 && core0->rproc->state == RPROC_OFFLINE) {
dev_err(dev, "%s: can not start core 1 before core 0\n",
__func__);
- return -EPERM;
+ ret = -EPERM;
+ goto put_mbox;
}
ret = k3_r5_core_run(core);
@@ -648,7 +649,8 @@ static int k3_r5_rproc_stop(struct rproc
if (core != core1 && core1->rproc->state != RPROC_OFFLINE) {
dev_err(dev, "%s: can not stop core 0 before core 1\n",
__func__);
- return -EPERM;
+ ret = -EPERM;
+ goto out;
}
ret = k3_r5_core_halt(core);
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 250/267] cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (248 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 249/267] remoteproc: k3-r5: Jump to error handling labels in start/stop errors Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 251/267] selftests/net/lib: update busywait timeout value Greg Kroah-Hartman
` (28 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Marc Dionne, David Howells,
Gao Xiang, Chao Yu, Yue Hu, Jeffle Xu, linux-erofs, netfs,
linux-fsdevel
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells <dhowells@redhat.com>
commit c3d6569a43322f371e7ba0ad386112723757ac8f upstream.
cachefiles_ondemand_init_object() as called from cachefiles_open_file() and
cachefiles_create_tmpfile() does not check if object->ondemand is set
before dereferencing it, leading to an oops something like:
RIP: 0010:cachefiles_ondemand_init_object+0x9/0x41
...
Call Trace:
<TASK>
cachefiles_open_file+0xc9/0x187
cachefiles_lookup_cookie+0x122/0x2be
fscache_cookie_state_machine+0xbe/0x32b
fscache_cookie_worker+0x1f/0x2d
process_one_work+0x136/0x208
process_scheduled_works+0x3a/0x41
worker_thread+0x1a2/0x1f6
kthread+0xca/0xd2
ret_from_fork+0x21/0x33
Fix this by making cachefiles_ondemand_init_object() return immediately if
cachefiles->ondemand is NULL.
Fixes: 3c5ecfe16e76 ("cachefiles: extract ondemand info field from cachefiles_object")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Gao Xiang <xiang@kernel.org>
cc: Chao Yu <chao@kernel.org>
cc: Yue Hu <huyue2@coolpad.com>
cc: Jeffle Xu <jefflexu@linux.alibaba.com>
cc: linux-erofs@lists.ozlabs.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/cachefiles/ondemand.c | 3 +++
1 file changed, 3 insertions(+)
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -611,6 +611,9 @@ int cachefiles_ondemand_init_object(stru
struct fscache_volume *volume = object->volume->vcookie;
size_t volume_key_size, cookie_key_size, data_len;
+ if (!object->ondemand)
+ return 0;
+
/*
* CacheFiles will firstly check the cache file under the root cache
* directory. If the coherency check failed, it will fallback to
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 251/267] selftests/net/lib: update busywait timeout value
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (249 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 250/267] cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 252/267] selftests/net/lib: no need to record ns name if it already exist Greg Kroah-Hartman
` (27 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hangbin Liu, Simon Horman,
Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu <liuhangbin@gmail.com>
commit fc836129f708407502632107e58d48f54b1caf75 upstream.
The busywait timeout value is a millisecond, not a second. So the
current setting 2 is too small. On slow/busy host (or VMs) the
current timeout can expire even on "correct" execution, causing random
failures. Let's copy the WAIT_TIMEOUT from forwarding/lib.sh and set
BUSYWAIT_TIMEOUT here.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240124061344.1864484-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/lib.sh | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -4,6 +4,9 @@
##############################################################################
# Defines
+WAIT_TIMEOUT=${WAIT_TIMEOUT:=20}
+BUSYWAIT_TIMEOUT=$((WAIT_TIMEOUT * 1000)) # ms
+
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
# namespace list created by setup_ns
@@ -48,7 +51,7 @@ cleanup_ns()
for ns in "$@"; do
ip netns delete "${ns}" &> /dev/null
- if ! busywait 2 ip netns list \| grep -vq "^$ns$" &> /dev/null; then
+ if ! busywait $BUSYWAIT_TIMEOUT ip netns list \| grep -vq "^$ns$" &> /dev/null; then
echo "Warn: Failed to remove namespace $ns"
ret=1
fi
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 252/267] selftests/net/lib: no need to record ns name if it already exist
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (250 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 251/267] selftests/net/lib: update busywait timeout value Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 253/267] selftests: net: lib: support errexit with busywait Greg Kroah-Hartman
` (26 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Hangbin Liu, Simon Horman,
David S. Miller
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hangbin Liu <liuhangbin@gmail.com>
commit 83e93942796db58652288f0391ac00072401816f upstream.
There is no need to add the name to ns_list again if the netns already
recoreded.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/lib.sh | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -73,15 +73,17 @@ setup_ns()
local ns=""
local ns_name=""
local ns_list=""
+ local ns_exist=
for ns_name in "$@"; do
# Some test may setup/remove same netns multi times
if unset ${ns_name} 2> /dev/null; then
ns="${ns_name,,}-$(mktemp -u XXXXXX)"
eval readonly ${ns_name}="$ns"
+ ns_exist=false
else
eval ns='$'${ns_name}
cleanup_ns "$ns"
-
+ ns_exist=true
fi
if ! ip netns add "$ns"; then
@@ -90,7 +92,7 @@ setup_ns()
return $ksft_skip
fi
ip -n "$ns" link set lo up
- ns_list="$ns_list $ns"
+ ! $ns_exist && ns_list="$ns_list $ns"
done
NS_LIST="$NS_LIST $ns_list"
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 253/267] selftests: net: lib: support errexit with busywait
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (251 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 252/267] selftests/net/lib: no need to record ns name if it already exist Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 254/267] selftests: net: lib: avoid error removing empty netns name Greg Kroah-Hartman
` (25 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Geliang Tang, Matthieu Baerts (NGI0),
Hangbin Liu, Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) <matttbe@kernel.org>
commit 41b02ea4c0adfcc6761fbfed42c3ce6b6412d881 upstream.
If errexit is enabled ('set -e'), loopy_wait -- or busywait and others
using it -- will stop after the first failure.
Note that if the returned status of loopy_wait is checked, and even if
errexit is enabled, Bash will not stop at the first error.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Cc: stable@vger.kernel.org
Acked-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-1-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/lib.sh | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -22,9 +22,7 @@ busywait()
while true
do
local out
- out=$("$@")
- local ret=$?
- if ((!ret)); then
+ if out=$("$@"); then
echo -n "$out"
return 0
fi
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 254/267] selftests: net: lib: avoid error removing empty netns name
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (252 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 253/267] selftests: net: lib: support errexit with busywait Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 255/267] greybus: Fix use-after-free bug in gb_interface_release due to race condition Greg Kroah-Hartman
` (24 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Geliang Tang, Matthieu Baerts (NGI0),
Petr Machata, Hangbin Liu, Jakub Kicinski
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) <matttbe@kernel.org>
commit 79322174bcc780b99795cb89d237b26006a8b94b upstream.
If there is an error to create the first netns with 'setup_ns()',
'cleanup_ns()' will be called with an empty string as first parameter.
The consequences is that 'cleanup_ns()' will try to delete an invalid
netns, and wait 20 seconds if the netns list is empty.
Instead of just checking if the name is not empty, convert the string
separated by spaces to an array. Manipulating the array is cleaner, and
calling 'cleanup_ns()' with an empty array will be a no-op.
Fixes: 25ae948b4478 ("selftests/net: add lib.sh")
Cc: stable@vger.kernel.org
Acked-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-2-b3afadd368c9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
tools/testing/selftests/net/lib.sh | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -10,7 +10,7 @@ BUSYWAIT_TIMEOUT=$((WAIT_TIMEOUT * 1000)
# Kselftest framework requirement - SKIP code is 4.
ksft_skip=4
# namespace list created by setup_ns
-NS_LIST=""
+NS_LIST=()
##############################################################################
# Helpers
@@ -48,6 +48,7 @@ cleanup_ns()
fi
for ns in "$@"; do
+ [ -z "${ns}" ] && continue
ip netns delete "${ns}" &> /dev/null
if ! busywait $BUSYWAIT_TIMEOUT ip netns list \| grep -vq "^$ns$" &> /dev/null; then
echo "Warn: Failed to remove namespace $ns"
@@ -61,7 +62,7 @@ cleanup_ns()
cleanup_all_ns()
{
- cleanup_ns $NS_LIST
+ cleanup_ns "${NS_LIST[@]}"
}
# setup netns with given names as prefix. e.g
@@ -70,7 +71,7 @@ setup_ns()
{
local ns=""
local ns_name=""
- local ns_list=""
+ local ns_list=()
local ns_exist=
for ns_name in "$@"; do
# Some test may setup/remove same netns multi times
@@ -86,11 +87,11 @@ setup_ns()
if ! ip netns add "$ns"; then
echo "Failed to create namespace $ns_name"
- cleanup_ns "$ns_list"
+ cleanup_ns "${ns_list[@]}"
return $ksft_skip
fi
ip -n "$ns" link set lo up
- ! $ns_exist && ns_list="$ns_list $ns"
+ ! $ns_exist && ns_list+=("$ns")
done
- NS_LIST="$NS_LIST $ns_list"
+ NS_LIST+=("${ns_list[@]}")
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 255/267] greybus: Fix use-after-free bug in gb_interface_release due to race condition.
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (253 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 254/267] selftests: net: lib: avoid error removing empty netns name Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 256/267] ima: Fix use-after-free on a dentrys dname.name Greg Kroah-Hartman
` (23 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Sicong Huang, Ronnie Sahlberg
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sicong Huang <congei42@163.com>
commit 5c9c5d7f26acc2c669c1dcf57d1bb43ee99220ce upstream.
In gb_interface_create, &intf->mode_switch_completion is bound with
gb_interface_mode_switch_work. Then it will be started by
gb_interface_request_mode_switch. Here is the relevant code.
if (!queue_work(system_long_wq, &intf->mode_switch_work)) {
...
}
If we call gb_interface_release to make cleanup, there may be an
unfinished work. This function will call kfree to free the object
"intf". However, if gb_interface_mode_switch_work is scheduled to
run after kfree, it may cause use-after-free error as
gb_interface_mode_switch_work will use the object "intf".
The possible execution flow that may lead to the issue is as follows:
CPU0 CPU1
| gb_interface_create
| gb_interface_request_mode_switch
gb_interface_release |
kfree(intf) (free) |
| gb_interface_mode_switch_work
| mutex_lock(&intf->mutex) (use)
Fix it by canceling the work before kfree.
Signed-off-by: Sicong Huang <congei42@163.com>
Link: https://lore.kernel.org/r/20240416080313.92306-1-congei42@163.com
Cc: Ronnie Sahlberg <rsahlberg@ciq.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/greybus/interface.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/greybus/interface.c
+++ b/drivers/greybus/interface.c
@@ -694,6 +694,7 @@ static void gb_interface_release(struct
trace_gb_interface_release(intf);
+ cancel_work_sync(&intf->mode_switch_work);
kfree(intf);
}
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 256/267] ima: Fix use-after-free on a dentrys dname.name
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (254 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 255/267] greybus: Fix use-after-free bug in gb_interface_release due to race condition Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 257/267] device property: Implement device_is_big_endian() Greg Kroah-Hartman
` (22 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Al Viro, Stefan Berger, Mimi Zohar
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Berger <stefanb@linux.ibm.com>
commit be84f32bb2c981ca670922e047cdde1488b233de upstream.
->d_name.name can change on rename and the earlier value can be freed;
there are conditions sufficient to stabilize it (->d_lock on dentry,
->d_lock on its parent, ->i_rwsem exclusive on the parent's inode,
rename_lock), but none of those are met at any of the sites. Take a stable
snapshot of the name instead.
Link: https://lore.kernel.org/all/20240202182732.GE2087318@ZenIV/
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
security/integrity/ima/ima_api.c | 16 ++++++++++++----
security/integrity/ima/ima_template_lib.c | 17 ++++++++++++++---
2 files changed, 26 insertions(+), 7 deletions(-)
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -244,8 +244,8 @@ int ima_collect_measurement(struct integ
const char *audit_cause = "failed";
struct inode *inode = file_inode(file);
struct inode *real_inode = d_real_inode(file_dentry(file));
- const char *filename = file->f_path.dentry->d_name.name;
struct ima_max_digest_data hash;
+ struct name_snapshot filename;
struct kstat stat;
int result = 0;
int length;
@@ -316,9 +316,13 @@ out:
if (file->f_flags & O_DIRECT)
audit_cause = "failed(directio)";
+ take_dentry_name_snapshot(&filename, file->f_path.dentry);
+
integrity_audit_msg(AUDIT_INTEGRITY_DATA, inode,
- filename, "collect_data", audit_cause,
- result, 0);
+ filename.name.name, "collect_data",
+ audit_cause, result, 0);
+
+ release_dentry_name_snapshot(&filename);
}
return result;
}
@@ -431,6 +435,7 @@ out:
*/
const char *ima_d_path(const struct path *path, char **pathbuf, char *namebuf)
{
+ struct name_snapshot filename;
char *pathname = NULL;
*pathbuf = __getname();
@@ -444,7 +449,10 @@ const char *ima_d_path(const struct path
}
if (!pathname) {
- strscpy(namebuf, path->dentry->d_name.name, NAME_MAX);
+ take_dentry_name_snapshot(&filename, path->dentry);
+ strscpy(namebuf, filename.name.name, NAME_MAX);
+ release_dentry_name_snapshot(&filename);
+
pathname = namebuf;
}
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -483,7 +483,10 @@ static int ima_eventname_init_common(str
bool size_limit)
{
const char *cur_filename = NULL;
+ struct name_snapshot filename;
u32 cur_filename_len = 0;
+ bool snapshot = false;
+ int ret;
BUG_ON(event_data->filename == NULL && event_data->file == NULL);
@@ -496,7 +499,10 @@ static int ima_eventname_init_common(str
}
if (event_data->file) {
- cur_filename = event_data->file->f_path.dentry->d_name.name;
+ take_dentry_name_snapshot(&filename,
+ event_data->file->f_path.dentry);
+ snapshot = true;
+ cur_filename = filename.name.name;
cur_filename_len = strlen(cur_filename);
} else
/*
@@ -505,8 +511,13 @@ static int ima_eventname_init_common(str
*/
cur_filename_len = IMA_EVENT_NAME_LEN_MAX;
out:
- return ima_write_template_field_data(cur_filename, cur_filename_len,
- DATA_FMT_STRING, field_data);
+ ret = ima_write_template_field_data(cur_filename, cur_filename_len,
+ DATA_FMT_STRING, field_data);
+
+ if (snapshot)
+ release_dentry_name_snapshot(&filename);
+
+ return ret;
}
/*
^ permalink raw reply [flat|nested] 282+ messages in thread* [PATCH 6.6 257/267] device property: Implement device_is_big_endian()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (255 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 256/267] ima: Fix use-after-free on a dentrys dname.name Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 258/267] serial: core: Add UPIO_UNKNOWN constant for unknown port type Greg Kroah-Hartman
` (21 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Andy Shevchenko, Linus Walleij,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit 826a5d8c9df9605fb4fdefa45432f95580241a1f ]
Some users want to use the struct device pointer to see if the
device is big endian in terms of Open Firmware specifications,
i.e. if it has a "big-endian" property, or if the kernel was
compiled for BE *and* the device has a "native-endian" property.
Provide inline helper for the users.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20231025184259.250588-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/property.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/include/linux/property.h b/include/linux/property.h
index 8c3c6685a2ae3..1684fca930f72 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -79,12 +79,38 @@ int fwnode_property_match_string(const struct fwnode_handle *fwnode,
bool fwnode_device_is_available(const struct fwnode_handle *fwnode);
+static inline bool fwnode_device_is_big_endian(const struct fwnode_handle *fwnode)
+{
+ if (fwnode_property_present(fwnode, "big-endian"))
+ return true;
+ if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) &&
+ fwnode_property_present(fwnode, "native-endian"))
+ return true;
+ return false;
+}
+
static inline
bool fwnode_device_is_compatible(const struct fwnode_handle *fwnode, const char *compat)
{
return fwnode_property_match_string(fwnode, "compatible", compat) >= 0;
}
+/**
+ * device_is_big_endian - check if a device has BE registers
+ * @dev: Pointer to the struct device
+ *
+ * Returns: true if the device has a "big-endian" property, or if the kernel
+ * was compiled for BE *and* the device has a "native-endian" property.
+ * Returns false otherwise.
+ *
+ * Callers would nominally use ioread32be/iowrite32be if
+ * device_is_big_endian() == true, or readl/writel otherwise.
+ */
+static inline bool device_is_big_endian(const struct device *dev)
+{
+ return fwnode_device_is_big_endian(dev_fwnode(dev));
+}
+
/**
* device_is_compatible - match 'compatible' property of the device with a given string
* @dev: Pointer to the struct device
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 258/267] serial: core: Add UPIO_UNKNOWN constant for unknown port type
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (256 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 257/267] device property: Implement device_is_big_endian() Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 259/267] serial: port: Introduce a common helper to read properties Greg Kroah-Hartman
` (20 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Andy Shevchenko, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit 79d713baf63c8f23cc58b304c40be33d64a12aaf ]
In some APIs we would like to assign the special value to iotype
and compare against it in another places. Introduce UPIO_UNKNOWN
for this purpose.
Note, we can't use 0, because it's a valid value for IO port access.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/serial_core.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index a7d5fa892be26..412de73547521 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -470,6 +470,7 @@ struct uart_port {
unsigned char iotype; /* io access style */
unsigned char quirks; /* internal quirks */
+#define UPIO_UNKNOWN ((unsigned char)~0U) /* UCHAR_MAX */
#define UPIO_PORT (SERIAL_IO_PORT) /* 8b I/O port access */
#define UPIO_HUB6 (SERIAL_IO_HUB6) /* Hub6 ISA card */
#define UPIO_MEM (SERIAL_IO_MEM) /* driver-specific */
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 259/267] serial: port: Introduce a common helper to read properties
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (257 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 258/267] serial: core: Add UPIO_UNKNOWN constant for unknown port type Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 260/267] serial: 8250_dw: Switch to use uart_read_port_properties() Greg Kroah-Hartman
` (19 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Andy Shevchenko, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit e894b6005dce0ed621b2788d6a249708fb6f95f9 ]
Several serial drivers want to read the same or similar set of
the port properties. Make a common helper for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/tty/serial/serial_port.c | 145 +++++++++++++++++++++++++++++++
include/linux/serial_core.h | 2 +
2 files changed, 147 insertions(+)
diff --git a/drivers/tty/serial/serial_port.c b/drivers/tty/serial/serial_port.c
index ed3953bd04073..469ad26cde487 100644
--- a/drivers/tty/serial/serial_port.c
+++ b/drivers/tty/serial/serial_port.c
@@ -8,7 +8,10 @@
#include <linux/device.h>
#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
+#include <linux/property.h>
#include <linux/serial_core.h>
#include <linux/spinlock.h>
@@ -146,6 +149,148 @@ void uart_remove_one_port(struct uart_driver *drv, struct uart_port *port)
}
EXPORT_SYMBOL(uart_remove_one_port);
+/**
+ * __uart_read_properties - read firmware properties of the given UART port
+ * @port: corresponding port
+ * @use_defaults: apply defaults (when %true) or validate the values (when %false)
+ *
+ * The following device properties are supported:
+ * - clock-frequency (optional)
+ * - fifo-size (optional)
+ * - no-loopback-test (optional)
+ * - reg-shift (defaults may apply)
+ * - reg-offset (value may be validated)
+ * - reg-io-width (defaults may apply or value may be validated)
+ * - interrupts (OF only)
+ * - serial [alias ID] (OF only)
+ *
+ * If the port->dev is of struct platform_device type the interrupt line
+ * will be retrieved via platform_get_irq() call against that device.
+ * Otherwise it will be assigned by fwnode_irq_get() call. In both cases
+ * the index 0 of the resource is used.
+ *
+ * The caller is responsible to initialize the following fields of the @port
+ * ->dev (must be valid)
+ * ->flags
+ * ->mapbase
+ * ->mapsize
+ * ->regshift (if @use_defaults is false)
+ * before calling this function. Alternatively the above mentioned fields
+ * may be zeroed, in such case the only ones, that have associated properties
+ * found, will be set to the respective values.
+ *
+ * If no error happened, the ->irq, ->mapbase, ->mapsize will be altered.
+ * The ->iotype is always altered.
+ *
+ * When @use_defaults is true and the respective property is not found
+ * the following values will be applied:
+ * ->regshift = 0
+ * In this case IRQ must be provided, otherwise an error will be returned.
+ *
+ * When @use_defaults is false and the respective property is found
+ * the following values will be validated:
+ * - reg-io-width (->iotype)
+ * - reg-offset (->mapsize against ->mapbase)
+ *
+ * Returns: 0 on success or negative errno on failure
+ */
+static int __uart_read_properties(struct uart_port *port, bool use_defaults)
+{
+ struct device *dev = port->dev;
+ u32 value;
+ int ret;
+
+ /* Read optional UART functional clock frequency */
+ device_property_read_u32(dev, "clock-frequency", &port->uartclk);
+
+ /* Read the registers alignment (default: 8-bit) */
+ ret = device_property_read_u32(dev, "reg-shift", &value);
+ if (ret)
+ port->regshift = use_defaults ? 0 : port->regshift;
+ else
+ port->regshift = value;
+
+ /* Read the registers I/O access type (default: MMIO 8-bit) */
+ ret = device_property_read_u32(dev, "reg-io-width", &value);
+ if (ret) {
+ port->iotype = UPIO_MEM;
+ } else {
+ switch (value) {
+ case 1:
+ port->iotype = UPIO_MEM;
+ break;
+ case 2:
+ port->iotype = UPIO_MEM16;
+ break;
+ case 4:
+ port->iotype = device_is_big_endian(dev) ? UPIO_MEM32BE : UPIO_MEM32;
+ break;
+ default:
+ if (!use_defaults) {
+ dev_err(dev, "Unsupported reg-io-width (%u)\n", value);
+ return -EINVAL;
+ }
+ port->iotype = UPIO_UNKNOWN;
+ break;
+ }
+ }
+
+ /* Read the address mapping base offset (default: no offset) */
+ ret = device_property_read_u32(dev, "reg-offset", &value);
+ if (ret)
+ value = 0;
+
+ /* Check for shifted address mapping overflow */
+ if (!use_defaults && port->mapsize < value) {
+ dev_err(dev, "reg-offset %u exceeds region size %pa\n", value, &port->mapsize);
+ return -EINVAL;
+ }
+
+ port->mapbase += value;
+ port->mapsize -= value;
+
+ /* Read optional FIFO size */
+ device_property_read_u32(dev, "fifo-size", &port->fifosize);
+
+ if (device_property_read_bool(dev, "no-loopback-test"))
+ port->flags |= UPF_SKIP_TEST;
+
+ /* Get index of serial line, if found in DT aliases */
+ ret = of_alias_get_id(dev_of_node(dev), "serial");
+ if (ret >= 0)
+ port->line = ret;
+
+ if (dev_is_platform(dev))
+ ret = platform_get_irq(to_platform_device(dev), 0);
+ else
+ ret = fwnode_irq_get(dev_fwnode(dev), 0);
+ if (ret == -EPROBE_DEFER)
+ return ret;
+ if (ret > 0)
+ port->irq = ret;
+ else if (use_defaults)
+ /* By default IRQ support is mandatory */
+ return ret;
+ else
+ port->irq = 0;
+
+ port->flags |= UPF_SHARE_IRQ;
+
+ return 0;
+}
+
+int uart_read_port_properties(struct uart_port *port)
+{
+ return __uart_read_properties(port, true);
+}
+EXPORT_SYMBOL_GPL(uart_read_port_properties);
+
+int uart_read_and_validate_port_properties(struct uart_port *port)
+{
+ return __uart_read_properties(port, false);
+}
+EXPORT_SYMBOL_GPL(uart_read_and_validate_port_properties);
+
static struct device_driver serial_port_driver = {
.name = "port",
.suppress_bind_attrs = true,
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 412de73547521..5da5eb719f614 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -961,6 +961,8 @@ int uart_register_driver(struct uart_driver *uart);
void uart_unregister_driver(struct uart_driver *uart);
int uart_add_one_port(struct uart_driver *reg, struct uart_port *port);
void uart_remove_one_port(struct uart_driver *reg, struct uart_port *port);
+int uart_read_port_properties(struct uart_port *port);
+int uart_read_and_validate_port_properties(struct uart_port *port);
bool uart_match_port(const struct uart_port *port1,
const struct uart_port *port2);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 260/267] serial: 8250_dw: Switch to use uart_read_port_properties()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (258 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 259/267] serial: port: Introduce a common helper to read properties Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 261/267] serial: 8250_dw: Replace ACPI device check by a quirk Greg Kroah-Hartman
` (18 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Andi Shyti, Andy Shevchenko,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit e6a46d073e11baba785245860c9f51adbbb8b68d ]
Since we have now a common helper to read port properties
use it instead of sparse home grown solution.
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240304123035.758700-8-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/tty/serial/8250/8250_dw.c | 67 +++++++++++++------------------
1 file changed, 27 insertions(+), 40 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index a1f2259cc9a98..0446ac145cd4a 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -17,7 +17,6 @@
#include <linux/mod_devicetable.h>
#include <linux/module.h>
#include <linux/notifier.h>
-#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include <linux/property.h>
@@ -449,12 +448,7 @@ static void dw8250_quirks(struct uart_port *p, struct dw8250_data *data)
if (np) {
unsigned int quirks = data->pdata->quirks;
- int id;
- /* get index of serial line, if found in DT aliases */
- id = of_alias_get_id(np, "serial");
- if (id >= 0)
- p->line = id;
#ifdef CONFIG_64BIT
if (quirks & DW_UART_QUIRK_OCTEON) {
p->serial_in = dw8250_serial_inq;
@@ -465,12 +459,6 @@ static void dw8250_quirks(struct uart_port *p, struct dw8250_data *data)
}
#endif
- if (of_device_is_big_endian(np)) {
- p->iotype = UPIO_MEM32BE;
- p->serial_in = dw8250_serial_in32be;
- p->serial_out = dw8250_serial_out32be;
- }
-
if (quirks & DW_UART_QUIRK_ARMADA_38X)
p->serial_out = dw8250_serial_out38x;
if (quirks & DW_UART_QUIRK_SKIP_SET_RATE)
@@ -515,39 +503,21 @@ static int dw8250_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
struct dw8250_data *data;
struct resource *regs;
- int irq;
int err;
- u32 val;
regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
if (!regs)
return dev_err_probe(dev, -EINVAL, "no registers defined\n");
- irq = platform_get_irq_optional(pdev, 0);
- /* no interrupt -> fall back to polling */
- if (irq == -ENXIO)
- irq = 0;
- if (irq < 0)
- return irq;
-
spin_lock_init(&p->lock);
- p->mapbase = regs->start;
- p->irq = irq;
p->handle_irq = dw8250_handle_irq;
p->pm = dw8250_do_pm;
p->type = PORT_8250;
- p->flags = UPF_SHARE_IRQ | UPF_FIXED_PORT;
+ p->flags = UPF_FIXED_PORT;
p->dev = dev;
- p->iotype = UPIO_MEM;
- p->serial_in = dw8250_serial_in;
- p->serial_out = dw8250_serial_out;
p->set_ldisc = dw8250_set_ldisc;
p->set_termios = dw8250_set_termios;
- p->membase = devm_ioremap(dev, regs->start, resource_size(regs));
- if (!p->membase)
- return -ENOMEM;
-
data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
if (!data)
return -ENOMEM;
@@ -559,15 +529,35 @@ static int dw8250_probe(struct platform_device *pdev)
data->uart_16550_compatible = device_property_read_bool(dev,
"snps,uart-16550-compatible");
- err = device_property_read_u32(dev, "reg-shift", &val);
- if (!err)
- p->regshift = val;
+ p->mapbase = regs->start;
+ p->mapsize = resource_size(regs);
- err = device_property_read_u32(dev, "reg-io-width", &val);
- if (!err && val == 4) {
- p->iotype = UPIO_MEM32;
+ p->membase = devm_ioremap(dev, p->mapbase, p->mapsize);
+ if (!p->membase)
+ return -ENOMEM;
+
+ err = uart_read_port_properties(p);
+ /* no interrupt -> fall back to polling */
+ if (err == -ENXIO)
+ err = 0;
+ if (err)
+ return err;
+
+ switch (p->iotype) {
+ case UPIO_MEM:
+ p->serial_in = dw8250_serial_in;
+ p->serial_out = dw8250_serial_out;
+ break;
+ case UPIO_MEM32:
p->serial_in = dw8250_serial_in32;
p->serial_out = dw8250_serial_out32;
+ break;
+ case UPIO_MEM32BE:
+ p->serial_in = dw8250_serial_in32be;
+ p->serial_out = dw8250_serial_out32be;
+ break;
+ default:
+ return -ENODEV;
}
if (device_property_read_bool(dev, "dcd-override")) {
@@ -594,9 +584,6 @@ static int dw8250_probe(struct platform_device *pdev)
data->msr_mask_off |= UART_MSR_TERI;
}
- /* Always ask for fixed clock rate from a property. */
- device_property_read_u32(dev, "clock-frequency", &p->uartclk);
-
/* If there is separate baudclk, get the rate from it. */
data->clk = devm_clk_get_optional(dev, "baudclk");
if (data->clk == NULL)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 261/267] serial: 8250_dw: Replace ACPI device check by a quirk
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (259 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 260/267] serial: 8250_dw: Switch to use uart_read_port_properties() Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 262/267] serial: 8250_dw: Dont use struct dw8250_data outside of 8250_dw Greg Kroah-Hartman
` (17 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Andy Shevchenko, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit 173b097dcc8d74d6e135aed1bad38dbfa21c4d04 ]
Instead of checking for APMC0D08 ACPI device presence,
use a quirk based on driver data.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240306143322.3291123-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 87d80bfbd577 ("serial: 8250_dw: Don't use struct dw8250_data outside of 8250_dw")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/tty/serial/8250/8250_dw.c | 51 ++++++++++++++++---------------
1 file changed, 26 insertions(+), 25 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index 0446ac145cd4a..a7659e536d3c0 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -9,7 +9,6 @@
* LCR is written whilst busy. If it is, then a busy detect interrupt is
* raised, the LCR needs to be rewritten and the uart status register read.
*/
-#include <linux/acpi.h>
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/device.h>
@@ -55,6 +54,7 @@
#define DW_UART_QUIRK_ARMADA_38X BIT(1)
#define DW_UART_QUIRK_SKIP_SET_RATE BIT(2)
#define DW_UART_QUIRK_IS_DMA_FC BIT(3)
+#define DW_UART_QUIRK_APMC0D08 BIT(4)
static inline struct dw8250_data *clk_to_dw8250_data(struct notifier_block *nb)
{
@@ -444,33 +444,29 @@ static void dw8250_prepare_rx_dma(struct uart_8250_port *p)
static void dw8250_quirks(struct uart_port *p, struct dw8250_data *data)
{
- struct device_node *np = p->dev->of_node;
-
- if (np) {
- unsigned int quirks = data->pdata->quirks;
+ unsigned int quirks = data->pdata ? data->pdata->quirks : 0;
#ifdef CONFIG_64BIT
- if (quirks & DW_UART_QUIRK_OCTEON) {
- p->serial_in = dw8250_serial_inq;
- p->serial_out = dw8250_serial_outq;
- p->flags = UPF_SKIP_TEST | UPF_SHARE_IRQ | UPF_FIXED_TYPE;
- p->type = PORT_OCTEON;
- data->skip_autocfg = true;
- }
+ if (quirks & DW_UART_QUIRK_OCTEON) {
+ p->serial_in = dw8250_serial_inq;
+ p->serial_out = dw8250_serial_outq;
+ p->flags = UPF_SKIP_TEST | UPF_SHARE_IRQ | UPF_FIXED_TYPE;
+ p->type = PORT_OCTEON;
+ data->skip_autocfg = true;
+ }
#endif
- if (quirks & DW_UART_QUIRK_ARMADA_38X)
- p->serial_out = dw8250_serial_out38x;
- if (quirks & DW_UART_QUIRK_SKIP_SET_RATE)
- p->set_termios = dw8250_do_set_termios;
- if (quirks & DW_UART_QUIRK_IS_DMA_FC) {
- data->data.dma.txconf.device_fc = 1;
- data->data.dma.rxconf.device_fc = 1;
- data->data.dma.prepare_tx_dma = dw8250_prepare_tx_dma;
- data->data.dma.prepare_rx_dma = dw8250_prepare_rx_dma;
- }
-
- } else if (acpi_dev_present("APMC0D08", NULL, -1)) {
+ if (quirks & DW_UART_QUIRK_ARMADA_38X)
+ p->serial_out = dw8250_serial_out38x;
+ if (quirks & DW_UART_QUIRK_SKIP_SET_RATE)
+ p->set_termios = dw8250_do_set_termios;
+ if (quirks & DW_UART_QUIRK_IS_DMA_FC) {
+ data->data.dma.txconf.device_fc = 1;
+ data->data.dma.rxconf.device_fc = 1;
+ data->data.dma.prepare_tx_dma = dw8250_prepare_tx_dma;
+ data->data.dma.prepare_rx_dma = dw8250_prepare_rx_dma;
+ }
+ if (quirks & DW_UART_QUIRK_APMC0D08) {
p->iotype = UPIO_MEM32;
p->regshift = 2;
p->serial_in = dw8250_serial_in32;
@@ -772,13 +768,18 @@ static const struct of_device_id dw8250_of_match[] = {
};
MODULE_DEVICE_TABLE(of, dw8250_of_match);
+static const struct dw8250_platform_data dw8250_apmc0d08 = {
+ .usr_reg = DW_UART_USR,
+ .quirks = DW_UART_QUIRK_APMC0D08,
+};
+
static const struct acpi_device_id dw8250_acpi_match[] = {
{ "80860F0A", (kernel_ulong_t)&dw8250_dw_apb },
{ "8086228A", (kernel_ulong_t)&dw8250_dw_apb },
{ "AMD0020", (kernel_ulong_t)&dw8250_dw_apb },
{ "AMDI0020", (kernel_ulong_t)&dw8250_dw_apb },
{ "AMDI0022", (kernel_ulong_t)&dw8250_dw_apb },
- { "APMC0D08", (kernel_ulong_t)&dw8250_dw_apb},
+ { "APMC0D08", (kernel_ulong_t)&dw8250_apmc0d08 },
{ "BRCM2032", (kernel_ulong_t)&dw8250_dw_apb },
{ "HISI0031", (kernel_ulong_t)&dw8250_dw_apb },
{ "INT33C4", (kernel_ulong_t)&dw8250_dw_apb },
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 262/267] serial: 8250_dw: Dont use struct dw8250_data outside of 8250_dw
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (260 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 261/267] serial: 8250_dw: Replace ACPI device check by a quirk Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 263/267] usb-storage: alauda: Check whether the media is initialized Greg Kroah-Hartman
` (16 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable; +Cc: Greg Kroah-Hartman, patches, Andy Shevchenko, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
[ Upstream commit 87d80bfbd577912462061b1a45c0ed9c7fcb872f ]
The container of the struct dw8250_port_data is private to the actual
driver. In particular, 8250_lpss and 8250_dw use different data types
that are assigned to the UART port private_data. Hence, it must not
be used outside the specific driver.
Currently the only cpr_val is required by the common code, make it
be available via struct dw8250_port_data.
This fixes the UART breakage on Intel Galileo boards.
Fixes: 593dea000bc1 ("serial: 8250: dw: Allow to use a fallback CPR value if not synthesized")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20240514190730.2787071-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/tty/serial/8250/8250_dw.c | 9 +++++++--
drivers/tty/serial/8250/8250_dwlib.c | 3 +--
drivers/tty/serial/8250/8250_dwlib.h | 3 ++-
3 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/8250/8250_dw.c b/drivers/tty/serial/8250/8250_dw.c
index a7659e536d3c0..777bea835b114 100644
--- a/drivers/tty/serial/8250/8250_dw.c
+++ b/drivers/tty/serial/8250/8250_dw.c
@@ -55,6 +55,7 @@
#define DW_UART_QUIRK_SKIP_SET_RATE BIT(2)
#define DW_UART_QUIRK_IS_DMA_FC BIT(3)
#define DW_UART_QUIRK_APMC0D08 BIT(4)
+#define DW_UART_QUIRK_CPR_VALUE BIT(5)
static inline struct dw8250_data *clk_to_dw8250_data(struct notifier_block *nb)
{
@@ -445,6 +446,10 @@ static void dw8250_prepare_rx_dma(struct uart_8250_port *p)
static void dw8250_quirks(struct uart_port *p, struct dw8250_data *data)
{
unsigned int quirks = data->pdata ? data->pdata->quirks : 0;
+ u32 cpr_value = data->pdata ? data->pdata->cpr_value : 0;
+
+ if (quirks & DW_UART_QUIRK_CPR_VALUE)
+ data->data.cpr_value = cpr_value;
#ifdef CONFIG_64BIT
if (quirks & DW_UART_QUIRK_OCTEON) {
@@ -749,8 +754,8 @@ static const struct dw8250_platform_data dw8250_armada_38x_data = {
static const struct dw8250_platform_data dw8250_renesas_rzn1_data = {
.usr_reg = DW_UART_USR,
- .cpr_val = 0x00012f32,
- .quirks = DW_UART_QUIRK_IS_DMA_FC,
+ .cpr_value = 0x00012f32,
+ .quirks = DW_UART_QUIRK_CPR_VALUE | DW_UART_QUIRK_IS_DMA_FC,
};
static const struct dw8250_platform_data dw8250_starfive_jh7100_data = {
diff --git a/drivers/tty/serial/8250/8250_dwlib.c b/drivers/tty/serial/8250/8250_dwlib.c
index 84843e204a5e8..8fc8b6753148b 100644
--- a/drivers/tty/serial/8250/8250_dwlib.c
+++ b/drivers/tty/serial/8250/8250_dwlib.c
@@ -242,7 +242,6 @@ static const struct serial_rs485 dw8250_rs485_supported = {
void dw8250_setup_port(struct uart_port *p)
{
struct dw8250_port_data *pd = p->private_data;
- struct dw8250_data *data = to_dw8250_data(pd);
struct uart_8250_port *up = up_to_u8250p(p);
u32 reg, old_dlf;
@@ -284,7 +283,7 @@ void dw8250_setup_port(struct uart_port *p)
reg = dw8250_readl_ext(p, DW_UART_CPR);
if (!reg) {
- reg = data->pdata->cpr_val;
+ reg = pd->cpr_value;
dev_dbg(p->dev, "CPR is not available, using 0x%08x instead\n", reg);
}
if (!reg)
diff --git a/drivers/tty/serial/8250/8250_dwlib.h b/drivers/tty/serial/8250/8250_dwlib.h
index f13e91f2cace9..794a9014cdac1 100644
--- a/drivers/tty/serial/8250/8250_dwlib.h
+++ b/drivers/tty/serial/8250/8250_dwlib.h
@@ -19,6 +19,7 @@ struct dw8250_port_data {
struct uart_8250_dma dma;
/* Hardware configuration */
+ u32 cpr_value;
u8 dlf_size;
/* RS485 variables */
@@ -27,7 +28,7 @@ struct dw8250_port_data {
struct dw8250_platform_data {
u8 usr_reg;
- u32 cpr_val;
+ u32 cpr_value;
unsigned int quirks;
};
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 263/267] usb-storage: alauda: Check whether the media is initialized
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (261 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 262/267] serial: 8250_dw: Dont use struct dw8250_data outside of 8250_dw Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 264/267] misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() Greg Kroah-Hartman
` (15 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, xingwei lee, yue sun, Alan Stern,
Shichao Lai, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shichao Lai <shichaorai@gmail.com>
[ Upstream commit 16637fea001ab3c8df528a8995b3211906165a30 ]
The member "uzonesize" of struct alauda_info will remain 0
if alauda_init_media() fails, potentially causing divide errors
in alauda_read_data() and alauda_write_lba().
- Add a member "media_initialized" to struct alauda_info.
- Change a condition in alauda_check_media() to ensure the
first initialization.
- Add an error check for the return value of alauda_init_media().
Fixes: e80b0fade09e ("[PATCH] USB Storage: add alauda support")
Reported-by: xingwei lee <xrivendell7@gmail.com>
Reported-by: yue sun <samsun1006219@gmail.com>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Shichao Lai <shichaorai@gmail.com>
Link: https://lore.kernel.org/r/20240526012745.2852061-1-shichaorai@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/usb/storage/alauda.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/storage/alauda.c b/drivers/usb/storage/alauda.c
index 115f05a6201a1..40d34cc28344a 100644
--- a/drivers/usb/storage/alauda.c
+++ b/drivers/usb/storage/alauda.c
@@ -105,6 +105,8 @@ struct alauda_info {
unsigned char sense_key;
unsigned long sense_asc; /* additional sense code */
unsigned long sense_ascq; /* additional sense code qualifier */
+
+ bool media_initialized;
};
#define short_pack(lsb,msb) ( ((u16)(lsb)) | ( ((u16)(msb))<<8 ) )
@@ -476,11 +478,12 @@ static int alauda_check_media(struct us_data *us)
}
/* Check for media change */
- if (status[0] & 0x08) {
+ if (status[0] & 0x08 || !info->media_initialized) {
usb_stor_dbg(us, "Media change detected\n");
alauda_free_maps(&MEDIA_INFO(us));
- alauda_init_media(us);
-
+ rc = alauda_init_media(us);
+ if (rc == USB_STOR_TRANSPORT_GOOD)
+ info->media_initialized = true;
info->sense_key = UNIT_ATTENTION;
info->sense_asc = 0x28;
info->sense_ascq = 0x00;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 264/267] misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe()
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (262 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 263/267] usb-storage: alauda: Check whether the media is initialized Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 265/267] i2c: at91: Fix the functionality flags of the slave-only interface Greg Kroah-Hartman
` (14 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Yongzhi Liu, Kumaravel Thiagarajan,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongzhi Liu <hyperlyzcs@gmail.com>
[ Upstream commit 77427e3d5c353e3dd98c7c0af322f8d9e3131ace ]
There is a memory leak (forget to free allocated buffers) in a
memory allocation failure path.
Fix it to jump to the correct error handling code.
Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.")
Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com>
Reviewed-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com>
Link: https://lore.kernel.org/r/20240523121434.21855-4-hyperlyzcs@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
index de75d89ef53e8..34c9be437432a 100644
--- a/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
+++ b/drivers/misc/mchp_pci1xxxx/mchp_pci1xxxx_gp.c
@@ -69,8 +69,10 @@ static int gp_aux_bus_probe(struct pci_dev *pdev, const struct pci_device_id *id
aux_bus->aux_device_wrapper[1] = kzalloc(sizeof(*aux_bus->aux_device_wrapper[1]),
GFP_KERNEL);
- if (!aux_bus->aux_device_wrapper[1])
- return -ENOMEM;
+ if (!aux_bus->aux_device_wrapper[1]) {
+ retval = -ENOMEM;
+ goto err_aux_dev_add_0;
+ }
retval = ida_alloc(&gp_client_ida, GFP_KERNEL);
if (retval < 0)
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 265/267] i2c: at91: Fix the functionality flags of the slave-only interface
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (263 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 264/267] misc: microchip: pci1xxxx: Fix a memory leak in the error handling of gp_aux_bus_probe() Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 266/267] i2c: designware: " Greg Kroah-Hartman
` (13 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jean Delvare, Juergen Fitschen,
Ludovic Desroches, Codrin Ciubotariu, Andi Shyti, Nicolas Ferre,
Alexandre Belloni, Claudiu Beznea, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean Delvare <jdelvare@suse.de>
[ Upstream commit d6d5645e5fc1233a7ba950de4a72981c394a2557 ]
When an I2C adapter acts only as a slave, it should not claim to
support I2C master capabilities.
Fixes: 9d3ca54b550c ("i2c: at91: added slave mode support")
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Juergen Fitschen <me@jue.yt>
Cc: Ludovic Desroches <ludovic.desroches@microchip.com>
Cc: Codrin Ciubotariu <codrin.ciubotariu@microchip.com>
Cc: Andi Shyti <andi.shyti@kernel.org>
Cc: Nicolas Ferre <nicolas.ferre@microchip.com>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Claudiu Beznea <claudiu.beznea@tuxon.dev>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/i2c/busses/i2c-at91-slave.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-at91-slave.c b/drivers/i2c/busses/i2c-at91-slave.c
index d6eeea5166c04..131a67d9d4a68 100644
--- a/drivers/i2c/busses/i2c-at91-slave.c
+++ b/drivers/i2c/busses/i2c-at91-slave.c
@@ -106,8 +106,7 @@ static int at91_unreg_slave(struct i2c_client *slave)
static u32 at91_twi_func(struct i2c_adapter *adapter)
{
- return I2C_FUNC_SLAVE | I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL
- | I2C_FUNC_SMBUS_READ_BLOCK_DATA;
+ return I2C_FUNC_SLAVE;
}
static const struct i2c_algorithm at91_twi_algorithm_slave = {
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 266/267] i2c: designware: Fix the functionality flags of the slave-only interface
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (264 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 265/267] i2c: at91: Fix the functionality flags of the slave-only interface Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 12:56 ` [PATCH 6.6 267/267] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING Greg Kroah-Hartman
` (12 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Jean Delvare, Luis Oliveira,
Jarkko Nikula, Andy Shevchenko, Mika Westerberg, Jan Dabros,
Andi Shyti, Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jean Delvare <jdelvare@suse.de>
[ Upstream commit cbf3fb5b29e99e3689d63a88c3cddbffa1b8de99 ]
When an I2C adapter acts only as a slave, it should not claim to
support I2C master capabilities.
Fixes: 5b6d721b266a ("i2c: designware: enable SLAVE in platform module")
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Jan Dabros <jsd@semihalf.com>
Cc: Andi Shyti <andi.shyti@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/i2c/busses/i2c-designware-slave.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-designware-slave.c b/drivers/i2c/busses/i2c-designware-slave.c
index 2e079cf20bb5b..78e2c47e3d7da 100644
--- a/drivers/i2c/busses/i2c-designware-slave.c
+++ b/drivers/i2c/busses/i2c-designware-slave.c
@@ -220,7 +220,7 @@ static const struct i2c_algorithm i2c_dw_algo = {
void i2c_dw_configure_slave(struct dw_i2c_dev *dev)
{
- dev->functionality = I2C_FUNC_SLAVE | DW_IC_DEFAULT_FUNCTIONALITY;
+ dev->functionality = I2C_FUNC_SLAVE;
dev->slave_cfg = DW_IC_CON_RX_FIFO_FULL_HLD_CTRL |
DW_IC_CON_RESTART_EN | DW_IC_CON_STOP_DET_IFADDRESSED;
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* [PATCH 6.6 267/267] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (265 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 266/267] i2c: designware: " Greg Kroah-Hartman
@ 2024-06-19 12:56 ` Greg Kroah-Hartman
2024-06-19 14:33 ` [PATCH 6.6 000/267] 6.6.35-rc1 review Florian Fainelli
` (11 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-19 12:56 UTC (permalink / raw)
To: stable
Cc: Greg Kroah-Hartman, patches, Oleg Nesterov, Rachel Menge,
Boqun Feng, Wei Fu, Jens Axboe, Allen Pais, Christian Brauner,
Frederic Weisbecker, Joel Fernandes (Google), Joel Granados,
Josh Triplett, Lai Jiangshan, Mateusz Guzik, Mathieu Desnoyers,
Mike Christie, Neeraj Upadhyay, Paul E. McKenney,
Steven Rostedt (Google), Zqiang, Thomas Gleixner, Andrew Morton,
Sasha Levin
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov <oleg@redhat.com>
[ Upstream commit 7fea700e04bd3f424c2d836e98425782f97b494e ]
kernel_wait4() doesn't sleep and returns -EINTR if there is no
eligible child and signal_pending() is true.
That is why zap_pid_ns_processes() clears TIF_SIGPENDING but this is not
enough, it should also clear TIF_NOTIFY_SIGNAL to make signal_pending()
return false and avoid a busy-wait loop.
Link: https://lkml.kernel.org/r/20240608120616.GB7947@redhat.com
Fixes: 12db8b690010 ("entry: Add support for TIF_NOTIFY_SIGNAL")
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Rachel Menge <rachelmenge@linux.microsoft.com>
Closes: https://lore.kernel.org/all/1386cd49-36d0-4a5c-85e9-bc42056a5a38@linux.microsoft.com/
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Tested-by: Wei Fu <fuweid89@gmail.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Allen Pais <apais@linux.microsoft.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Joel Granados <j.granados@samsung.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
kernel/pid_namespace.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index 619972c78774f..e9b2bb260ee6c 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -217,6 +217,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
*/
do {
clear_thread_flag(TIF_SIGPENDING);
+ clear_thread_flag(TIF_NOTIFY_SIGNAL);
rc = kernel_wait4(-1, NULL, __WALL, NULL);
} while (rc != -ECHILD);
--
2.43.0
^ permalink raw reply related [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (266 preceding siblings ...)
2024-06-19 12:56 ` [PATCH 6.6 267/267] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING Greg Kroah-Hartman
@ 2024-06-19 14:33 ` Florian Fainelli
2024-06-19 16:17 ` Harshit Mogalapalli
` (10 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Florian Fainelli @ 2024-06-19 14:33 UTC (permalink / raw)
To: Greg Kroah-Hartman, stable
Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
lkft-triage, pavel, jonathanh, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, broonie
On 6/19/2024 1:52 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on
BMIPS_GENERIC:
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
--
Florian
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (267 preceding siblings ...)
2024-06-19 14:33 ` [PATCH 6.6 000/267] 6.6.35-rc1 review Florian Fainelli
@ 2024-06-19 16:17 ` Harshit Mogalapalli
2024-06-19 17:07 ` SeongJae Park
` (9 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Harshit Mogalapalli @ 2024-06-19 16:17 UTC (permalink / raw)
To: Greg Kroah-Hartman, stable
Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
rwarsow, conor, allen.lkml, broonie, Vegard Nossum
Hi Greg,
On 19/06/24 18:22, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
No problems seen on x86_64 and aarch64 with our testing.
Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Thanks,
Harshit
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (268 preceding siblings ...)
2024-06-19 16:17 ` Harshit Mogalapalli
@ 2024-06-19 17:07 ` SeongJae Park
2024-06-19 19:19 ` Jon Hunter
` (8 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: SeongJae Park @ 2024-06-19 17:07 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: SeongJae Park, stable, patches, linux-kernel, torvalds, akpm,
linux, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie, damon
Hello,
On Wed, 19 Jun 2024 14:52:31 +0200 Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine.
Attaching the test results summary below. Please note that I retrieved the
kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park <sj@kernel.org>
[1] https://github.com/awslabs/damon-tests/tree/next/corr
[2] 0db1e58b51e3 ("Linux 6.6.35-rc1")
Thanks,
SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh
ok 2 selftests: damon: debugfs_schemes.sh
ok 3 selftests: damon: debugfs_target_ids.sh
ok 4 selftests: damon: debugfs_empty_targets.sh
ok 5 selftests: damon: debugfs_huge_count_read_write.sh
ok 6 selftests: damon: debugfs_duplicate_context_creation.sh
ok 7 selftests: damon: debugfs_rm_non_contexts.sh
ok 8 selftests: damon: sysfs.sh
ok 9 selftests: damon: sysfs_update_removed_scheme_dir.sh
ok 10 selftests: damon: reclaim.sh
ok 11 selftests: damon: lru_sort.sh
ok 1 selftests: damon-tests: kunit.sh
ok 2 selftests: damon-tests: huge_count_read_write.sh
ok 3 selftests: damon-tests: buffer_overflow.sh
ok 4 selftests: damon-tests: rm_contexts.sh
ok 5 selftests: damon-tests: record_null_deref.sh
ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh
ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh
ok 8 selftests: damon-tests: damo_tests.sh
ok 9 selftests: damon-tests: masim-record.sh
ok 10 selftests: damon-tests: build_i386.sh
ok 11 selftests: damon-tests: build_arm64.sh
ok 12 selftests: damon-tests: build_m68k.sh
ok 13 selftests: damon-tests: build_i386_idle_flag.sh
ok 14 selftests: damon-tests: build_i386_highpte.sh
ok 15 selftests: damon-tests: build_nomemcg.sh
[33m
[92mPASS [39m
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (269 preceding siblings ...)
2024-06-19 17:07 ` SeongJae Park
@ 2024-06-19 19:19 ` Jon Hunter
2024-06-19 21:11 ` Allen
` (7 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Jon Hunter @ 2024-06-19 19:19 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie,
linux-tegra, stable
On Wed, 19 Jun 2024 14:52:31 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests passing for Tegra ...
Test results for stable-v6.6:
10 builds: 10 pass, 0 fail
26 boots: 26 pass, 0 fail
116 tests: 116 pass, 0 fail
Linux version: 6.6.35-rc1-g0db1e58b51e3
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000,
tegra20-ventana, tegra210-p2371-2180,
tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Jon
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (270 preceding siblings ...)
2024-06-19 19:19 ` Jon Hunter
@ 2024-06-19 21:11 ` Allen
2024-06-20 3:44 ` Kelsey Steele
` (6 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Allen @ 2024-06-19 21:11 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, broonie
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my x86_64 and ARM64 test systems. No errors or
regressions.
Tested-by: Allen Pais <apais@linux.microsoft.com>
Thanks.
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (271 preceding siblings ...)
2024-06-19 21:11 ` Allen
@ 2024-06-20 3:44 ` Kelsey Steele
2024-06-20 11:40 ` Mark Brown
` (5 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Kelsey Steele @ 2024-06-20 3:44 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie
On Wed, Jun 19, 2024 at 02:52:31PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
No regressions found on WSL (x86 and arm64).
Built, booted, and reviewed dmesg.
Thank you. :)
Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (272 preceding siblings ...)
2024-06-20 3:44 ` Kelsey Steele
@ 2024-06-20 11:40 ` Mark Brown
2024-06-20 11:47 ` Takeshi Ogasawara
` (4 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Mark Brown @ 2024-06-20 11:40 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml
[-- Attachment #1: Type: text/plain, Size: 345 bytes --]
On Wed, Jun 19, 2024 at 02:52:31PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
Tested-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (273 preceding siblings ...)
2024-06-20 11:40 ` Mark Brown
@ 2024-06-20 11:47 ` Takeshi Ogasawara
2024-06-20 14:26 ` Ron Economos
` (3 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Takeshi Ogasawara @ 2024-06-20 11:47 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie
Hi Greg
On Wed, Jun 19, 2024 at 9:59 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
6.6.35-rc1 tested.
Build successfully completed.
Boot successfully completed.
No dmesg regressions.
Video output normal.
Sound output normal.
Lenovo ThinkPad X1 Carbon Gen10(Intel i7-1260P(x86_64) arch linux)
[ 0.000000] Linux version 6.6.35-rc1rv
(takeshi@ThinkPadX1Gen10J0764) (gcc (GCC) 14.1.1 20240522, GNU ld (GNU
Binutils) 2.42.0) #1 SMP PREEMPT_DYNAMIC Thu Jun 20 19:16:30 JST 2024
Thanks
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (274 preceding siblings ...)
2024-06-20 11:47 ` Takeshi Ogasawara
@ 2024-06-20 14:26 ` Ron Economos
2024-06-20 16:40 ` Naresh Kamboju
` (2 subsequent siblings)
278 siblings, 0 replies; 282+ messages in thread
From: Ron Economos @ 2024-06-20 14:26 UTC (permalink / raw)
To: Greg Kroah-Hartman, stable
Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
rwarsow, conor, allen.lkml, broonie
On 6/19/24 5:52 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos <re@w6rz.net>
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (275 preceding siblings ...)
2024-06-20 14:26 ` Ron Economos
@ 2024-06-20 16:40 ` Naresh Kamboju
2024-06-20 18:57 ` Peter Schneider
2024-06-20 21:35 ` Shuah Khan
278 siblings, 0 replies; 282+ messages in thread
From: Naresh Kamboju @ 2024-06-20 16:40 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, broonie
On Wed, 19 Jun 2024 at 18:28, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
## Build
* kernel: 6.6.35-rc1
* git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
* git branch: linux-6.6.y
* git commit: 0db1e58b51e37c5cbaa81a40104bd54aad052427
* git describe: v6.6.34-268-g0db1e58b51e3
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.6.y/build/v6.6.34-268-g0db1e58b51e3
## Test Regressions (compared to v6.6.34)
## Metric Regressions (compared to v6.6.34)
## Test Fixes (compared to v6.6.34)
## Metric Fixes (compared to v6.6.34)
## Test result summary
total: 256850, pass: 222555, fail: 3269, skip: 30682, xfail: 344
## Build Summary
* arc: 5 total, 5 passed, 0 failed
* arm: 129 total, 129 passed, 0 failed
* arm64: 38 total, 38 passed, 0 failed
* i386: 29 total, 29 passed, 0 failed
* mips: 24 total, 24 passed, 0 failed
* parisc: 3 total, 3 passed, 0 failed
* powerpc: 34 total, 34 passed, 0 failed
* riscv: 17 total, 17 passed, 0 failed
* s390: 12 total, 12 passed, 0 failed
* sh: 10 total, 10 passed, 0 failed
* sparc: 6 total, 6 passed, 0 failed
* x86_64: 33 total, 33 passed, 0 failed
## Test suites summary
* boot
* kselft[
* kselftest-android
* kselftest-arm64
* kselftest-breakpoints
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-drivers-dma-buf
* kselftest-efivarfs
* kselftest-exec
* kselftest-filesystems
* kselftest-filesystems-binderfs
* kselftest-filesystems-epoll
* kselftest-firmware
* kselftest-fpu
* kselftest-ftrace
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kexec
* kselftest-kvm
* kselftest-lib
* kselftest-livepatch
* kselftest-membarrier
* kselftest-memfd
* kselftest-memory-hotplug
* kselftest-mincore
* kselftest-mm
* kselftest-mount
* kselftest-mqueue
* kselftest-net
* kselftest-net-forwarding
* kselftest-net-mptcp
* kselftest-netfilter
* kselftest-nsfs
* kselftest-openat2
* kselftest-pid_namespace
* kselftest-pidfd
* kselftest-proc
* kselftest-pstore
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sync
* kselftest-sysctl
* kselftest-tc-testing
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-user_events
* kselftest-vDSO
* kselftest-watchdog
* kselftest-x86
* kselftest-zram
* kunit
* kvm-unit-tests
* libgpiod
* libhugetlbfs
* log-parser-boot
* log-parser-test
* ltp-commands
* ltp-containers
* ltp-controllers
* ltp-cpuhotplug
* ltp-crypto
* ltp-cve
* ltp-dio
* ltp-fcntl-locktests
* ltp-fs
* ltp-fs_bind
* ltp-fs_perms_simple
* ltp-hugetlb
* ltp-ipc
* ltp-math
* ltp-mm
* ltp-nptl
* ltp-pty
* ltp-sched
* ltp-smoke
* ltp-smoketest
* ltp-syscalls
* ltp-tracing
* perf
* rcutorture
--
Linaro LKFT
https://lkft.linaro.org
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (276 preceding siblings ...)
2024-06-20 16:40 ` Naresh Kamboju
@ 2024-06-20 18:57 ` Peter Schneider
2024-06-20 21:35 ` Shuah Khan
278 siblings, 0 replies; 282+ messages in thread
From: Peter Schneider @ 2024-06-20 18:57 UTC (permalink / raw)
To: Greg Kroah-Hartman, stable
Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
rwarsow, conor, allen.lkml, broonie
Am 19.06.2024 um 14:52 schrieb Greg Kroah-Hartman:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
What can you do between two soccer matches? Build and test an -rc kernel... ;-)
Built and booted successfully, works without regressions (built another kernel with it as
load test, using 48 threads), no dmesg oddities. Dual socket Ivy Bridge Xeon E5-2697 v2.
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Beste Grüße,
Peter Schneider
--
Climb the mountain not to plant your flag, but to embrace the challenge,
enjoy the air and behold the view. Climb it so you can see the world,
not so the world can see you. -- David McCullough Jr.
OpenPGP: 0xA3828BD796CCE11A8CADE8866E3A92C92C3FF244
Download: https://www.peters-netzplatz.de/download/pschneider1968_pub.asc
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@googlemail.com
https://keys.mailvelope.com/pks/lookup?op=get&search=pschneider1968@gmail.com
^ permalink raw reply [flat|nested] 282+ messages in thread* Re: [PATCH 6.6 000/267] 6.6.35-rc1 review
2024-06-19 12:52 [PATCH 6.6 000/267] 6.6.35-rc1 review Greg Kroah-Hartman
` (277 preceding siblings ...)
2024-06-20 18:57 ` Peter Schneider
@ 2024-06-20 21:35 ` Shuah Khan
278 siblings, 0 replies; 282+ messages in thread
From: Shuah Khan @ 2024-06-20 21:35 UTC (permalink / raw)
To: Greg Kroah-Hartman, stable
Cc: patches, linux-kernel, torvalds, akpm, linux, shuah, patches,
lkft-triage, pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw,
rwarsow, conor, allen.lkml, broonie, Shuah Khan
On 6/19/24 06:52, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.35 release.
> There are 267 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri, 21 Jun 2024 12:55:11 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.6.35-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.6.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
thanks,
-- Shuah
^ permalink raw reply [flat|nested] 282+ messages in thread