From: Hyunwoo Kim <v4bel@theori.io>
To: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Dmitry Kozlov <xeb@mail.ru>, David Ahern <dsahern@kernel.org>,
tudordana@google.com, netdev@vger.kernel.org, imv4bel@gmail.com,
v4bel@theori.io
Subject: Re: [PATCH] net: Fix invalid ip_route_output_ports() call
Date: Mon, 20 Mar 2023 22:08:03 -0700 [thread overview]
Message-ID: <20230321050803.GA22060@ubuntu> (raw)
In-Reply-To: <CANn89i+=-BTZyhg9f=Vyz0rws1Z-1O-F5TkESBjkZnKmHeKz1g@mail.gmail.com>
On Mon, Mar 20, 2023 at 08:17:15PM -0700, Eric Dumazet wrote:
> On Mon, Mar 20, 2023 at 7:49 PM Hyunwoo Kim <v4bel@theori.io> wrote:
> >
> > If you pass the address of the struct flowi4 you declared as a
> > local variable as the fl4 argument to ip_route_output_ports(),
> > the subsequent call to xfrm_state_find() will read the local
> > variable by AF_INET6 rather than AF_INET as per policy,
> > which could cause it to go out of scope on the kernel stack.
> >
> > Reported-by: syzbot+ada7c035554bcee65580@syzkaller.appspotmail.com
>
> I could not find this syzbot issue, can you provide a link, and a stack trace ?
This is the syzbot dashboard:
https://syzkaller.appspot.com/bug?extid=0f526bf9663842ac2dc7
and KASAN log:
```
[ 239.016529] ==================================================================
[ 239.016540] BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x2b8/0x2ae0
[ 239.016556] Read of size 4 at addr ffffc90000860ba0 by task swapper/14/0
[ 239.016571] CPU: 14 PID: 0 Comm: swapper/14 Tainted: G D E 6.2.0+ #14
[ 239.016583] Hardware name: Gigabyte Technology Co., Ltd. B460MDS3H/B460M DS3H, BIOS F3 05/27/2020
[ 239.016593] Call Trace:
[ 239.016599] <IRQ>
[ 239.016605] dump_stack_lvl+0x4c/0x70
[ 239.016617] print_report+0xcf/0x620
[ 239.016629] ? kasan_addr_to_slab+0x11/0xb0
[ 239.016639] ? xfrm_state_find+0x2b8/0x2ae0
[ 239.016650] kasan_report+0xbf/0x100
[ 239.016657] ? xfrm_state_find+0x2b8/0x2ae0
[ 239.016664] __asan_load4+0x84/0xa0
[ 239.016670] xfrm_state_find+0x2b8/0x2ae0
[ 239.016677] ? __pfx_xfrm_state_find+0x10/0x10
[ 239.016684] ? __pfx_xfrm4_get_saddr+0x10/0x10
[ 239.016690] ? unwind_next_frame+0x27/0x40
[ 239.016697] xfrm_tmpl_resolve+0x1f9/0x780
[ 239.016704] ? __pfx_xfrm_tmpl_resolve+0x10/0x10
[ 239.016711] ? kasan_save_stack+0x3e/0x50
[ 239.016716] ? kasan_save_stack+0x2a/0x50
[ 239.016721] ? kasan_set_track+0x29/0x40
[ 239.016726] ? kasan_save_alloc_info+0x22/0x30
[ 239.016731] ? __kasan_slab_alloc+0x91/0xa0
[ 239.016736] ? kmem_cache_alloc+0x17e/0x370
[ 239.016741] ? dst_alloc+0x5c/0x230
[ 239.016747] ? xfrm_pol_bin_cmp+0xc8/0xe0
[ 239.016753] xfrm_resolve_and_create_bundle+0xf1/0x10d0
[ 239.016758] ? __pfx_xfrm_policy_inexact_lookup_rcu+0x10/0x10
[ 239.016764] ? xfrm_policy_lookup_inexact_addr+0xa1/0xc0
[ 239.016771] ? xfrm_policy_match+0xd6/0x110
[ 239.016776] ? __rcu_read_unlock+0x3b/0x80
[ 239.016782] ? xfrm_policy_lookup_bytype.constprop.0+0x52e/0xb80
[ 239.016788] ? __pfx_xfrm_resolve_and_create_bundle+0x10/0x10
[ 239.016795] ? __pfx_xfrm_policy_lookup_bytype.constprop.0+0x10/0x10
[ 239.016802] ? __kasan_check_write+0x18/0x20
[ 239.016807] ? _raw_spin_lock_bh+0x8c/0xe0
[ 239.016813] ? __pfx__raw_spin_lock_bh+0x10/0x10
[ 239.016819] xfrm_lookup_with_ifid+0x2f2/0xe50
[ 239.016824] ? __local_bh_enable_ip+0x3f/0x90
[ 239.016830] ? rcu_gp_cleanup+0x2f2/0x6c0
[ 239.016836] ? __pfx_xfrm_lookup_with_ifid+0x10/0x10
[ 239.016842] ? ip_route_output_key_hash_rcu+0x3da/0x1000
[ 239.016848] xfrm_lookup_route+0x2a/0x100
[ 239.016854] ip_route_output_flow+0x1a7/0x1c0
[ 239.016859] ? __pfx_ip_route_output_flow+0x10/0x10
[ 239.016866] igmpv3_newpack+0x1c1/0x5d0
[ 239.016872] ? __pfx_igmpv3_newpack+0x10/0x10
[ 239.016878] ? check_preempt_curr+0xd7/0x120
[ 239.016885] add_grhead+0x111/0x130
[ 239.016891] ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[ 239.016897] add_grec+0x756/0x7f0
[ 239.016903] ? __pfx_add_grec+0x10/0x10
[ 239.016909] ? __pfx__raw_spin_lock_bh+0x10/0x10
[ 239.016914] ? wake_up_process+0x19/0x20
[ 239.016919] ? insert_work+0x130/0x160
[ 239.016926] ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[ 239.016932] igmp_ifc_timer_expire+0x2b5/0x650
[ 239.016938] ? __kasan_check_write+0x18/0x20
[ 239.016943] ? _raw_spin_lock+0x8c/0xe0
[ 239.016948] ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[ 239.016954] ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[ 239.016960] call_timer_fn+0x2d/0x1b0
[ 239.016966] ? __pfx_igmp_ifc_timer_expire+0x10/0x10
[ 239.016973] __run_timers.part.0+0x447/0x530
[ 239.016979] ? __pfx___run_timers.part.0+0x10/0x10
[ 239.016986] ? ktime_get+0x58/0xd0
[ 239.016992] ? lapic_next_deadline+0x30/0x40
[ 239.016997] ? clockevents_program_event+0x118/0x1a0
[ 239.017004] run_timer_softirq+0x69/0xf0
[ 239.017010] __do_softirq+0x128/0x444
[ 239.017016] __irq_exit_rcu+0xdd/0x130
[ 239.017022] irq_exit_rcu+0x12/0x20
[ 239.017027] sysvec_apic_timer_interrupt+0xa5/0xc0
[ 239.017033] </IRQ>
[ 239.017036] <TASK>
[ 239.017039] asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 239.017045] RIP: 0010:cpuidle_enter_state+0x1d2/0x510
[ 239.017051] Code: 30 00 0f 84 61 01 00 00 41 83 ed 01 73 dd 48 83 c4 28 44 89 f0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc fb 0f 1f 44 00 00 <45> 85 f6 0f 89 18 ff ff ff 48 c7 43 18 00 00 00 00 49 83 fd 09 0f
[ 239.017061] RSP: 0018:ffffc9000028fd40 EFLAGS: 00000246
[ 239.017067] RAX: 0000000000000000 RBX: ffffe8ffffd00c40 RCX: ffffffff811eae9c
[ 239.017072] RDX: dffffc0000000000 RSI: ffffffff82e7cc00 RDI: ffff88840eb44188
[ 239.017077] RBP: ffffc9000028fd90 R08: 00000037a67e1a91 R09: ffff88840eb3f0a3
[ 239.017082] R10: ffffed1081d67e14 R11: 0000000000000001 R12: ffffffff846c43a0
[ 239.017087] R13: 0000000000000002 R14: 0000000000000002 R15: ffffffff846c4488
[ 239.017093] ? sched_idle_set_state+0x4c/0x70
[ 239.017101] cpuidle_enter+0x45/0x70
[ 239.017108] call_cpuidle+0x44/0x80
[ 239.017113] do_idle+0x2b4/0x340
[ 239.017118] ? __pfx_do_idle+0x10/0x10
[ 239.017125] cpu_startup_entry+0x24/0x30
[ 239.017130] start_secondary+0x1d6/0x210
[ 239.017136] ? __pfx_start_secondary+0x10/0x10
[ 239.017142] ? set_bringup_idt_handler.constprop.0+0x93/0xc0
[ 239.017150] ? start_cpu0+0xc/0xc
[ 239.017156] secondary_startup_64_no_verify+0xe5/0xeb
[ 239.017163] </TASK>
[ 239.017169] The buggy address belongs to the virtual mapping at
[ffffc90000859000, ffffc90000862000) created by:
irq_init_percpu_irqstack+0x1c8/0x270
[ 239.017182] The buggy address belongs to the physical page:
[ 239.017186] page:00000000d2ae7732 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x40eb09
[ 239.017193] flags: 0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
[ 239.017202] raw: 0017ffffc0001000 ffffea00103ac248 ffffea00103ac248 0000000000000000
[ 239.017208] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 239.017213] page dumped because: kasan: bad access detected
[ 239.017219] Memory state around the buggy address:
[ 239.017222] ffffc90000860a80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[ 239.017228] ffffc90000860b00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
[ 239.017233] >ffffc90000860b80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
[ 239.017238] ^
[ 239.017241] ffffc90000860c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 239.017247] ffffc90000860c80: 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f3 f3 f3 00
[ 239.017251] ==================================================================
```
>
> > Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
> > ---
>
> I find this patch quite strange, to be honest.
>
> It looks like some xfrm bug to me.
>
> A stack trace would be helpful.
Here are the root caouses I analyzed:
```
igmp_ifc_timer_expire() -> igmpv3_send_cr() -> add_grec() -> add_grhead() -> igmpv3_newpack()[1] -> ip_route_output_ports() ->
ip_route_output_flow()[3] -> xfrm_lookup_route() -> xfrm_lookup() -> xfrm_lookup_with_ifid() -> xfrm_resolve_and_create_bundle() ->
xfrm_tmpl_resolve() -> xfrm_tmpl_resolve_one() -> xfrm_state_find()[5]
```
Here, igmpv3_newpack()[1] declares the variable `struct flowi4 fl4`[2]:
```
static struct sk_buff *igmpv3_newpack(struct net_device *dev, unsigned int mtu)
{
struct sk_buff *skb;
struct rtable *rt;
struct iphdr *pip;
struct igmpv3_report *pig;
struct net *net = dev_net(dev);
struct flowi4 fl4; // <===[2]
int hlen = LL_RESERVED_SPACE(dev);
int tlen = dev->needed_tailroom;
unsigned int size = mtu;
while (1) {
skb = alloc_skb(size + hlen + tlen,
GFP_ATOMIC | __GFP_NOWARN);
if (skb)
break;
size >>= 1;
if (size < 256)
return NULL;
}
skb->priority = TC_PRIO_CONTROL;
rt = ip_route_output_ports(net, &fl4, NULL, IGMPV3_ALL_MCR, 0,
0, 0,
IPPROTO_IGMP, 0, dev->ifindex);
```
Then, starting with ip_route_output_flow()[3], we use flowi4_to_flowi() to convert the `struct flowi4` variable to a `struct flowi` pointer[4]:
```
struct flowi {
union {
struct flowi_common __fl_common;
struct flowi4 ip4;
struct flowi6 ip6;
} u;
#define flowi_oif u.__fl_common.flowic_oif
#define flowi_iif u.__fl_common.flowic_iif
#define flowi_l3mdev u.__fl_common.flowic_l3mdev
#define flowi_mark u.__fl_common.flowic_mark
#define flowi_tos u.__fl_common.flowic_tos
#define flowi_scope u.__fl_common.flowic_scope
#define flowi_proto u.__fl_common.flowic_proto
#define flowi_flags u.__fl_common.flowic_flags
#define flowi_secid u.__fl_common.flowic_secid
#define flowi_tun_key u.__fl_common.flowic_tun_key
#define flowi_uid u.__fl_common.flowic_uid
} __attribute__((__aligned__(BITS_PER_LONG/8)));
static inline struct flowi *flowi4_to_flowi(struct flowi4 *fl4)
{
return container_of(fl4, struct flowi, u.ip4);
}
struct rtable *ip_route_output_flow(struct net *net, struct flowi4 *flp4,
const struct sock *sk)
{
struct rtable *rt = __ip_route_output_key(net, flp4);
if (IS_ERR(rt))
return rt;
if (flp4->flowi4_proto) {
flp4->flowi4_oif = rt->dst.dev->ifindex;
rt = (struct rtable *)xfrm_lookup_route(net, &rt->dst,
flowi4_to_flowi(flp4), // <===[4]
sk, 0);
}
return rt;
}
EXPORT_SYMBOL_GPL(ip_route_output_flow);
```
This is the cause of the stack OOB. Because we calculated the struct flowi pointer address based on struct flowi4 declared as a stack variable,
if we accessed a member of flowi that exceeds the size of flowi4, we would get an OOB.
Finally, xfrm_state_find()[5] uses daddr, which is a pointer to `&fl->u.ip4.saddr`.
Here, the encap_family variable can be entered by the user using the netlink socket.
If the user chose AF_INET6 instead of AF_INET, the xfrm_dst_hash() function would be called on an AF_INET6 basis[6],
which could cause an OOB in the `struct flowi4 fl4` variable of igmpv3_newpack()[2].
```
struct xfrm_state *
xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,
const struct flowi *fl, struct xfrm_tmpl *tmpl,
struct xfrm_policy *pol, int *err,
unsigned short family, u32 if_id)
{
static xfrm_address_t saddr_wildcard = { };
struct net *net = xp_net(pol);
unsigned int h, h_wildcard;
struct xfrm_state *x, *x0, *to_put;
int acquire_in_progress = 0;
int error = 0;
struct xfrm_state *best = NULL;
u32 mark = pol->mark.v & pol->mark.m;
unsigned short encap_family = tmpl->encap_family;
unsigned int sequence;
struct km_event c;
to_put = NULL;
sequence = read_seqcount_begin(&net->xfrm.xfrm_state_hash_generation);
rcu_read_lock();
h = xfrm_dst_hash(net, daddr, saddr, tmpl->reqid, encap_family); // <===[6]
```
Of course, it's possible that the patch I wrote is not appropriate.
Regards,
Hyunwoo Kim
next prev parent reply other threads:[~2023-03-21 5:10 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-21 2:49 [PATCH] net: Fix invalid ip_route_output_ports() call Hyunwoo Kim
2023-03-21 3:17 ` Eric Dumazet
2023-03-21 5:08 ` Hyunwoo Kim [this message]
2023-03-21 5:19 ` Eric Dumazet
2023-03-21 10:52 ` Steffen Klassert
2023-03-21 11:14 ` Hyunwoo Kim
2023-03-21 11:19 ` Eric Dumazet
2023-03-21 11:35 ` Hyunwoo Kim
2023-03-24 9:57 ` Steffen Klassert
2023-03-30 7:42 ` Tudor Ambarus
2023-03-30 7:56 ` Steffen Klassert
2023-03-21 11:36 ` Steffen Klassert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230321050803.GA22060@ubuntu \
--to=v4bel@theori.io \
--cc=ap420073@gmail.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=imv4bel@gmail.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=tudordana@google.com \
--cc=xeb@mail.ru \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.