RE: [PATCH][net-next][v2] rtnetlink: instroduce vnlmsg_new and use it in rtnl_getlink

From: "Li,Rongqing" <lirongqing@baidu.com>
To: Yunsheng Lin <linyunsheng@huawei.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"edumazet@google.com" <edumazet@google.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"pabeni@redhat.com" <pabeni@redhat.com>,
	"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"anjali.k.kulkarni@oracle.com" <anjali.k.kulkarni@oracle.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"fw@strlen.de" <fw@strlen.de>,
	"shayagr@amazon.com" <shayagr@amazon.com>,
	"idosch@nvidia.com" <idosch@nvidia.com>,
	"razor@blackwall.org" <razor@blackwall.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: RE: [PATCH][net-next][v2] rtnetlink: instroduce vnlmsg_new and use it in rtnl_getlink
Date: Tue, 14 Nov 2023 12:02:12 +0000
Message-ID: <3f479dcb95c04e54b689fa96386022e0@baidu.com>
In-Reply-To: <7f60f869-ec5c-a58c-a490-80cfcdd0fda7@huawei.com>



> -----Original Message-----
> From: Yunsheng Lin <linyunsheng@huawei.com>
> Sent: Tuesday, November 14, 2023 7:32 PM
> To: Li,Rongqing <lirongqing@baidu.com>; davem@davemloft.net;
> edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
> Liam.Howlett@oracle.com; anjali.k.kulkarni@oracle.com; leon@kernel.org;
> fw@strlen.de; shayagr@amazon.com; idosch@nvidia.com;
> razor@blackwall.org; netdev@vger.kernel.org
> Subject: Re: [PATCH][net-next][v2] rtnetlink: instroduce vnlmsg_new and use it
> in rtnl_getlink
> 
> On 2023/11/14 17:55, Li RongQing wrote:
> > if a PF has 256 or more VFs, ip link command will allocate a order 3
> > memory or more, and maybe trigger OOM due to memory fragement,
> 
> fragement -> fragment?

I will fix it.
Thanks

> 
> > the VFs needed memory size is computed in rtnl_vfinfo_size.
> >
> > so instroduce vnlmsg_new which calls netlink_alloc_large_skb in which
> 
> instroduce -> introduce?

Thanks

> 
> > vmalloc is used for large memory, to avoid the failure of allocating
> > memory
> >
> >     ip invoked oom-killer: gfp_mask=0xc2cc0(GFP_KERNEL|__GFP_NOWARN|\
> > 	__GFP_COMP|__GFP_NOMEMALLOC), order=3, oom_score_adj=0
> >     CPU: 74 PID: 204414 Comm: ip Kdump: loaded Tainted: P OE
> >     Call Trace:
> >     dump_stack+0x57/0x6a
> >     dump_header+0x4a/0x210
> >     oom_kill_process+0xe4/0x140
> >     out_of_memory+0x3e8/0x790
> >     __alloc_pages_slowpath.constprop.116+0x953/0xc50
> >     __alloc_pages_nodemask+0x2af/0x310
> >     kmalloc_large_node+0x38/0xf0
> >     __kmalloc_node_track_caller+0x417/0x4d0
> >     __kmalloc_reserve.isra.61+0x2e/0x80
> >     __alloc_skb+0x82/0x1c0
> >     rtnl_getlink+0x24f/0x370
> >     rtnetlink_rcv_msg+0x12c/0x350
> >     netlink_rcv_skb+0x50/0x100
> >     netlink_unicast+0x1b2/0x280
> >     netlink_sendmsg+0x355/0x4a0
> >     sock_sendmsg+0x5b/0x60
> >     ____sys_sendmsg+0x1ea/0x250
> >     ___sys_sendmsg+0x88/0xd0
> >     __sys_sendmsg+0x5e/0xa0
> >     do_syscall_64+0x33/0x40
> >     entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >     RIP: 0033:0x7f95a65a5b70
> >
> > Cc: Yunsheng Lin <linyunsheng@huawei.com>
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> > diff with v1: not move netlink_alloc_large_skb to skbuff.c
> >
> >  include/linux/netlink.h  |  1 +
> >  include/net/netlink.h    | 17 +++++++++++++++++
> >  net/core/rtnetlink.c     |  2 +-
> >  net/netlink/af_netlink.c |  2 +-
> >  4 files changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/linux/netlink.h b/include/linux/netlink.h
> > index 75d7de3..abe91ed 100644
> > --- a/include/linux/netlink.h
> > +++ b/include/linux/netlink.h
> > @@ -351,5 +351,6 @@ bool netlink_ns_capable(const struct sk_buff *skb,
> >  			struct user_namespace *ns, int cap);
> >  bool netlink_capable(const struct sk_buff *skb, int cap);
> >  bool netlink_net_capable(const struct sk_buff *skb, int cap);
> > +struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast);
> >
> >  #endif	/* __LINUX_NETLINK_H */
> > diff --git a/include/net/netlink.h b/include/net/netlink.h
> > index 83bdf78..7d31217 100644
> > --- a/include/net/netlink.h
> > +++ b/include/net/netlink.h
> > @@ -1011,6 +1011,23 @@ static inline struct sk_buff *nlmsg_new(size_t payload, gfp_t flags)
> >  }
> >
> >  /**
> > + * vnlmsg_new - Allocate a new netlink message with non-contiguous
> > + * physical memory
> > + * @payload: size of the message payload
> > + *
> > + * Use NLMSG_DEFAULT_SIZE if the size of the payload isn't known
> > + * and a good default is needed.
> > + *
> > + * The allocated skb is unable to have frag page for shinfo->frags*,
> > + * as the NULL setting for skb->head in netlink_skb_destructor() will
> > + * bypass most of the handling in skb_release_data()
> > + */
> > +static inline struct sk_buff *vnlmsg_new(size_t payload)
> > +{
> > +	return netlink_alloc_large_skb(nlmsg_total_size(payload), 0);
> > +}
> 
> The nlmsg_new() has the below parameters, there is no gfp flags for
> vnlmsg_new() and always assuming GFP_KERNEL?
> 

I think vnlmsg_new is similar to vmalloc, so no flags argument is needed; it always assumes GFP_KERNEL.
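
To make the order=3 figure from the trace concrete, here is a minimal
userspace sketch (not kernel code) of how a requested size maps to a
page-allocation order. It assumes 4 KiB pages, as on the x86-64 machine
in the trace. sketch_get_order() and SKETCH_PAGE_SIZE are hypothetical
names made up for this illustration; the function only mimics what the
kernel's get_order() computes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on x86-64. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical userspace stand-in for the kernel's get_order():
 * returns the smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size.
 */
unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	assert(size > 0);

	/* Double the span until it covers the requested size. */
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions, any request from 16385 to 32768 bytes needs
eight physically contiguous pages (order 3), which a fragmented system
may fail to provide. A vmalloc-backed buffer only needs individual pages.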

-Li
>  * @payload: size of the message payload
>  * @flags: the type of memory to allocate.
> 
> There are a lot of callers for nlmsg_new(), I am wondering how many of existing
> nlmsg_new() caller can change to use vnlmsg_new().
> https://elixir.free-electrons.com/linux/v6.7-rc1/A/ident/nlmsg_new
>

Thread overview: 5+ messages
2023-11-14  9:55 [PATCH][net-next][v2] rtnetlink: instroduce vnlmsg_new and use it in rtnl_getlink Li RongQing
2023-11-14 11:31 ` Yunsheng Lin
2023-11-14 12:02   ` Li,Rongqing [this message]
2023-11-14 22:37 ` Jakub Kicinski
2023-11-15  8:16   ` Li,Rongqing