From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-mm@kvack.org, David Miller <davem@davemloft.net>
Subject: Re: [PATCH 9/9] net: vm deadlock avoidance core
Date: Tue, 16 Jan 2007 14:47:54 +0100
Message-ID: <1168955274.22935.47.camel@twins>
In-Reply-To: <20070116132503.GA23144@2ka.mipt.ru>
On Tue, 2007-01-16 at 16:25 +0300, Evgeniy Polyakov wrote:
> On Tue, Jan 16, 2007 at 10:46:06AM +0100, Peter Zijlstra (a.p.zijlstra@chello.nl) wrote:
> > @@ -1767,10 +1767,23 @@ int netif_receive_skb(struct sk_buff *sk
> > struct net_device *orig_dev;
> > int ret = NET_RX_DROP;
> > __be16 type;
> > + unsigned long pflags = current->flags;
> > +
> > + /* Emergency skb are special, they should
> > + * - be delivered to SOCK_VMIO sockets only
> > + * - stay away from userspace
> > + * - have bounded memory usage
> > + *
> > + * Use PF_MEMALLOC as a poor mans memory pool - the grouping kind.
> > + * This saves us from propagating the allocation context down to all
> > + * allocation sites.
> > + */
> > + if (unlikely(skb->emergency))
> > + current->flags |= PF_MEMALLOC;
>
> Access to 'current' in netif_receive_skb()???
> Why do you want to work with, for example keventd?
Can this run in keventd?
I thought this was softirq context and thus this would either run in a
borrowed context or in ksoftirqd. See patch 3/9.
> > @@ -1798,6 +1811,8 @@ int netif_receive_skb(struct sk_buff *sk
> > goto ncls;
> > }
> > #endif
> > + if (unlikely(skb->emergency))
> > + goto skip_taps;
> >
> > list_for_each_entry_rcu(ptype, &ptype_all, list) {
> > if (!ptype->dev || ptype->dev == skb->dev) {
> > @@ -1807,6 +1822,7 @@ int netif_receive_skb(struct sk_buff *sk
> > }
> > }
> >
> > +skip_taps:
>
> It is still a 'tap'.
Not sure what you are saying, I thought this should stop delivery of
skbs to taps?
> > #ifdef CONFIG_NET_CLS_ACT
> > if (pt_prev) {
> > ret = deliver_skb(skb, pt_prev, orig_dev);
> > @@ -1819,15 +1835,26 @@ int netif_receive_skb(struct sk_buff *sk
> >
> > if (ret == TC_ACT_SHOT || (ret == TC_ACT_STOLEN)) {
> > kfree_skb(skb);
> > - goto out;
> > + goto unlock;
> > }
> >
> > skb->tc_verd = 0;
> > ncls:
> > #endif
> >
> > + if (unlikely(skb->emergency))
> > + switch(skb->protocol) {
> > + case __constant_htons(ETH_P_ARP):
> > + case __constant_htons(ETH_P_IP):
> > + case __constant_htons(ETH_P_IPV6):
> > + break;
>
> Poor vlans and appletalk.
Yeah, and all those others too; maybe some day.
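For illustration, here is a minimal user-space sketch of the protocol filter
discussed above. It assumes that emergency skbs carrying any protocol other
than ARP, IPv4 or IPv6 are dropped; the quoted hunk is cut off before its
default branch, so that part is an assumption. The function name
emergency_proto_allowed() is hypothetical, and the EtherType constants are
defined locally so that the block builds outside the kernel.

```c
#include <stdint.h>
#include <assert.h>
#include <arpa/inet.h>	/* htons(), ntohs() */

/* EtherType values, defined locally so this sketch builds in user space. */
#define ETH_P_IP	0x0800
#define ETH_P_ARP	0x0806
#define ETH_P_8021Q	0x8100	/* VLAN */
#define ETH_P_ATALK	0x809B	/* AppleTalk */
#define ETH_P_IPV6	0x86DD

/*
 * Hypothetical helper: returns 1 if an emergency skb with this protocol
 * (network byte order, like skb->protocol) may continue up the stack,
 * and 0 if it should be dropped. Dropping in the default case is an
 * assumption; the quoted hunk ends before its default branch.
 */
static int emergency_proto_allowed(uint16_t protocol_be)
{
	switch (ntohs(protocol_be)) {
	case ETH_P_ARP:
	case ETH_P_IP:
	case ETH_P_IPV6:
		return 1;
	default:
		return 0;
	}
}
```

Under that assumption, VLAN-tagged and AppleTalk frames do not pass the
filter while an skb is marked as emergency, which is the limitation pointed
out above.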
> > Index: linux-2.6-git/net/ipv4/tcp_ipv4.c
> > ===================================================================
> > --- linux-2.6-git.orig/net/ipv4/tcp_ipv4.c 2007-01-12 12:20:07.000000000 +0100
> > +++ linux-2.6-git/net/ipv4/tcp_ipv4.c 2007-01-12 12:21:14.000000000 +0100
> > @@ -1604,6 +1604,22 @@ csum_err:
> > goto discard;
> > }
> >
> > +static int tcp_v4_backlog_rcv(struct sock *sk, struct sk_buff *skb)
> > +{
> > + int ret;
> > + unsigned long pflags = current->flags;
> > + if (unlikely(skb->emergency)) {
> > + BUG_ON(!sk_has_vmio(sk)); /* we dropped those before queueing */
> > + if (!(pflags & PF_MEMALLOC))
> > + current->flags |= PF_MEMALLOC;
> > + }
> > +
> > + ret = tcp_v4_do_rcv(sk, skb);
> > +
> > + current->flags = pflags;
> > + return ret;
>
> Why don't you want to just setup PF_MEMALLOC for the socket and all
> related processes?
I don't understand what you're saying here.
I want to grant the processing of skb->emergency packets access to the
memory reserves.
How would I set PF_MEMALLOC on a socket? It's a process flag. And which
related processes?
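To make the pattern concrete, here is a minimal user-space sketch of the
save/set/restore idea in tcp_v4_backlog_rcv(). It is a model, not the patch:
struct task, struct skb, do_rcv() and PF_SIDE_EFFECT are hypothetical
stand-ins, and the flag values are illustrative. One deliberate difference is
that the quoted patch restores the whole flags word, whereas this sketch
restores only the PF_MEMALLOC bit, so that an unrelated flag changed during
processing is not overwritten.

```c
#include <assert.h>

/* Illustrative flag values for this sketch only. */
#define PF_MEMALLOC	0x0800UL
#define PF_SIDE_EFFECT	0x0001UL	/* hypothetical unrelated task flag */

struct task { unsigned long flags; };	/* stand-in for task_struct */
struct skb  { int emergency; };		/* stand-in for sk_buff */

/*
 * Stand-in for tcp_v4_do_rcv(): reports whether the memory reserves were
 * reachable while it ran, and sets an unrelated flag as a side effect.
 */
static int do_rcv(struct task *t)
{
	t->flags |= PF_SIDE_EFFECT;
	return (t->flags & PF_MEMALLOC) != 0;
}

static int backlog_rcv(struct task *t, const struct skb *skb)
{
	unsigned long pflags;
	int ret;

	assert(t && skb);
	pflags = t->flags;		/* remember the caller's state */

	if (skb->emergency)
		t->flags |= PF_MEMALLOC;	/* open the reserves */

	ret = do_rcv(t);

	/* Put back only the PF_MEMALLOC bit; leave other flags alone. */
	t->flags &= ~PF_MEMALLOC;
	t->flags |= pflags & PF_MEMALLOC;
	return ret;
}
```

The grant lasts only for the duration of the call, and it applies to whichever
task happens to be running, which is why the flag lives on the task rather
than on the socket.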
> > +}
> > +
> > /*
> > * From tcp_input.c
> > */
> > @@ -1654,6 +1670,15 @@ int tcp_v4_rcv(struct sk_buff *skb)
> > if (!sk)
> > goto no_tcp_socket;
> >
> > + if (unlikely(skb->emergency)) {
> > + if (!sk_has_vmio(sk))
> > + goto discard_and_relse;
> > + /*
> > + decrease window size..
> > + tcp_enter_quickack_mode(sk);
> > + */
>
> How does this decrease window size?
> Maybe ack scheduling would be better handled by inet_csk_schedule_ack()
> or just directly send an ack, which in turn requires allocation, which
> can be bound to this received frame processing...
It doesn't. I thought it might be a good idea to do that, but never
got around to actually figuring out how to do it.