All of lore.kernel.org
 help / color / mirror / Atom feed
From: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
To: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
Cc: cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	KAMEZAWA Hiroyuki
	<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>,
	Michal Hocko <mhocko-AlSwsSmVLrQ@public.gmane.org>
Subject: Re: [PATCH] introduce res_counter_charge_nofail() for socket allocations
Date: Thu, 19 Jan 2012 17:13:05 +0400	[thread overview]
Message-ID: <4F1816E1.2050506@parallels.com> (raw)
In-Reply-To: <20120119131217.GN24386-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

On 01/19/2012 05:12 PM, Johannes Weiner wrote:
> On Thu, Jan 19, 2012 at 04:51:16PM +0400, Glauber Costa wrote:
>> On 01/19/2012 04:48 PM, Johannes Weiner wrote:
>>> On Wed, Jan 18, 2012 at 07:15:58PM +0400, Glauber Costa wrote:
>>>> There is a case in __sk_mem_schedule(), where an allocation
>>>> is beyond the maximum, but yet we are allowed to proceed.
>>>> It happens under the following condition:
>>>>
>>>> 	sk->sk_wmem_queued + size>= sk->sk_sndbuf
>>>>
>>>> The network code won't revert the allocation in this case,
>>>> meaning that at some point later it'll try to do it. Since
>>>> this is never communicated to the underlying res_counter
>>>> code, there is an inbalance in res_counter uncharge operation.
>>>>
>>>> I see two ways of fixing this:
>>>>
>>>> 1) storing the information about those allocations somewhere
>>>>     in memcg, and then deducting from that first, before
>>>>     we start draining the res_counter,
>>>> 2) providing a slightly different allocation function for
>>>>     the res_counter, that matches the original behavior of
>>>>     the network code more closely.
>>>>
>>>> I decided to go for #2 here, believing it to be more elegant,
>>>> since #1 would require us to do basically that, but in a more
>>>> obscure way.
>>>>
>>>> I will eventually submit it through Dave for the -net tree,
>>>> but I wanted to query you guys first, to see if this approach
>>>> is acceptable or if you'd prefer me to try something else.
>>>>
>>>> Thanks
>>>>
>>>> Signed-off-by: Glauber Costa<glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
>>>> Cc: KAMEZAWA Hiroyuki<kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
>>>> Cc: Johannes Weiner<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
>>>> Cc: Michal Hocko<mhocko-AlSwsSmVLrQ@public.gmane.org>
>>>> ---
>>>>   include/linux/res_counter.h |    6 ++++++
>>>>   include/net/sock.h          |   10 ++++------
>>>>   kernel/res_counter.c        |   25 +++++++++++++++++++++++++
>>>>   net/core/sock.c             |    4 ++--
>>>>   4 files changed, 37 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
>>>> index c9d625c..32a7b02 100644
>>>> --- a/include/linux/res_counter.h
>>>> +++ b/include/linux/res_counter.h
>>>> @@ -109,12 +109,18 @@ void res_counter_init(struct res_counter *counter, struct res_counter *parent);
>>>>    *
>>>>    * returns 0 on success and<0 if the counter->usage will exceed the
>>>>    * counter->limit _locked call expects the counter->lock to be taken
>>>> + *
>>>> + * charge_nofail works the same, except that it charges the resource
>>>> + * counter unconditionally, and returns<   0 if the after the current
>>>> + * charge we are over limit.
>>>>    */
>>>
>>> res_counter_margin() assumes usage<= limit is always true.  Just make
>>> sure you return 0 if that is not the case, or the charge path can get
>>> confused, thinking there is enough room and retry needlessly.
>>>
>>> Otherwise, looks good.
>>
>> You mean return zero in res_counter_charge_fail() if we exceeded the limit?
>>
>> I do that, since one needs to know the allocation was supposed to fail.
>> Or are you talking about something else ?
>
> Yes, I mean the calculation in res_counter_margin(), which is supposed
> to tell the margin for new charges.  Its current code will underflow
> and return something large when the usage exceeds the limit, which is
> not possible before your patch, so I think you need to include this:
>
> diff --git a/include/linux/res_counter.h b/include/linux/res_counter.h
> index c9d625c..d06d014 100644
> --- a/include/linux/res_counter.h
> +++ b/include/linux/res_counter.h
> @@ -142,7 +142,10 @@ static inline unsigned long long res_counter_margin(struct res_counter *cnt)
>   	unsigned long flags;
>
>   	spin_lock_irqsave(&cnt->lock, flags);
> -	margin = cnt->limit - cnt->usage;
> +	if (cnt->limit>  cnt->usage)
> +		margin = cnt->limit - cnt->usage;
> +	else
> +		margin = 0;
>   	spin_unlock_irqrestore(&cnt->lock, flags);
>   	return margin;
>   }
Ah, I see.

Okay, I will update my patch to include this. (Or would you think it 
would be better to add this as a preparation patch?)

  parent reply	other threads:[~2012-01-19 13:13 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-18 15:15 [PATCH] introduce res_counter_charge_nofail() for socket allocations Glauber Costa
     [not found] ` <1326899758-9013-1-git-send-email-glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-01-19 12:48   ` Johannes Weiner
     [not found]     ` <20120119124841.GL24386-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2012-01-19 12:51       ` Glauber Costa
     [not found]         ` <4F1811C4.50204-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-01-19 13:12           ` Johannes Weiner
     [not found]             ` <20120119131217.GN24386-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2012-01-19 13:13               ` Glauber Costa [this message]
     [not found]                 ` <4F1816E1.2050506-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-01-19 13:38                   ` Johannes Weiner
2012-01-19 14:36   ` Michal Hocko
     [not found]     ` <20120119143607.GF13932-VqjxzfR4DlwKmadIfiO5sKVXKuFTiq87@public.gmane.org>
2012-01-19 20:57       ` James Bottomley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F1816E1.2050506@parallels.com \
    --to=glommer-bzqdu9zft3wakbo8gow8eq@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org \
    --cc=kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org \
    --cc=mhocko-AlSwsSmVLrQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.