Re: [net-next RFC v2] net_cls: traffic counter based on classification control cgroup

cgroups.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Alexey Perevalov <a.perevalov-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
To: Daniel Wagner <wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
Cc: Glauber Costa <glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [net-next RFC v2] net_cls: traffic counter based on classification control cgroup
Date: Wed, 28 Nov 2012 09:21:24 +0400	[thread overview]
Message-ID: <50B59F54.8080401@samsung.com> (raw)
In-Reply-To: <50B4B9E2.4030200-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>

On 11/27/2012 05:02 PM, Daniel Wagner wrote:
> Hi Alexey,
>
> On 27.11.2012 12:03, Glauber Costa wrote:
>> On 11/27/2012 02:56 PM, Alexey Perevalov wrote:
>>> Hello.
>>>
>>> It's second version of patch I already sent to netdev.
>>>
>>> The main goal of this patch it's counting traffic for process placed to
>>> net_cls cgroup (ingress and egress).
>>> It's based on res_counters and holds counter per network interfaces.
>>>
>>> Description of patch.
>>> It handles packets in net/core/dev.c for egress and in
>>> /net/ipv4/tcp.c|udp.c for ingress.
>>> These places were chosen because we need to know also network interface.
>>>
>>> Cgroup fs interface provides following files additional to existing
>>> net_cls files:
>>> net_cls.ifacename.usage_in_bytes
>>> Containing rcv/snd lines.
>>> Also this patch adds to net_cls ability to handle a network device
>>> registration.
>>>
>>> It could be included or excluded in compile time.
>>> I moved the menu entry for "Control group classifier" from network/QoS to
>>> General Option/Control Group.
>>>
>>> I'm waiting for you comments.
>>>
>> Daniel Wagner is working on something a lot similar.
> Yes, basically what I try to do is explained by this excellent article
>
> https://lwn.net/Articles/523058/
I read articles and agreed with aspects.
But problem of selecting preferred network for application can be solved 
using netprio cgroup.


> The short version: Per application routing and statistics.
>
> I have two PoC implementation doing this. Both implementation have the same key
> idea which is to set SO_MARK per application. The routing and statistics would
> then be done by a bunch iptables rules.
>
> In the first implementation extends net_cls to set SO_MARK:
>
> void sock_update_classid(struct sock *sk, struct task_struct *task)
>   {
>          u32 classid;
> +       u32 mark;
>   
>          classid = task_cls_classid(task);
>          if (classid != sk->sk_classid)
>                  sk->sk_classid = classid;
> +
> +       mark = task_cls_mark(task);
> +       if (mark != sk->sk_mark)
> +               sk->sk_mark = mark;
>   }
>
> The second implementation is adding a new iptables matcher which matches
> on LSM contexts. Then you can do something like this:
>
> iptables -t mangle -A OUTPUT -m secmark --secctx unconfined_u:unconfined_r:foo_t:s0-s0:c0.c1023 -j MARK --set-mark 200
As I understand in LSM context it works for egress and ingress.

>> Maybe you should be in contact, in case you are not yet.
>>
>> A few general comments:
>> 1) res_counters are incredibly expensive. If you are more interested in
>> counting than you are in limiting, they may not be your best choice.
You right, I have a plan now to limit traffic too here.
Or as a variant in QoS in this case atomic is better here.

>> 2) When Daniel exposed his use case to me, it gave me the impression
>> that "counting traffic" is something that is totally doable by having a
>> dedicated interface in a separate namespace. Basically, we already count
>> traffic (rx and tx) for all interfaces anyway, so it suggests that it
>> could be an interesting way to see the problem.
> Moving applications into separate net namespaces is for sure a valid solution.
> Though there is a one drawback in this approach. The namespaces need to be
> attached to a bridge and then some NATting. That means every application
> would get it's own IP address. This might be okay for your certain use
> cases but I am still trying to work around this. Glauber and I had some
> discussion about this and he suggested to allow the physical networking
> device to be attached to several namespaces (e.g. via macvlan). Every
> namespace would get the same IP address. Unfortunately, this would result in
> the same mess as several physical devices on a network get the same
> IP address assigned.
Is I truly understand what to make statistics works we need to put 
process to separate namespace?
Approach to keep counter in cgroup hasn't such side effects, but it has 
another ).

>   
>
>> AFAIK, Daniel is still measuring this. But it would be great to know if
>> that could work for your use case as well.
> I have not started to measure :(
>
> cheers,
> daniel
>
>
Thank you Daniel and Glauber!

-- 
BR
Alexey

next prev parent reply	other threads:[~2012-11-28  5:21 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-27 10:56 [net-next RFC v2] net_cls: traffic counter based on classification control cgroup Alexey Perevalov
2012-11-27 11:03 ` Glauber Costa
     [not found]   ` <50B49DEA.7010000-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2012-11-27 13:02     ` Daniel Wagner
     [not found]       ` <50B4B9E2.4030200-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-11-28  5:21         ` Alexey Perevalov [this message]
     [not found]           ` <50B59F54.8080401-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2012-11-28  8:09             ` Daniel Wagner
     [not found]               ` <50B5C6AB.6040208-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org>
2012-11-28 11:18                 ` Alexey Perevalov
     [not found]                   ` <50B5F2EE.6050204-Sze3O3UU22JBDgjK7y7TUQ@public.gmane.org>
2012-11-29 14:36                     ` Daniel Wagner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50B59F54.8080401@samsung.com \
    --to=a.perevalov-sze3o3uu22jbdgjk7y7tuq@public.gmane.org \
    --cc=cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=glommer-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    --cc=netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=wagi-kQCPcA+X3s7YtjvyW6yDsg@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).