From: Stephen Hemminger <shemminger@vyatta.com>
To: Gautam Kachroo <gk@aristanetworks.com>
Cc: Patrick McHardy <kaber@trash.net>, netdev@vger.kernel.org
Subject: Re: [PATCH] iproute2 flush: handle larger tables and deleted entries
Date: Wed, 15 Jul 2009 12:19:07 -0700
Message-ID: <20090715121907.04b7f5b0@s6510>
In-Reply-To: <4e0db5bc0907151050w56529bffh9878b99cc2fdaae5@mail.gmail.com>
On Wed, 15 Jul 2009 10:50:57 -0700
Gautam Kachroo <gk@aristanetworks.com> wrote:
> On Wed, Jul 15, 2009 at 8:19 AM, Patrick McHardy<kaber@trash.net> wrote:
> > Gautam Kachroo wrote:
> >> On Tue, Jul 14, 2009 at 2:38 AM, Patrick McHardy<kaber@trash.net> wrote:
> >>> Gautam Kachroo wrote:
> >>>> use a new netlink socket when sending flush messages to avoid reading
> >>>> any pending data on the existing netlink socket.
> >>>>
> >>>> read all of the response from the netlink request -- this response can
> >>>> be split over multiple recv calls, pretty much one per netlink request
> >>>> message. ENOENT errors, which correspond to attempts to delete an
> >>>> already deleted entry, are ignored. Other errors are not ignored.
> >>>
> >>> In which case would there be any pending data? From what I can see,
> >>> this can only happen when using batching, but in that case the
> >>> previous command should continue reading until it has received all
> >>> responses (which the netlink functions appear to be doing properly).
> >>
> >> What is the "previous command"?
> >
> > The last command before the one executing when using batching.
>
> This is independent of batching (I assume you're referring to the
> -batch option to the ip command).
> It happens when running a command like "ip neigh flush to 0.0.0.0/0"
> if there are many neighbor entries.
>
> The implementation of flush commands, e.g. ip neigh flush, sends a
> dump request, e.g. RTM_GETNEIGH, and then sends requests, e.g.
> RTM_DELNEIGH, *while* there can be unread data from the dump request.
> There would be unread data if the response to the dump request was
> split over multiple calls to recvmsg.
>
> >> Are you referring to rtnl_dump_filter? If rtnl_send_check comes across
> >> a failure, rtnl_dump_filter will not continue reading.
> >>
> >> Here's the situation that I'm referring to:
> >>
> >> If rtnl_send_check detects an error, it returns -1. rtnl_send_check is
> >> called from flush_update. The multiple implementations of flush_update
> >> (e.g. in ipneigh.c, ipaddress.c) propagate this return value to their
> >> caller, e.g. print_neigh or print_addrinfo.
> >>
> >> print_neigh, print_addrinfo, etc. are called from rtnl_dump_filter.
> >> rtnl_dump_filter sits in a loop calling recvmsg on the netlink socket.
> >> However, it returns the error value if the filter function (e.g.
> >> print_neigh) returns an error. In this case, rtnl_dump_filter can
> >> return before it's read all the responses.
> >> The error return from rtnl_dump_filter causes the program to exit.
> >
> > Yes, and I agree with your patch so far. My question is why you
> > need another socket.
> >
> >> use a new netlink socket when sending flush messages to avoid reading
> >> any pending data on the existing netlink socket.
> >
> > Under what circumstances would there be pending data when
> > performing a new iproute operation?
>
> As above, it's not that there is pending data when performing a new
> iproute operation, it's that there can be pending data while
> performing a single iproute operation, namely ip <object> flush.
> The benefit of a new socket is that it won't have any data from the
> dump request waiting for it.
I posted a better fix (using MSG_PEEK).
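
Editorial sketch, not the patch referred to above: one way the MSG_PEEK
idea could look, under stated assumptions. After a delete request is sent,
the socket is checked with MSG_DONTWAIT | MSG_PEEK, so any queued data
(including unread dump responses) stays in place for the normal read loop.
ENOENT is skipped, following the original patch description. The names
peek_for_error and send_fake_error are hypothetical, and the AF_UNIX
datagram socket used for the demonstration is only a stand-in for a real
netlink socket.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Hypothetical helper: inspect whatever is queued on fd without
 * consuming it. Returns 0 if nothing is queued or no relevant error is
 * present; returns -1 with errno set on the first netlink error other
 * than ENOENT (or on a recv failure).
 */
int peek_for_error(int fd)
{
	/* The union keeps the byte buffer aligned for struct nlmsghdr. */
	union {
		struct nlmsghdr hdr;
		char bytes[1024];
	} resp;
	struct nlmsghdr *h;
	ssize_t n;
	int len;

	assert(fd >= 0);

	/* MSG_PEEK leaves the data queued; MSG_DONTWAIT avoids blocking. */
	n = recv(fd, resp.bytes, sizeof(resp.bytes), MSG_DONTWAIT | MSG_PEEK);
	if (n < 0)
		return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;

	len = (int)n;
	for (h = &resp.hdr; NLMSG_OK(h, len); h = NLMSG_NEXT(h, len)) {
		struct nlmsgerr *err;

		if (h->nlmsg_type != NLMSG_ERROR)
			continue;
		if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
			continue;	/* too short to carry an error code */

		err = NLMSG_DATA(h);
		/* 0 is an ack; -ENOENT means the entry was already deleted. */
		if (err->error == 0 || err->error == -ENOENT)
			continue;

		errno = -err->error;
		return -1;
	}
	return 0;
}

/* Demo scaffolding only: queue a fake NLMSG_ERROR carrying -err on fd. */
int send_fake_error(int fd, int err)
{
	struct {
		struct nlmsghdr hdr;
		struct nlmsgerr body;
	} msg;

	memset(&msg, 0, sizeof(msg));
	msg.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(msg.body));
	msg.hdr.nlmsg_type = NLMSG_ERROR;
	msg.body.error = -err;

	return send(fd, &msg, msg.hdr.nlmsg_len, 0) < 0 ? -1 : 0;
}
```

Because the peek never dequeues anything, a dump loop that is still
reading on the same socket sees every message exactly as it would have
without the check, which is why a second socket is not required in this
sketch.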
Thread overview: 8+ messages
2009-07-13 16:39 [PATCH] iproute2 flush: handle larger tables and deleted entries Gautam Kachroo
2009-07-14 9:38 ` Patrick McHardy
2009-07-14 16:45 ` Gautam Kachroo
2009-07-15 15:19 ` Patrick McHardy
2009-07-15 17:50 ` Gautam Kachroo
2009-07-15 19:19 ` Stephen Hemminger [this message]
2009-07-15 22:04 ` Gautam Kachroo
2009-08-21 0:08 ` Gautam Kachroo