From: Yuval Shaia <yuval.shaia@oracle.com>
To: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] hw/rdma: Add support for GID state changes for non-qmp frameworks
Date: Sun, 26 May 2019 09:41:45 +0300 [thread overview]
Message-ID: <20190526064144.GA4309@lap1> (raw)
In-Reply-To: <87a35658-a636-4598-c860-cc73288922e2@gmail.com>
On Fri, May 24, 2019 at 08:24:30AM +0300, Marcel Apfelbaum wrote:
>
> Hi Yuval,
>
> On 5/5/19 1:55 PM, Yuval Shaia wrote:
> > Any GID change in guest must be propogate to host. This is already done
> > by firing QMP event to managment system such as libvirt which in turn
> > will update the host with the relevant change.
>
> Agreed, *any* management software can do that.
>
> >
> > When qemu is executed on non-qmp framework (ex from command-line) we
> > need to update the host instead.
> > Fix it by adding support to update the RoCE device's Ethernet function
> > IP list from qemu via netlink.
>
> I am not sure this is the right approach. I don't think QEMU should actively
> change
> the host network configuration.
> As you pointed out yourself, the management software should make such
> changes.
I know about few deployments that are not using any management software,
they fires their VMs right from command-line.
Currently those deployments cannot use pvrdma.
>
> I agree you cannot always assume the QEMU instance is managed by libvirt,
> what about adding this functionality to rdma-multiplexer? The multiplexer
> may
> register to the same QMP event.
Two reasons prevent us from doing this:
- rdmacm-mux is a specific MAD multiplexer for CM packets, lets do not add
management function to it.
- rdmacm-mux might be redundant when MAD multiplexer will be implemented in
kernel. So what then?
>
> Even if you think the multiplexer is not the right way to do it, even a
> simple bash script
> disguised as a systemd service can subscribe to the QMP event and make the
> change on the host.
>
> What do you think?
Another contrib app? A lightweight management software??
See, i do not have an argument if qemu policy is not allowing qemu process
to do external configuration (ex network). I'm just looking from a narrow
perspective of easy deployment - people sometimes runs qemu without libvirt
(or any other management software for that matter), if they want to use
pvrdma they are forced to install libvirt just for that.
>
> Thanks,
> Marcel
>
> >
> > Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
> > ---
> > configure | 6 ++++
> > hw/rdma/rdma_backend.c | 74 +++++++++++++++++++++++++++++++++++++++++-
> > 2 files changed, 79 insertions(+), 1 deletion(-)
> >
> > diff --git a/configure b/configure
> > index 5b183c2e39..1f707b1a62 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3132,6 +3132,8 @@ fi
> > cat > $TMPC <<EOF &&
> > #include <sys/mman.h>
> > +#include <libmnl/libmnl.h>
> > +#include <linux/rtnetlink.h>
> > int
> > main(void)
> > @@ -3144,10 +3146,13 @@ main(void)
> > }
> > EOF
> > +pvrdma_libs="-lmnl"
> > +
> > if test "$rdma" = "yes" ; then
> > case "$pvrdma" in
> > "")
> > if compile_prog "" ""; then
> > + libs_softmmu="$libs_softmmu $pvrdma_libs"
> > pvrdma="yes"
> > else
> > pvrdma="no"
> > @@ -3156,6 +3161,7 @@ if test "$rdma" = "yes" ; then
> > "yes")
> > if ! compile_prog "" ""; then
> > error_exit "PVRDMA is not supported since mremap is not implemented"
> > + " or libmnl-devel is not installed"
> > fi
> > pvrdma="yes"
> > ;;
> > diff --git a/hw/rdma/rdma_backend.c b/hw/rdma/rdma_backend.c
> > index 05f6b03221..bc57b1a624 100644
> > --- a/hw/rdma/rdma_backend.c
> > +++ b/hw/rdma/rdma_backend.c
> > @@ -16,6 +16,11 @@
> > #include "qemu/osdep.h"
> > #include "qapi/qapi-events-rdma.h"
> > +#include "linux/if_addr.h"
> > +#include "libmnl/libmnl.h"
> > +#include "linux/rtnetlink.h"
> > +#include "net/if.h"
> > +
> > #include <infiniband/verbs.h>
> > #include "contrib/rdmacm-mux/rdmacm-mux.h"
> > @@ -47,6 +52,61 @@ static void dummy_comp_handler(void *ctx, struct ibv_wc *wc)
> > rdma_error_report("No completion handler is registered");
> > }
> > +static int netlink_route_update(const char *ifname, union ibv_gid *gid,
> > + __u16 type)
> > +{
> > + char buf[MNL_SOCKET_BUFFER_SIZE];
> > + struct nlmsghdr *nlh;
> > + struct ifaddrmsg *ifm;
> > + struct mnl_socket *nl;
> > + int ret;
> > + uint32_t ipv4;
> > +
> > + nl = mnl_socket_open(NETLINK_ROUTE);
> > + if (!nl) {
> > + rdma_error_report("Fail to connect to netlink\n");
> > + return -EIO;
> > + }
> > +
> > + ret = mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID);
> > + if (ret < 0) {
> > + rdma_error_report("Fail to bind to netlink\n");
> > + goto out;
> > + }
> > +
> > + nlh = mnl_nlmsg_put_header(buf);
> > + nlh->nlmsg_type = type;
> > + nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
> > + nlh->nlmsg_seq = 1;
> > +
> > + ifm = mnl_nlmsg_put_extra_header(nlh, sizeof(*ifm));
> > + ifm->ifa_index = if_nametoindex(ifname);
> > + if (gid->global.subnet_prefix) {
> > + ifm->ifa_family = AF_INET6;
> > + ifm->ifa_prefixlen = 64;
> > + ifm->ifa_flags = IFA_F_PERMANENT;
> > + ifm->ifa_scope = RT_SCOPE_UNIVERSE;
> > + mnl_attr_put(nlh, IFA_ADDRESS, sizeof(*gid), gid);
> > + } else {
> > + ifm->ifa_family = AF_INET;
> > + ifm->ifa_prefixlen = 24;
> > + memcpy(&ipv4, (char *)&gid->global.interface_id + 4, sizeof(ipv4));
> > + mnl_attr_put(nlh, IFA_LOCAL, 4, &ipv4);
> > + }
> > +
> > + ret = mnl_socket_sendto(nl, nlh, nlh->nlmsg_len);
> > + if (ret < 0) {
> > + rdma_error_report("Fail to send msg to to netlink\n");
> > + goto out;
> > + }
> > +
> > + ret = 0;
> > +
> > +out:
> > + mnl_socket_close(nl);
> > + return ret;
> > +}
> > +
> > static inline void complete_work(enum ibv_wc_status status, uint32_t vendor_err,
> > void *ctx)
> > {
> > @@ -1123,7 +1183,13 @@ int rdma_backend_add_gid(RdmaBackendDev *backend_dev, const char *ifname,
> > gid->global.subnet_prefix,
> > gid->global.interface_id);
> > - return ret;
> > + /*
> > + * We ignore return value since operation might completed sucessfully
> > + * by the QMP consumer
> > + */
> > + netlink_route_update(ifname, gid, RTM_NEWADDR);
> > +
> > + return 0;
> > }
> > int rdma_backend_del_gid(RdmaBackendDev *backend_dev, const char *ifname,
> > @@ -1149,6 +1215,12 @@ int rdma_backend_del_gid(RdmaBackendDev *backend_dev, const char *ifname,
> > gid->global.subnet_prefix,
> > gid->global.interface_id);
> > + /*
> > + * We ignore return value since operation might completed sucessfully
> > + * by the QMP consumer
> > + */
> > + netlink_route_update(ifname, gid, RTM_DELADDR);
> > +
> > return 0;
> > }
>
next prev parent reply other threads:[~2019-05-26 6:57 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-05 10:55 [Qemu-devel] [PATCH] hw/rdma: Add support for GID state changes for non-qmp frameworks Yuval Shaia
2019-05-05 10:55 ` Yuval Shaia
2019-05-06 15:09 ` Eric Blake
2019-05-06 16:34 ` Yuval Shaia
2019-05-24 5:24 ` Marcel Apfelbaum
2019-05-26 6:41 ` Yuval Shaia [this message]
2019-05-26 10:09 ` Marcel Apfelbaum
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190526064144.GA4309@lap1 \
--to=yuval.shaia@oracle.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).