From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47644)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YQYQe-0004OK-Vf
    for qemu-devel@nongnu.org; Wed, 25 Feb 2015 04:38:54 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from )
    id 1YQYQa-0008NT-RD
    for qemu-devel@nongnu.org; Wed, 25 Feb 2015 04:38:52 -0500
Received: from szxga03-in.huawei.com ([119.145.14.66]:37796)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YQYQZ-0008NG-Kv
    for qemu-devel@nongnu.org; Wed, 25 Feb 2015 04:38:48 -0500
Message-ID: <54ED9814.10604@huawei.com>
Date: Wed, 25 Feb 2015 17:38:28 +0800
From: zhanghailiang
MIME-Version: 1.0
References: <1423711034-5340-1-git-send-email-zhang.zhanghailiang@huawei.com>
    <1423711034-5340-20-git-send-email-zhang.zhanghailiang@huawei.com>
    <20150216120341.GA2299@work-vm>
    <54ED4526.7000507@huawei.com>
    <20150225090843.GA2522@work-vm>
In-Reply-To: <20150225090843.GA2522@work-vm>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH RFC v3 19/27] COLO NIC: Implement colo nic device interface configure()
To: "Dr. David Alan Gilbert"
Cc: hangaohuai@huawei.com, Li Zhijian , yunhong.jiang@intel.com,
    eddie.dong@intel.com, peter.huangpeng@huawei.com, qemu-devel@nongnu.org,
    Gao feng , stefanha@redhat.com, pbonzini@redhat.com

On 2015/2/25 17:08, Dr. David Alan Gilbert wrote:
> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>> On 2015/2/16 20:03, Dr. David Alan Gilbert wrote:
>>> * zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
>>>> Implement colo nic device interface configure()
>>>> add a script to configure nic devices:
>>>> ${QEMU_SCRIPT_DIR}/colo-proxy-script.sh
>>>
>>> Do you have some more documentation of the new colo-proxy? I've
>>
>> Yes, gaofeng is writing it now...
>
> Great.
>
>>> been reading the kernel module source and I can see that it's
>>> a nice idea to do the sequence number adjustment on the host,
>>> that reduces the need to modify the guest kernel; I was trying to
>>> figure out how you synchronise the master/slave idea of sequence numbers -
>>> is that purely from the 'ack' that's duplicated back to the secondary?
>>
>> Yes, you've got it :)
>>
>>> If you were unlucky and the 'ack' packet was lost on the duplicated
>>> link from the primary to secondary how would you recover?
>>
>> The 'ack' packet will be considered lost, because the primary will not
>> respond to this 'ack' packet until it gets the secondary's response,
>> and the client will resend it (the 'ack' packet).
>>
>>> What about TCP connections setup before colo was activated?
>>>
>>
>> Actually, for now, we only support activating colo before the guest starts
>> (for the test procedure, '-S' is needed on the qemu command line).
>
> Consider this:
> 1) Start primary
> 2) Start secondary
> 3) Start the colo pairing
> 4) Primary fails
> 5) Colo failover to secondary
>
> Now we have only the old secondary running; we'd really like to get back to
> having a pair of fault-tolerant hosts, so it would be good to be able to:
>
> 6) Make the old secondary the new primary
> 7) Add a new secondary
> 8) Start colo-pairing to the new secondary
>

Er, what you described is continuous FT; yes, it is on our TODO list.

> You could theoretically do this with colo-agent, but not with colo-proxy.
>

We have decided to use colo-proxy; it has more advantages.

>>> The other thought is that passing the 'sec_dev' as a module parameter
>>> gives you an artificial limitation; it forces all of the pairs
>>> to be between the same pair of hosts. If the 'sec_dev' was a parameter
>>> to the connection then you could have different slaves associated with
>>> each guest on the primary host.
>>>
>>
>> Hmm, do you mean we should pass this 'sec_dev' as a parameter from qemu to
>> the proxy module, maybe via ioctl?
>
> Yes, ioctl or tc or whatever; and make it per-guest.
>

OK. Thanks.

>> Yes, it is ugly to pass this 'sec_dev' directly to the module as a parameter.
>> We will consider this, thanks ;)
>
> Thanks!
>
>>> Dave
>>> P.S. You probably need to clean the debug messages up in the kernel module!
>>>
>>
>> OK, will do that.
>
> Thanks.
>
> Dave
>
>>
>>>> Signed-off-by: zhanghailiang
>>>> Signed-off-by: Gao feng
>>>> Signed-off-by: Li Zhijian
>>>> ---
>>>>  net/colo-nic.c | 56 +++++++++++++++++++++++++++-
>>>>  scripts/colo-proxy-script.sh | 88 ++++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 143 insertions(+), 1 deletion(-)
>>>>  create mode 100755 scripts/colo-proxy-script.sh
>>>>
>>>> diff --git a/net/colo-nic.c b/net/colo-nic.c
>>>> index 965af49..f8fc35d 100644
>>>> --- a/net/colo-nic.c
>>>> +++ b/net/colo-nic.c
>>>> @@ -39,12 +39,66 @@ static bool colo_nic_support(NetClientState *nc)
>>>>      return nc && nc->colo_script[0] && nc->colo_nicname[0];
>>>>  }
>>>>
>>>> +static int launch_colo_script(char *argv[])
>>>> +{
>>>> +    int pid, status;
>>>> +    char *script = argv[0];
>>>> +
>>>> +    /* try to launch network script */
>>>> +    pid = fork();
>>>> +    if (pid == 0) {
>>>> +        execv(script, argv);
>>>> +        _exit(1);
>>>> +    } else if (pid > 0) {
>>>> +        while (waitpid(pid, &status, 0) != pid) {
>>>> +            /* loop */
>>>> +        }
>>>> +
>>>> +        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
>>>> +            return 0;
>>>> +        }
>>>> +    }
>>>> +    return -1;
>>>> +}
>>>> +
>>>> +static int colo_nic_configure(NetClientState *nc,
>>>> +                              bool up, int side, int index)
>>>> +{
>>>> +    int i, argc = 6;
>>>> +    char *argv[7], index_str[32];
>>>> +    char **parg;
>>>> +
>>>> +    if (!nc && index <= 0) {
>>>> +        error_report("Can not parse colo_script or colo_nicname");
>>>> +        return -1;
>>>> +    }
>>>> +
>>>> +    parg = argv;
>>>> +    *parg++ = nc->colo_script;
>>>> +    *parg++ = (char *)(side == COLO_SECONDARY_MODE ? "slave" : "master");
>>>> +    *parg++ = (char *)(up ? "install" : "uninstall");
>>>> +    *parg++ = nc->colo_nicname;
>>>> +    *parg++ = nc->ifname;
>>>> +    sprintf(index_str, "%d", index);
>>>> +    *parg++ = index_str;
>>>> +    *parg = NULL;
>>>> +
>>>> +    for (i = 0; i < argc; i++) {
>>>> +        if (!argv[i][0]) {
>>>> +            error_report("Can not get colo_script argument");
>>>> +            return -1;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return launch_colo_script(argv);
>>>> +}
>>>> +
>>>>  void colo_add_nic_devices(NetClientState *nc)
>>>>  {
>>>>      struct nic_device *nic = g_malloc0(sizeof(*nic));
>>>>
>>>>      nic->support_colo = colo_nic_support;
>>>> -    nic->configure = NULL;
>>>> +    nic->configure = colo_nic_configure;
>>>>      /*
>>>>       * TODO
>>>>       * only support "-netdev tap,colo_scripte..." options
>>>> diff --git a/scripts/colo-proxy-script.sh b/scripts/colo-proxy-script.sh
>>>> new file mode 100755
>>>> index 0000000..c7aa53f
>>>> --- /dev/null
>>>> +++ b/scripts/colo-proxy-script.sh
>>>> @@ -0,0 +1,88 @@
>>>> +#!/bin/sh
>>>> +#usage: ./colo-proxy-script.sh master/slave install/uninstall phy_if virt_if index
>>>> +#.e.g ./colo-proxy-script.sh master install eth2 tap0 1
>>>> +
>>>> +side=$1
>>>> +action=$2
>>>> +phy_if=$3
>>>> +virt_if=$4
>>>> +index=$5
>>>> +br=br1
>>>> +failover_br=br0
>>>> +
>>>> +script_usage()
>>>> +{
>>>> +    echo -n "usage: ./colo-proxy-script.sh master/slave "
>>>> +    echo -e "install/uninstall phy_if virt_if index\n"
>>>> +}
>>>> +
>>>> +master_install()
>>>> +{
>>>> +    tc qdisc add dev $virt_if root handle 1: prio
>>>> +    tc filter add dev $virt_if parent 1: protocol ip prio 10 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +    tc filter add dev $virt_if parent 1: protocol arp prio 11 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +    tc filter add dev $virt_if parent 1: protocol ipv6 prio 12 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +
>>>> +    modprobe nf_conntrack_ipv4
>>>> +    modprobe xt_PMYCOLO sec_dev=$phy_if
>>>> +
>>>> +    /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in $virt_if -j PMYCOLO --index $index
>>>> +    /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in $virt_if -j PMYCOLO --index $index
>>>> +    /usr/local/sbin/arptables -I INPUT -i $phy_if -j MARK --set-mark $index
>>>> +}
>>>> +
>>>> +master_uninstall()
>>>> +{
>>>> +    tc filter del dev $virt_if parent 1: protocol ip prio 10 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +    tc filter del dev $virt_if parent 1: protocol arp prio 11 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +    tc filter del dev $virt_if parent 1: protocol ipv6 prio 12 u32 match u32 0 0 flowid 1:2 action mirred egress mirror dev $phy_if
>>>> +    tc qdisc del dev $virt_if root handle 1: prio
>>>> +
>>>> +    /usr/local/sbin/iptables -t mangle -F
>>>> +    /usr/local/sbin/ip6tables -t mangle -F
>>>> +    /usr/local/sbin/arptables -F
>>>> +    rmmod xt_PMYCOLO
>>>> +}
>>>> +
>>>> +slave_install()
>>>> +{
>>>> +    brctl addif $br $phy_if
>>>> +    modprobe xt_SECCOLO
>>>> +
>>>> +    /usr/local/sbin/iptables -t mangle -I PREROUTING -m physdev --physdev-in $virt_if -j SECCOLO --index $index
>>>> +    /usr/local/sbin/ip6tables -t mangle -I PREROUTING -m physdev --physdev-in $virt_if -j SECCOLO --index $index
>>>> +}
>>>> +
>>>> +
>>>> +slave_uninstall()
>>>> +{
>>>> +    brctl delif $br $phy_if
>>>> +    brctl delif $br $virt_if
>>>> +    brctl addif $failover_br $virt_if
>>>> +
>>>> +    /usr/local/sbin/iptables -t mangle -F
>>>> +    /usr/local/sbin/ip6tables -t mangle -F
>>>> +    rmmod xt_SECCOLO
>>>> +}
>>>> +
>>>> +if [ $# -ne 5 ]; then
>>>> +    script_usage
>>>> +    exit 1
>>>> +fi
>>>> +
>>>> +if [ "x$side" != "xmaster" ] && [ "x$side" != "xslave" ]; then
>>>> +    script_usage
>>>> +    exit 2
>>>> +fi
>>>> +
>>>> +if [ "x$action" != "xinstall" ] && [ "x$action" != "xuninstall" ]; then
>>>> +    script_usage
>>>> +    exit 3
>>>> +fi
>>>> +
>>>> +if [ $index -lt 0 ] || [ $index -gt 100 ]; then
>>>> +    echo "index overflow"
>>>> +    exit 4
>>>> +fi
>>>> +
>>>> +${side}_${action}
>>>> --
>>>> 1.7.12.4
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> .
>
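
For reference, here is a rough sketch of how the script launch and argument setup from the quoted patch could be hardened. It is not the code in the series, only an illustration under some assumptions: the helper names run_script() and build_script_argv() are hypothetical, the waitpid() loop retries only on EINTR instead of spinning on any error, the argument check rejects a missing pointer or a non-positive index (using "or" rather than "and"), and snprintf() replaces sprintf(). Whether index 0 should be valid is also an assumption here; the script itself accepts 0 to 100.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with argv; 0 only if it exits with status 0. */
static int run_script(char *const argv[])
{
    int status;
    pid_t pid, ret;

    if (argv == NULL || argv[0] == NULL) {
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        return -1;                  /* fork failed */
    }
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);                 /* only reached if execv failed */
    }

    /* Retry only when interrupted by a signal; any other error ends the wait. */
    do {
        ret = waitpid(pid, &status, 0);
    } while (ret < 0 && errno == EINTR);

    if (ret == pid && WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        return 0;
    }
    return -1;
}

/*
 * Hypothetical helper: fill argv[0..6] in the order the script expects:
 * script, master/slave, install/uninstall, phy_if, virt_if, index, NULL.
 * Assumption: index <= 0 is treated as invalid.
 */
static int build_script_argv(char *argv[7], char *script, bool secondary,
                             bool up, char *nicname, char *ifname, int index,
                             char *index_str, size_t index_len)
{
    int i, n;

    if (!script || !nicname || !ifname || !index_str || index <= 0) {
        return -1;
    }

    n = snprintf(index_str, index_len, "%d", index);
    if (n < 0 || (size_t)n >= index_len) {
        return -1;                  /* index did not fit in the buffer */
    }

    argv[0] = script;
    argv[1] = (char *)(secondary ? "slave" : "master");
    argv[2] = (char *)(up ? "install" : "uninstall");
    argv[3] = nicname;
    argv[4] = ifname;
    argv[5] = index_str;
    argv[6] = NULL;

    /* Reject empty strings, as the original loop does. */
    for (i = 0; i < 6; i++) {
        if (argv[i][0] == '\0') {
            return -1;
        }
    }
    return 0;
}
```

A caller would build the vector with build_script_argv() and pass it to run_script(), reporting an error if either returns -1.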