* ipvsadm: One-packet scheduling with UDP service is unstable
@ 2013-08-20 15:06 Drunkard Zhang
2013-08-22 6:43 ` Julian Anastasov
0 siblings, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-20 15:06 UTC (permalink / raw)
To: Wensong Zhang, Simon Horman, Julian Anastasov, Pablo Neira Ayuso,
Patrick McHardy, Jozsef Kadlecsik, David S. Miller, netdev,
lvs-devel, netfilter-devel, netfilter, coreteam, linux-kernel
Need help here, thank you for replying :-)
I'm setting up a syslog cluster based on IPVS. All UDP datagrams are
sent from a firewall with a fixed source IP and a fixed source port, so
pseudo-random balancing based on client IP and port won't work. It also
seems that keepalived does not support the one-packet scheduling
option, so I applied a workaround after keepalived started:
1. dump LVS rules with ipvsadm -S -n > rules-vs3;
2. add --ops option;
3. restore LVS rules with ipvsadm-restore < rules-vs3;
4. dump the running LVS rules with ipvsadm -S -n
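The four steps above can be sketched as a small script. This is only an illustration: add_ops is a hypothetical helper name, and it assumes that the saved rules contain one "-A -u <addr:port> ..." line per UDP virtual service, with real servers on separate "-a" lines.

```shell
# Sketch only: add_ops is a hypothetical helper, not part of ipvsadm.
# It appends --ops to every UDP virtual-service line ("-A -u ...") that
# does not already carry --ops or -o; all other lines pass through.
add_ops() {
    awk '
        $1 == "-A" && $2 == "-u" {
            has = 0
            for (i = 3; i <= NF; i++)
                if ($i == "--ops" || $i == "-o") has = 1
            if (!has) $0 = $0 " --ops"
        }
        { print }
    '
}

# Intended use on the director (needs root and ipvsadm, so left commented):
#   ipvsadm -S -n > rules-vs3
#   add_ops < rules-vs3 > rules-vs3.ops
#   ipvsadm-restore < rules-vs3.ops
#   ipvsadm -S -n
```

Writing the edited rules to a second file (rules-vs3.ops here, an assumed name) keeps the original dump available for comparison with the dump taken in step 4.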
This left me with two problems:
1. The rules dumped in step 4 are not usable: the double dash in --ops
is lost, so I can't restore the rules from this dump. This must be a
bug.
2. The --ops option only sometimes takes effect after the rules are
applied; most of the time it does not. To make it work, I run
'ipvsadm-restore < rules-vs3' many times until it does. I haven't found
a pattern for when it works yet. And that is the lucky case: on the
second host I can't get it to work at all.
"Not working" above means that the UDP datagrams from one source IP
stick to one realserver instead of being distributed across the other
realservers, which is what --ops is designed for.
So I'm wondering whether ipvs needs some CONFIG_* options, or whether
recent development broke the code?
Here are my hosts, both running up-to-date Gentoo. This is the second
host, where --ops doesn't work at all:
vs3 ~ # uname -r
3.10.7-gentoo
vs3 ~ # lspci |grep net
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716
Gigabit Ethernet (rev 20)
01:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716
Gigabit Ethernet (rev 20)
vs3 ~ # emerge --info ipvsadm keepalived
Portage 2.1.12.2 (default/linux/amd64/13.0, gcc-4.6.3, glibc-2.15-r3,
3.10.7-gentoo x86_64)
=================================================================
System Settings
=================================================================
System uname: Linux-3.10.7-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_E5620_@_2.40GHz-with-gentoo-2.2
KiB Mem: 16423692 total, 15907924 free
KiB Swap: 0 total, 0 free
Timestamp of tree: Mon, 19 Aug 2013 21:30:01 +0000
ld GNU ld (GNU Binutils) 2.23.1
app-shells/bash: 4.2_p45
dev-lang/python: 2.7.5, 3.2.5-r1
dev-util/pkgconfig: 0.28
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.11.8
sys-apps/sandbox: 2.6-r1
sys-devel/autoconf: 2.69
sys-devel/automake: 1.12.6
sys-devel/binutils: 2.23.1
sys-devel/gcc: 4.6.3
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r4
sys-kernel/linux-headers: 3.7 (virtual/os-headers)
sys-libs/glibc: 2.15-r3
Repositories: gentoo
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=corei7 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf
/etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=corei7 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified
distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch
preserve-libs protect-owned sandbox sfperms strict
unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -pipe"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j17"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times
--compress --force --whole-file --delete --stats --human-readable
--timeout=180 --exclude=/distfiles --exclude=/local
--exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
USE="acl acpi aio amd64 bash-completion berkdb bzip2 cli cracklib
crypt cxx dri fortran gdbm iconv ipv6 mmap mmx modules mudflap
multilib ncurses nls nptl openmp pam pcre readline session smp sse
sse2 ssl ssse3 tcpd threads unicode vim-syntax zlib" ABI_X86="64"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci
emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0
intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions
alias auth_basic authn_alias authn_anon authn_dbm authn_default
authn_file authz_dbm authz_default authz_groupfile authz_host
authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock
deflate dir disk_cache env expires ext_filter file_cache filter
headers include info log_config logio mem_cache mime mime_magic
negotiation rewrite setenvif speling status unique_id userdir
usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets
stage tables krita karbon braindump author" CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18
garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver
oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip
tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb
ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console
presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php5-4" PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_2" RUBY_TARGETS="ruby19 ruby18"
USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv
r128 radeon savage sis tdfx trident vesa via vmware dummy v4l"
XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset
ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat
logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL,
PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS,
PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
=================================================================
Package Settings
=================================================================
sys-cluster/ipvsadm-1.26-r2 was built with the following:
USE="(multilib) -static-libs" ABI_X86="64"
sys-cluster/keepalived-1.2.2-r4 was built with the following:
USE="ipv6 (multilib) -debug" ABI_X86="64"
And this is the host that --ops works occasionally:
vs4 ~ # uname -r
3.10.7-gentoo
vs4 ~ # lspci |grep net
05:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
Ethernet Controller (Copper) (rev 01)
05:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit
Ethernet Controller (Copper) (rev 01)
vs4 ~ # emerge --info ipvsadm keepalived
Portage 2.1.12.2 (default/linux/amd64/13.0, gcc-4.6.3, glibc-2.15-r3,
3.10.7-gentoo x86_64)
=================================================================
System Settings
=================================================================
System uname: Linux-3.10.7-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_E5405_@_2.00GHz-with-gentoo-2.2
KiB Mem: 4046544 total, 3192820 free
KiB Swap: 0 total, 0 free
Timestamp of tree: Fri, 16 Aug 2013 21:30:01 +0000
ld GNU ld (GNU Binutils) 2.23.1
app-shells/bash: 4.2_p45
dev-lang/python: 2.7.5, 3.2.5-r1
dev-util/pkgconfig: 0.28
sys-apps/baselayout: 2.2
sys-apps/openrc: 0.11.8
sys-apps/sandbox: 2.6-r1
sys-devel/autoconf: 2.69
sys-devel/automake: 1.12.6
sys-devel/binutils: 2.23.1
sys-devel/gcc: 4.6.3
sys-devel/gcc-config: 1.7.3
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r4
sys-kernel/linux-headers: 3.7 (virtual/os-headers)
sys-libs/glibc: 2.15-r3
Repositories: gentoo
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf
/etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=core2 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified
distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch
preserve-libs protect-owned sandbox sfperms strict
unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -pipe"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times
--compress --force --whole-file --delete --stats --human-readable
--timeout=180 --exclude=/distfiles --exclude=/local
--exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
USE="acl acpi aio amd64 bash-completion berkdb bzip2 cli cracklib
crypt cxx dri fortran gdbm iconv ipv6 mmap mmx modules mudflap
multilib ncurses nls nptl openmp pam pcre readline session smp sse
sse2 ssl ssse3 threads unicode vim-syntax zlib" ABI_X86="64"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci
emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0
intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions
alias auth_basic authn_alias authn_anon authn_dbm authn_default
authn_file authz_dbm authz_default authz_groupfile authz_host
authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock
deflate dir disk_cache env expires ext_filter file_cache filter
headers include info log_config logio mem_cache mime mime_magic
negotiation rewrite setenvif speling status unique_id userdir
usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets
stage tables krita karbon braindump author" CAMERAS="ptp2"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18
garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver
oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip
tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb
ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console
presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php5-4" PYTHON_SINGLE_TARGET="python2_7"
PYTHON_TARGETS="python2_7 python3_2" RUBY_TARGETS="ruby19 ruby18"
USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv
r128 radeon savage sis tdfx trident vesa via vmware dummy v4l"
XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset
ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat
logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL,
PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS,
PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
=================================================================
Package Settings
=================================================================
sys-cluster/ipvsadm-1.26-r2 was built with the following:
USE="(multilib) -static-libs" ABI_X86="64"
sys-cluster/keepalived-1.2.2-r4 was built with the following:
USE="ipv6 (multilib) -debug" ABI_X86="64"
Do I need to attach kernel config file?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-20 15:06 ipvsadm: One-packet scheduling with UDP service is unstable Drunkard Zhang
@ 2013-08-22 6:43 ` Julian Anastasov
2013-08-22 10:58 ` Drunkard Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Julian Anastasov @ 2013-08-22 6:43 UTC (permalink / raw)
To: Drunkard Zhang
Cc: Wensong Zhang, Simon Horman, Pablo Neira Ayuso, Patrick McHardy,
Jozsef Kadlecsik, David S. Miller, netdev, lvs-devel, netfilter-devel,
netfilter, coreteam, linux-kernel

Hello,

On Tue, 20 Aug 2013, Drunkard Zhang wrote:

> Need help here, thank you for replying :-)
>
> I'm setting up a syslog cluster based on IPVS, all UDP datagrams sent
> from firewall with fixed source IP and fixed source port, so
> pseudo-random balancing based on client IP and port won't working. And
> it seems that keepalived is not supporting One-packet scheduling
> option, so I did some hacks on it after keepalived started:
>
> 1. dump LVS rules with ipvsadm -S -n > rules-vs3;
> 2. add --ops option;
> 3. restore LVS rules with ipvsadm-restore < rules-vs3;
> 4. dump the running LVS rules with ipvsadm -S -n
>
> So, I got two problems here:
>
> 1. Dumped rules in step 4 above is not usable anymore, the double-dash
> in --ops lost, so I can't restore rule with this dump anymore. This
> must be a bug.
>
> 2. The --ops option is not working sometimes you applied the rules,
> and in most of times the --ops just not working. To make it work, just
> 'ipvsadm-restore < rules-vs3' for plenty of times until it's working.
> I haven't find the patterns make it work yet. This is lucky, I can't
> get it work on second host at all.
>
> The "not working" above means the UDP datagrams from one source IP is
> sticked to one realserver, it doesn't distribute to other realservers
> which --ops designed for.

Can you try with a recent ipvsadm from git:

git clone git://git.kernel.org/pub/scm/utils/kernel/ipvsadm/ipvsadm.git

I see a related commit that prints -o for the OPS feature:

===
commit 6a03100c189d00e3a8235215392465b5b877ba8f
Author: Krzysztof Gajdemski <songo@debian.org.pl>
Date:   Thu Mar 21 11:40:06 2013 +0100

    ipvsadm: Fix wrong format of -o option in FMT_RULE listing

    'ipvsadm -S' listed one-packet scheduling option in wrong format
    ('ops' instead of '--ops' or '-o') preventing any service with OPS
    feature from restoring using 'ipvsadm -R'. Now we use '-o' which
    works well with save/restore commands.

    Signed-off-by: Krzysztof Gajdemski <songo@debian.org.pl>
    Signed-off-by: Simon Horman <horms@verge.net.au>
===

Let me know if you still have any problems with OPS.
Sending to lvs-devel@vger.kernel.org and
lvs-users@linuxvirtualserver.org should be enough for
ipvsadm-related discussions.

> So I wondering if there's some CONFIG_* options that ipvs needs, or
> recent development broke the code?

No kernel options should be related to OPS. I assume
you are not using the SH scheduler. Make sure the OPS mode
is properly applied to the virtual service; check for "ops"
in the configuration:

cat /proc/net/ip_vs

> Do I need to attach kernel config file?

No

Regards

--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 15+ messages in thread
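
The check suggested above (look for "ops" in /proc/net/ip_vs) can be automated after every restore. The following is a minimal sketch; list_ops_state is a hypothetical helper name, and it assumes the listing layout shown later in this thread, where a virtual-service line reads "UDP <addr:port> <scheduler> <flags...>".

```shell
# Sketch only: list_ops_state is a hypothetical helper.
# Reads /proc/net/ip_vs-style text on stdin and prints one line per
# virtual service: protocol, address:port, and "ops" or "no-ops".
list_ops_state() {
    awk '
        $1 == "TCP" || $1 == "UDP" || $1 == "SCTP" {
            state = "no-ops"
            # Field 3 is the scheduler; flags start at field 4.
            for (i = 4; i <= NF; i++)
                if ($i == "ops") state = "ops"
            print $1, $2, state
        }
    '
}

# On the director:
#   list_ops_state < /proc/net/ip_vs
```

Running it right after each ipvsadm-restore shows whether the flag reached the kernel, independently of how ipvsadm prints it.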
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-22 6:43 ` Julian Anastasov
@ 2013-08-22 10:58 ` Drunkard Zhang
2013-08-22 14:14 ` Julian Anastasov
0 siblings, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-22 10:58 UTC (permalink / raw)
To: Julian Anastasov
Cc: Wensong Zhang, Simon Horman, Pablo Neira Ayuso, Patrick McHardy,
Jozsef Kadlecsik, David S. Miller, netdev, lvs-devel, netfilter-devel,
netfilter, coreteam, linux-kernel

2013/8/22 Julian Anastasov <ja@ssi.bg>:
>
> Hello,
>
> On Tue, 20 Aug 2013, Drunkard Zhang wrote:
>
>> Need help here, thank you for replying :-)
>>
>> I'm setting up a syslog cluster based on IPVS, all UDP datagrams sent
>> from firewall with fixed source IP and fixed source port, so
>> pseudo-random balancing based on client IP and port won't working. And
>> it seems that keepalived is not supporting One-packet scheduling
>> option, so I did some hacks on it after keepalived started:
>>
>> 1. dump LVS rules with ipvsadm -S -n > rules-vs3;
>> 2. add --ops option;
>> 3. restore LVS rules with ipvsadm-restore < rules-vs3;
>> 4. dump the running LVS rules with ipvsadm -S -n
>>
>> So, I got two problems here:
>>
>> 1. Dumped rules in step 4 above is not usable anymore, the double-dash
>> in --ops lost, so I can't restore rule with this dump anymore. This
>> must be a bug.
>>
>> 2. The --ops option is not working sometimes you applied the rules,
>> and in most of times the --ops just not working. To make it work, just
>> 'ipvsadm-restore < rules-vs3' for plenty of times until it's working.
>> I haven't find the patterns make it work yet. This is lucky, I can't
>> get it work on second host at all.
>>
>> The "not working" above means the UDP datagrams from one source IP is
>> sticked to one realserver, it doesn't distribute to other realservers
>> which --ops designed for.
>
> Can you try with recent ipvsadm from git:
>
> git clone git://git.kernel.org/pub/scm/utils/kernel/ipvsadm/ipvsadm.git
>
> I see related commit that will print -o for
> the OPS feature:
>
> ===
> commit 6a03100c189d00e3a8235215392465b5b877ba8f
> Author: Krzysztof Gajdemski <songo@debian.org.pl>
> Date: Thu Mar 21 11:40:06 2013 +0100
>
> ipvsadm: Fix wrong format of -o option in FMT_RULE listing
>
> 'ipvsadm -S' listed one-packet scheduling option in wrong format
> ('ops' instead of '--ops' or '-o') preventing any service with OPS
> feature from restoring using 'ipvsadm -R'. Now we use '-o' which
> works well with save/restore commands.
>
> Signed-off-by: Krzysztof Gajdemski <songo@debian.org.pl>
> Signed-off-by: Simon Horman <horms@verge.net.au>
> ===
>
> Let me know if you still have any problems with OPS.
> Sending to lvs-devel@vger.kernel.org and
> lvs-users@linuxvirtualserver.org should be enough for
> ipvsadm related discussions.

Thanks, this resolved my first problem :D

>> So I wondering if there's some CONFIG_* options that ipvs needs, or
>> recent development broke the code?
>
> No kernel options should be related to OPS. I assume
> you are not using the SH scheduler. Make sure the OPS mode
> is properly applied to the virtual service, check for "ops"
> in the configuration:
>
> cat /proc/net/ip_vs

Still no luck here: ops is set in the running config, but in practice
it does not take effect.

vs3 ~ # cat /proc/net/ip_vs
IP Virtual Server version 1.2.1 (size=1024)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
UDP  96A46478:0202 wrr ops
  -> 96A46459:0202      Route   0      0          0
  -> 96A46458:0202      Route   0      0          0
  -> 96A46457:0202      Route   0      0          0
  -> 96A46456:0202      Route   0      0          0
  -> 96A46455:0202      Route   0      0          0
  -> 96A46454:0202      Route   0      0          0
  -> 96A46453:0202      Route   0      0          0
  -> 96A46452:0202      Route   0      0          0
  -> 96A46451:0202      Route   0      0          0
  -> 96A46450:0202      Route   25     0          1
  -> 96A4644F:0202      Route   25     0          1
  -> 96A4644E:0202      Route   25     0          1
  -> 96A4644D:0202      Route   30     0          2
  -> 96A4644C:0202      Route   20     0          1
  -> 96A4644B:0202      Route   20     0          1
  -> 96A4644A:0202      Route   25     0          1
  -> 96A46449:0202      Route   20     0          1
  -> 96A46448:0202      Route   25     0          1
  -> 96A46447:0202      Route   20     0          1
  -> 96A46446:0202      Route   20     0          1
  -> 96A46445:0202      Route   20     0          1
  -> 96A46444:0202      Route   25     0          1
  -> 96A46443:0202      Route   15     0          1
  -> 96A46442:0202      Route   20     0          1
  -> 96A46441:0202      Route   20     0          1

The traffic routed to each realserver doesn't follow the weights I
set; it's routed pretty much one to one. I have 17 UDP sources sending
to 16 different realservers; the others are bound to another VIP.

Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
  -> RemoteAddress:Port
UDP  x.x.x.120:514                       0    67622        0 12339373        0
  -> x.x.x.65:514                        0       29        0     2895        0
  -> x.x.x.66:514                        0      225        0    21850        0
  -> x.x.x.67:514                        0     4003        0   586117        0
  -> x.x.x.68:514                        0     5049        0   781526        0
  -> x.x.x.69:514                        0      160        0    16163        0
  -> x.x.x.70:514                        0     6091        0   914365        0
  -> x.x.x.71:514                        0      757        0    74428        0
  -> x.x.x.72:514                        0     4716        0   736039        0
  -> x.x.x.73:514                        0     4167        0   663728        0
  -> x.x.x.74:514                        0     3800        0   571342        0
  -> x.x.x.75:514                        0      192        0    19467        0
  -> x.x.x.76:514                        0    11309        0  1889147        0
  -> x.x.x.77:514                        0     3052        0   309840        0
  -> x.x.x.78:514                        0     8336        0  2004194        0
  -> x.x.x.79:514                        0     7333        0  1747346        0
  -> x.x.x.80:514                        0     8403        0  2000929        0
  -> x.x.x.81:514                        0        0        0        0        0
  -> x.x.x.82:514                        0        0        0        0        0
  -> x.x.x.83:514                        0        0        0        0        0
  -> x.x.x.84:514                        0        0        0        0        0
  -> x.x.x.85:514                        0        0        0        0        0
  -> x.x.x.86:514                        0        0        0        0        0
  -> x.x.x.87:514                        0        0        0        0        0
  -> x.x.x.88:514                        0        0        0        0        0
  -> x.x.x.89:514                        0        0        0        0        0
^ permalink raw reply [flat|nested] 15+ messages in thread
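
The /proc/net/ip_vs listing in this message prints IPv4 addresses and ports in hexadecimal, which makes it awkward to match against the dotted addresses in ipvsadm output. A minimal sketch of the conversion follows; hex_to_addr is a hypothetical helper name and the address in the example is made up for illustration.

```shell
# Sketch only: hex_to_addr is a hypothetical helper.
# Converts an "HHHHHHHH:PPPP" pair as printed in /proc/net/ip_vs
# (IPv4 only) to dotted-quad:port notation.
hex_to_addr() {
    hex=${1%%:*}     # eight hex digits of the IPv4 address
    port=${1##*:}    # four hex digits of the port
    a=$(printf '%s' "$hex" | cut -c1-2)
    b=$(printf '%s' "$hex" | cut -c3-4)
    c=$(printf '%s' "$hex" | cut -c5-6)
    d=$(printf '%s' "$hex" | cut -c7-8)
    printf '%d.%d.%d.%d:%d\n' "0x$a" "0x$b" "0x$c" "0x$d" "0x$port"
}

hex_to_addr C0A80001:0202   # prints 192.168.0.1:514
```

The port 0202 in the listing decodes to 514, the syslog port shown in the rate table.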
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-22 10:58 ` Drunkard Zhang
@ 2013-08-22 14:14 ` Julian Anastasov
2013-08-22 23:24 ` Drunkard Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Julian Anastasov @ 2013-08-22 14:14 UTC (permalink / raw)
To: Drunkard Zhang
Cc: Wensong Zhang, Simon Horman, Pablo Neira Ayuso, Patrick McHardy,
Jozsef Kadlecsik, David S. Miller, netdev, lvs-devel, netfilter-devel,
netfilter, coreteam, linux-kernel

Hello,

On Thu, 22 Aug 2013, Drunkard Zhang wrote:

> 2013/8/22 Julian Anastasov <ja@ssi.bg>:
> >
> > No kernel options should be related to OPS. I assume
> > you are not using the SH scheduler. Make sure the OPS mode
> > is properly applied to the virtual service, check for "ops"
> > in the configuration:
> >
> > cat /proc/net/ip_vs
>
> Still no lucky here, ops is set in running config, but it's not like
> that in real world.
>
> vs3 ~ # cat /proc/net/ip_vs
> IP Virtual Server version 1.2.1 (size=1024)
> Prot LocalAddress:Port Scheduler Flags
>   -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> UDP  96A46478:0202 wrr ops
>   -> 96A46450:0202      Route   25     0          1

The OPS connections are accounted in InActConn
for a very short period; they live up to 1 jiffie, e.g. 10 ms.
Also, WRR should be reliable for OPS while other
schedulers (e.g. *LC) are not suitable.

> And the traffic routed to each realserver didn't following weight I
> set, it's routed pretty much one to one. I got 17 udp sources sending
> to 16 different realservers, the others are bonding to another VIP.
>
> Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
>   -> RemoteAddress:Port
> UDP  x.x.x.120:514                       0    67622        0 12339373        0
>   -> x.x.x.65:514                        0       29        0     2895        0
>   -> x.x.x.66:514                        0      225        0    21850        0

Do you see the same problem with ipvsadm -Ln --stats ?
ipvsadm -Z may be needed to zero the stats after restoring all
rules. The "Conns" counter in the stats should follow the WRR
weights; it shows the scheduler decisions.

In your rates listing CPS 0 is confusing, even for OPS.
Is it from the new ipvsadm?

Regards

--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 15+ messages in thread
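
Since the "Conns" counters are expected to follow the WRR weights, it can help to compute the share each realserver should receive and compare it with the zeroed stats. A minimal sketch follows; wrr_shares is a hypothetical helper name, and the weights in the example are illustrative rather than the full set from this thread.

```shell
# Sketch only: wrr_shares is a hypothetical helper.
# Reads one weight per line on stdin and prints each weight with the
# percentage of new connections WRR is expected to give that server.
wrr_shares() {
    awk '
        { w[NR] = $1; total += $1 }
        END {
            if (total == 0) exit 1   # no server is eligible
            for (i = 1; i <= NR; i++)
                printf "%d %.1f%%\n", w[i], 100 * w[i] / total
        }
    '
}

printf '%s\n' 25 30 20 0 | wrr_shares
# prints:
#   25 33.3%
#   30 40.0%
#   20 26.7%
#   0 0.0%
```

A realserver with weight 0 receives no new connections, so it should show 0 in "Conns" after ipvsadm -Z even when OPS works.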
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-22 14:14 ` Julian Anastasov
@ 2013-08-22 23:24 ` Drunkard Zhang
[not found] ` <alpine.LFD.2.00.1308231708450.1852@ja.ssi.bg>
0 siblings, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-22 23:24 UTC (permalink / raw)
To: Julian Anastasov
Cc: Wensong Zhang, Simon Horman, Pablo Neira Ayuso, Patrick McHardy,
Jozsef Kadlecsik, David S. Miller, netdev, lvs-devel, netfilter-devel,
netfilter, coreteam, linux-kernel

2013/8/22 Julian Anastasov <ja@ssi.bg>:
>
> Hello,
>
> On Thu, 22 Aug 2013, Drunkard Zhang wrote:
>
>> 2013/8/22 Julian Anastasov <ja@ssi.bg>:
>> >
>> > No kernel options should be related to OPS. I assume
>> > you are not using the SH scheduler. Make sure the OPS mode
>> > is properly applied to the virtual service, check for "ops"
>> > in the configuration:
>> >
>> > cat /proc/net/ip_vs
>>
>> Still no lucky here, ops is set in running config, but it's not like
>> that in real world.
>>
>> vs3 ~ # cat /proc/net/ip_vs
>> IP Virtual Server version 1.2.1 (size=1024)
>> Prot LocalAddress:Port Scheduler Flags
>>   -> RemoteAddress:Port Forward Weight ActiveConn InActConn
>> UDP  96A46478:0202 wrr ops
>
>>   -> 96A46450:0202      Route   25     0          1
>
> The OPS connections are accounted in InActConn
> for a very short period, they live up to 1 jiffie, eg. 10ms.
> Also, WRR should be reliable for OPS while other
> schedulers (eg. *LC) are not suitable.

I noticed this too. While ops is working, InActConn keeps changing;
when it stays fixed, ops is not working.

>> And the traffic routed to each realserver didn't following weight I
>> set, it's routed pretty much one to one. I got 17 udp sources sending
>> to 16 different realservers, the others are bonding to another VIP.
>>
>> Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
>>   -> RemoteAddress:Port
>> UDP  x.x.x.120:514                       0    67622        0 12339373        0
>>   -> x.x.x.65:514                        0       29        0     2895        0
>>   -> x.x.x.66:514                        0      225        0    21850        0
>
> Do you see the same problem with ipvsadm -Ln --stats ?
> ipvsadm -Z may be needed to zero the stats after restoring all
> rules. "Conns" counter in stats should be according to WRR
> weights, it shows the scheduler decisions.

After every restore the stats are zeroed as well, right? Still, ops is
not working.

vs3 ~/pkgs # ./ipvsadm -Z
vs3 ~/pkgs # ./ipvsadm -ln --stats -u [snipped]
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
UDP  x.x.x.120:514                       0 12497040        0    2572M        0
  -> x.x.x.65:514                        0     3975        0   394171        0
  -> x.x.x.66:514                        0    48466        0  4835716        0
  -> x.x.x.67:514                        0   407051        0 58479621        0
  -> x.x.x.68:514                        0   561120        0 85289892        0
  -> x.x.x.69:514                        0    30958        0  3120506        0
  -> x.x.x.70:514                        0   645475        0  100552K        0
  -> x.x.x.71:514                        0   147228        0 14560649        0
  -> x.x.x.72:514                        0   535693        0 84069390        0
  -> x.x.x.73:514                        0   564787        0 88165140        0
  -> x.x.x.74:514                        0   346734        0 53256088        0
  -> x.x.x.75:514                        0    47232        0  4801578        0
  -> x.x.x.76:514                        0  1175288        0  192699K        0
  -> x.x.x.77:514                        0   254915        0 25939720        0
  -> x.x.x.78:514                        0  2701531        0  652417K        0
  -> x.x.x.79:514                        0  2426686        0  573897K        0
  -> x.x.x.80:514                        0  2599901        0  629793K        0
  -> x.x.x.81:514                        0        0        0        0        0
  -> x.x.x.82:514                        0        0        0        0        0
  -> x.x.x.83:514                        0        0        0        0        0
  -> x.x.x.84:514                        0        0        0        0        0
  -> x.x.x.85:514                        0        0        0        0        0
  -> x.x.x.86:514                        0        0        0        0        0
  -> x.x.x.87:514                        0        0        0        0        0
  -> x.x.x.88:514                        0        0        0        0        0
  -> x.x.x.89:514                        0        0        0        0        0

> In your rates listing CPS 0 is confusing, even for OPS.
> Is it from the new ipvsadm?

Yes, latest git version. When CPS is changing, ops works; otherwise it
does not.
^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <alpine.LFD.2.00.1308231708450.1852@ja.ssi.bg>]
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
[not found] ` <alpine.LFD.2.00.1308231708450.1852@ja.ssi.bg>
@ 2013-08-24 11:14 ` Drunkard Zhang
2013-08-24 13:17 ` Julian Anastasov
0 siblings, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-24 11:14 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

2013/8/23 Julian Anastasov <ja@ssi.bg>:
>
> Hello,
>
> On Fri, 23 Aug 2013, Drunkard Zhang wrote:
>
>> 2013/8/22 Julian Anastasov <ja@ssi.bg>:
>> >
>> > for a very short period, they live up to 1 jiffie, eg. 10ms.
>> > Also, WRR should be reliable for OPS while other
>> > schedulers (eg. *LC) are not suitable.
>>
>> I noticed this too. While ops working, the InActConn is always
>> changing too, if it's fixed, the ops is not working.
>
> All traffic to director stops?

No, ingress traffic keeps flowing.

>> > Do you see the same problem with ipvsadm -Ln --stats ?
>> > ipvsadm -Z may be needed to zero the stats after restoring all
>> > rules. "Conns" counter in stats should be according to WRR
>> > weights, it shows the scheduler decisions.
>>
>> After every restore, the stats also zeroed, right? While, ops still not working.
>
> Yes:
>
> 1. Configure/Restore rules
> 2. Zero stats: ./ipvsadm -Z
> 3. sleep 5
> 4. ./ipvsadm -ln --stats
>
>> vs3 ~/pkgs # ./ipvsadm -Z
>> vs3 ~/pkgs # ./ipvsadm -ln --stats -u [snipped]
>> Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
>>   -> RemoteAddress:Port
>> UDP  x.x.x.120:514                       0 12497040        0    2572M        0
>>   -> x.x.x.65:514                        0     3975        0   394171        0
>
> It is really strange to have Conns=0 in stats.
> I just tested OPS on 32-bit x86 (virtualbox) with plain
> 3.10 kernel and there is no problem with CPS, Conns, etc,
> WRR is scheduling according to weights. Do you have a
> daemon that changes weights dynamically?

I'm running an x86_64 kernel. I compared the kernel configs of my two
servers; a big difference between them is CONFIG_PREEMPT.
While CONFIG_PREEMPT is disabled, running "ipvsadm -C && ipvsadm -R <
rules-with-ops" many times will eventually succeed, but with
CONFIG_PREEMPT enabled it is very hard to get --ops to work. I will
test again on my "good" server another day to confirm my guess.

Is there any good debug method for this? Tuning
/proc/sys/net/ipv4/vs/debug_level didn't give me much.

I use keepalived to manage the ipvs configuration, but as long as the
VRRP heartbeat keeps going and no realserver goes up or down, it won't
interact with ipvs, right? So I can temporarily modify the ipvs rules
via ipvsadm after keepalived has started; the modified rules did not
change over time, and neither did the --ops setting.

> More things to check:
>
> - if traffic stops check if some real server is hijacking the
> traffic from director due to ARP problem in the real server.
> Or explain how exactly OPS stops to work, do you see other
> traffic for the VIP coming to director during such problem?
>
Not possible; I configured the VIP on lo of the realservers.

for IP in $VIP; do
    ip addr add $IP/32 dev $VIP_NIC brd $IP
done
sysctl -q -w net.ipv4.conf.lo.arp_ignore=1
sysctl -q -w net.ipv4.conf.lo.arp_announce=2
sysctl -q -w net.ipv4.conf.all.arp_ignore=1
sysctl -q -w net.ipv4.conf.all.arp_announce=2

> - Build ipvsadm with 'make HAVE_NL=0' to check if Conns=0 problem
> in --stats output is netlink related. This builds ipvsadm without
> netlink support but use this binary only to see stats, not
> for configuration.
>
> - show output from 'cat /proc/net/ip_vs_stats_percpu' to see
> the kernel's stats and rates. Note that these stats are not
> zeroed while stats in /proc/net/ip_vs_stats are zeroed.

Always changing.

vs3 ~ # cat /proc/net/ip_vs_stats_percpu
       Total Incoming Outgoing         Incoming         Outgoing
CPU    Conns  Packets  Packets            Bytes            Bytes
  0 8F11751F 70455AB5        0      10AA672610D                0
  1 1A780554 1A780554        0        E2AB71BCA                0
  2        0        0        0                0                0
  3   BF0E0B   BF0E0B        0         4B7E409C                0
  4 244BAF54 244BAF54        0       2224071265                0
  5 2360B25C 2360B25B        0       1715A45DB3                0
  6        0        0        0                0                0
  7   E88FEF   E88FEF        0         6ECC3067                0
  8 1E2477AE 1E2477AE        0       12726CDE2E                0
  9 10BD4D97 10BD4D97        0        A35650024                0
  A  BE81916  BE81914        0        6D9FD6CEF                0
  B 4474D837 4474D836        0       3FCEC43B56                0
  C        0        0        0                0                0
  D        0        0        0                0                0
  E        0        0        0                0                0
  F        0        0        0                0                0
  ~ 721BAF1B 534F94AD        0      1B61556B50B                0

     Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
       1120F    1120F        0           C1FEB1                0
^ permalink raw reply [flat|nested] 15+ messages in thread
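
The counters in /proc/net/ip_vs_stats_percpu are printed in hexadecimal, so the rate line is easier to compare with ipvsadm --rate once it is converted to decimal. A minimal sketch follows; hex_to_dec is a hypothetical helper name, and the arguments are the rate values from the listing above.

```shell
# Sketch only: hex_to_dec is a hypothetical helper.
# Converts each hexadecimal argument to decimal and prints the results
# on one line, separated by single spaces.
hex_to_dec() {
    out=
    for f in "$@"; do
        out="$out${out:+ }$(printf '%d' "0x$f")"
    done
    printf '%s\n' "$out"
}

hex_to_dec 1120F 1120F 0 C1FEB1 0   # prints 70159 70159 0 12713649 0
```

Read this way, the kernel reports roughly 70159 connections and packets per second in that sample, which is the figure to hold against the CPS column printed by ipvsadm.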
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 11:14 ` Drunkard Zhang
@ 2013-08-24 13:17 ` Julian Anastasov
2013-08-24 14:15 ` Drunkard Zhang
2013-08-26 2:07 ` Drunkard Zhang
0 siblings, 2 replies; 15+ messages in thread
From: Julian Anastasov @ 2013-08-24 13:17 UTC (permalink / raw)
To: Drunkard Zhang; +Cc: lvs-devel

Hello,

On Sat, 24 Aug 2013, Drunkard Zhang wrote:

> I'm running x86_64 kernel. I compared kernel config of my two servers,
> a big difference between them is CONFIG_PREEMPT. While CONFIG_PREEMPT
> is disabled, trying plenty times of "ipvsadm -C && ipvsadm -R <
> rules-with-ops" will finally succeed, but with CONFIG_PREEMPT enabled

There is no "./" in above ipvsadm commands, I hope
you put everything in scripts to make sure the new ipvsadm
binary is used.

> it's too hard to get --ops work. I will test again on my "good" server
> another day to prove my guessing.

My tests are on 32-bit UP, may be that is why I can
not reproduce it.

> Is there any good debug method for this? Tuning
> /proc/sys/net/ipv4/vs/debug_level didn't gave me much.

echo 20 > /proc/sys/net/ipv4/vs/debug_level should
show something but don't do it for 60K packets/sec

> I use keepalived to manage the ipvs configuration, but as vrrp
> heartbeat going on and no realserver up/down, it won't interact with
> ipvs, right? So I can temporarily modify ipvs rule via ipvsadm after
> keepalived started, and the modified rules didn't changed as time fly,
> so do the --ops setting.

Yes, just make sure ops is present after the tests,
in case some daemon removes the flag.

> > More things to check:
> >
> > - if traffic stops check if some real server is hijacking the
> > traffic from director due to ARP problem in the real server.
> > Or explain how exactly OPS stops to work, do you see other
> > traffic for the VIP coming to director during such problem?
> >
> No possibility, I configured VIP on lo of realserver.
> for IP in $VIP; do
>     ip addr add $IP/32 dev $VIP_NIC brd $IP
> done

Setting these flags on "lo" is useless but "all"
values should do the job, so ARP problem is solved.

> sysctl -q -w net.ipv4.conf.lo.arp_ignore=1
> sysctl -q -w net.ipv4.conf.lo.arp_announce=2
> sysctl -q -w net.ipv4.conf.all.arp_ignore=1
> sysctl -q -w net.ipv4.conf.all.arp_announce=2
>
> > - Build ipvsadm with 'make HAVE_NL=0' to check if Conns=0 problem
> > in --stats output is netlink related. This builds ipvsadm without
> > netlink support but use this binary only to see stats, not
> > for configuration.
> >
> > - show output from 'cat /proc/net/ip_vs_stats_percpu' to see
> > the kernel's stats and rates. Note that these stats are not
> > zeroed while stats in /proc/net/ip_vs_stats are zeroed.
>
> Always changing.

Even when OPS does not work?

> vs3 ~ # cat /proc/net/ip_vs_stats_percpu
>        Total Incoming Outgoing         Incoming         Outgoing
> CPU    Conns  Packets  Packets            Bytes            Bytes
>   0 8F11751F 70455AB5        0      10AA672610D                0
>   1 1A780554 1A780554        0        E2AB71BCA                0
>   2        0        0        0                0                0
>   3   BF0E0B   BF0E0B        0         4B7E409C                0
>   4 244BAF54 244BAF54        0       2224071265                0
>   5 2360B25C 2360B25B        0       1715A45DB3                0
>   6        0        0        0                0                0
>   7   E88FEF   E88FEF        0         6ECC3067                0
>   8 1E2477AE 1E2477AE        0       12726CDE2E                0
>   9 10BD4D97 10BD4D97        0        A35650024                0
>   A  BE81916  BE81914        0        6D9FD6CEF                0
>   B 4474D837 4474D836        0       3FCEC43B56                0
>   C        0        0        0                0                0
>   D        0        0        0                0                0
>   E        0        0        0                0                0
>   F        0        0        0                0                0
>   ~ 721BAF1B 534F94AD        0      1B61556B50B                0
>
>      Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
>        1120F    1120F        0           C1FEB1                0

So, to summarize for the both cases when OPS works and
when OPS does not work:

- you check after every rule restoring that the ops is present
  in kernel rules: cat /proc/net/ip_vs

- in both cases traffic is received on director (no ARP problem):
  tcpdump -lnnn -i $INPUT_DEVICE -c 10 $VIP

- cat /proc/net/ip_vs_stats_percpu in both cases shows that
  Conns for CPU "~" (Totals) are increasing and "Conns/s" rate
  is above 0.
Help me to understand the Conns=0 and CPS=0 values in ipvsadm, they are showing 0 in both cases, right? - where do you see that OPS is not working? In ipvsadm -ln --stats/--rate ? Or packets do not reach real servers? Do you see that rates or stats for the real servers stop in ipvsadm output? May be we can enable debug for short time when OPS is not working: # Start debug for 10ms echo 20 > /proc/sys/net/ipv4/vs/debug_level usleep 10000 # Stop debug echo 0 > /proc/sys/net/ipv4/vs/debug_level You can show me such debug. The main thing to understand is where in IPVS the traffic is lost, the debug will be helpful, it should be no more than one page per packet. I need debug for one packet, something that you see is repeated in logs. May be due to the destination trash mechanism something is not set properly after the ipvsadm -C && ipvsadm -R sequence. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 15+ messages in thread
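Note on reading the counters above: every value in
/proc/net/ip_vs_stats_percpu is hexadecimal, and the row marked "~"
holds the totals. To check whether Conns is still moving, it can help
to print that row in decimal and compare two snapshots. What follows
is only a minimal sketch; the function name ipvs_totals_dec is made up
for illustration, and the six-column layout is assumed to match the
output pasted in this thread.

```shell
# Hypothetical helper (not part of ipvsadm): reads ip_vs_stats_percpu
# text on stdin and prints the totals row ("~") in decimal.
# Assumed columns: CPU Conns InPkts OutPkts InBytes OutBytes, all hex.
ipvs_totals_dec() {
    while read -r cpu conns inpkts outpkts inbytes outbytes; do
        if [ "$cpu" = "~" ]; then
            # printf accepts 0x-prefixed constants for %d
            printf 'conns=%d inpkts=%d inbytes=%d\n' \
                "0x$conns" "0x$inpkts" "0x$inbytes"
        fi
    done
}
```

Typical use would be "ipvs_totals_dec < /proc/net/ip_vs_stats_percpu",
run twice a few seconds apart. If inpkts grows while conns stays the
same, packets are hitting an existing connection instead of being
scheduled.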
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 13:17 ` Julian Anastasov
@ 2013-08-24 14:15 ` Drunkard Zhang
2013-08-24 14:43 ` Julian Anastasov
` (2 more replies)
2013-08-26 2:07 ` Drunkard Zhang
1 sibling, 3 replies; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-24 14:15 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

> May be we can enable debug for short time when
> OPS is not working:
>
> # Start debug for 10ms
> echo 20 > /proc/sys/net/ipv4/vs/debug_level
> usleep 10000
> # Stop debug
> echo 0 > /proc/sys/net/ipv4/vs/debug_level
>
> You can show me such debug. The main thing to
> understand is where in IPVS the traffic is lost, the
> debug will be helpful, it should be no more than one
> page per packet. I need debug for one packet, something
> that you see is repeated in logs. May be due to the
> destination trash mechanism something is not set properly
> after the ipvsadm -C && ipvsadm -R sequence.

I'll provide those test result later.

So, about debug, do you mean when I find OPS is not working, then
cutoff the traffic, turn on debug with "echo 20 >
/proc/sys/net/ipv4/vs/debug_level", and then emulate the traffic with
a couple of packets, like this?

logger -d -n $VIP "hello, this is test"

Or, should I just enable debug for tiny time slice under high traffic?

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 14:15 ` Drunkard Zhang
@ 2013-08-24 14:43 ` Julian Anastasov
2013-08-25 6:07 ` Julian Anastasov
2013-08-28 10:47 ` Jesper Dangaard Brouer
2 siblings, 0 replies; 15+ messages in thread
From: Julian Anastasov @ 2013-08-24 14:43 UTC (permalink / raw)
To: Drunkard Zhang; +Cc: lvs-devel

Hello,

On Sat, 24 Aug 2013, Drunkard Zhang wrote:

> > May be we can enable debug for short time when
> > OPS is not working:
> >
> > # Start debug for 10ms
> > echo 20 > /proc/sys/net/ipv4/vs/debug_level
> > usleep 10000
> > # Stop debug
> > echo 0 > /proc/sys/net/ipv4/vs/debug_level
> >
> > You can show me such debug. The main thing to
> > understand is where in IPVS the traffic is lost, the
> > debug will be helpful, it should be no more than one
> > page per packet. I need debug for one packet, something
> > that you see is repeated in logs. May be due to the
> > destination trash mechanism something is not set properly
> > after the ipvsadm -C && ipvsadm -R sequence.
>
> I'll provide those test result later.
>
> So, about debug, do you mean when I find OPS is not working, then
> cutoff the traffic, turn on debug with "echo 20 >
> /proc/sys/net/ipv4/vs/debug_level", and then emulate the traffic with
> a couple of packets, like this?
>
> logger -d -n $VIP "hello, this is test"
>
> Or, should I just enable debug for tiny time slice under high traffic?

Whatever is your preference, it does not matter
for me. With traffic you risk to flood log in director,
for 10ms we get ~600 packets. I think, even usleep 1000
will work. And I'm not sure what will happen when many
CPUs write to log.

If you stop the traffic and send just one
message under debug it would be more safe, if it is not
a problem that your service is not available for some
seconds.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 14:15 ` Drunkard Zhang
2013-08-24 14:43 ` Julian Anastasov
@ 2013-08-25 6:07 ` Julian Anastasov
2013-08-28 10:47 ` Jesper Dangaard Brouer
2 siblings, 0 replies; 15+ messages in thread
From: Julian Anastasov @ 2013-08-25 6:07 UTC (permalink / raw)
To: Drunkard Zhang; +Cc: lvs-devel

Hello,

On Sat, 24 Aug 2013, Drunkard Zhang wrote:

> I'll provide those test result later.

In first email you said that traffic goes to
single server, is it first in list or what is its position?

Can you also check if sleep 1 solves the problem:

ipvsadm -C && sleep 1 && ipvsadm -R < rules-with-ops

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply [flat|nested] 15+ messages in thread
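Note on the question above about which single server receives the
traffic: one way to answer it is to list the real servers whose InPPS
column is non-zero in the output of "ipvsadm -ln --rate". The sketch
below is only an illustration. The helper name active_realservers is
made up, and it assumes the usual column order for real-server lines
(->, address:port, CPS, InPPS, OutPPS, InBPS, OutBPS), which should be
verified against the local ipvsadm version.

```shell
# Hypothetical helper: reads "ipvsadm -ln --rate" output on stdin and
# prints the address:port of every real server with a non-zero InPPS.
# Assumed layout: real-server lines start with "->" and InPPS is the
# fourth whitespace-separated field.
active_realservers() {
    awk '$1 == "->" && $2 != "RemoteAddress:Port" && $4 != "0" { print $2 }'
}
```

Usage would be "ipvsadm -ln --rate | active_realservers". With OPS
working and several healthy real servers, more than one address is
expected; a single address matches the stuck state described in this
thread.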
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 14:15 ` Drunkard Zhang
2013-08-24 14:43 ` Julian Anastasov
2013-08-25 6:07 ` Julian Anastasov
@ 2013-08-28 10:47 ` Jesper Dangaard Brouer
2 siblings, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2013-08-28 10:47 UTC (permalink / raw)
To: Drunkard Zhang; +Cc: Julian Anastasov, lvs-devel, brouer

On Sat, 24 Aug 2013 22:15:16 +0800
Drunkard Zhang <gongfan193@gmail.com> wrote:

> > May be we can enable debug for short time when
> > OPS is not working:
> >
> > # Start debug for 10ms
> > echo 20 > /proc/sys/net/ipv4/vs/debug_level
> > usleep 10000
> > # Stop debug
> > echo 0 > /proc/sys/net/ipv4/vs/debug_level

Just some general notes on debugging.

Debug verbose levels:
- 1 is least verbose
- 12 is most verbose
- 0 disables debugging

Kernel needs to be compiled with CONFIG_IP_VS_DEBUG=y

To get even more detailed debugging:

The macro IP_VS_DBG_PKT uses pr_debug() to print its messages. If the
kernel is configured with dynamic debug (CONFIG_DYNAMIC_DEBUG), you
will not see these messages. To enable these messages, do the
following:

mount -t debugfs none /sys/kernel/debug/
cat /sys/kernel/debug/dynamic_debug/control | grep ipvs
echo "func ip_vs_tcpudp_debug_packet_v6 +p" > /sys/kernel/debug/dynamic_debug/control
echo "func ip_vs_tcpudp_debug_packet_v4 +p" > /sys/kernel/debug/dynamic_debug/control

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply [flat|nested] 15+ messages in thread
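Note on the CONFIG_IP_VS_DEBUG requirement above: before raising
debug_level it may be worth confirming that the running kernel was
built with that option. A minimal sketch follows; the helper name
has_ipvs_debug is made up, and the config file location is an
assumption (/proc/config.gz exists only when the kernel was built with
CONFIG_IKCONFIG_PROC; otherwise a file such as /usr/src/linux/.config
may apply).

```shell
# Hypothetical helper: succeeds if the given kernel config file
# (plain text or gzip-compressed) contains CONFIG_IP_VS_DEBUG=y.
has_ipvs_debug() {
    cfg=$1
    case $cfg in
        *.gz) gzip -dc "$cfg" ;;   # e.g. /proc/config.gz
        *)    cat "$cfg" ;;        # e.g. a plain .config file
    esac | grep -q '^CONFIG_IP_VS_DEBUG=y$'
}
```

For example: has_ipvs_debug /proc/config.gz && echo "IPVS debug
available". If the check fails, writing to debug_level is unlikely to
produce the per-packet output discussed here.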
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-24 13:17 ` Julian Anastasov
2013-08-24 14:15 ` Drunkard Zhang
@ 2013-08-26 2:07 ` Drunkard Zhang
2013-08-26 3:37 ` Drunkard Zhang
1 sibling, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-26 2:07 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

[-- Attachment #1: Type: text/plain, Size: 8041 bytes --]

2013/8/24 Julian Anastasov <ja@ssi.bg>:
>
> Hello,
>
> On Sat, 24 Aug 2013, Drunkard Zhang wrote:
>
>> I'm running x86_64 kernel. I compared kernel config of my two servers,
>> a big difference between them is CONFIG_PREEMPT. While CONFIG_PREEMPT
>> is disabled, trying plenty times of "ipvsadm -C && ipvsadm -R <
>> rules-with-ops" will finally succeed, but with CONFIG_PREEMPT enabled
>
> There is no "./" in above ipvsadm commands,
> I hope you put everything in scripts to make sure
> the new ipvsadm binary is used.
>
>> it's too hard to get --ops work. I will test again on my "good" server
>> another day to prove my guessing.
>
> My tests are on 32-bit UP, may be that is why I can
> not reproduce it.
>
>> Is there any good debug method for this? Tuning
>> /proc/sys/net/ipv4/vs/debug_level didn't gave me much.
>
> echo 20 > /proc/sys/net/ipv4/vs/debug_level
>
> should show something but don't do it for
> 60K packets/sec
>
>> I use keepalived to manage the ipvs configuration, but as vrrp
>> heartbeat going on and no realserver up/down, it won't interact with
>> ipvs, right? So I can temporarily modify ipvs rule via ipvsadm after
>> keepalived started, and the modified rules didn't changed as time fly,
>> so do the --ops setting.
>
> Yes, just make sure ops is present after the tests,
> in case some daemon removes the flag.
>
>> > More things to check:
>> >
>> > - if traffic stops check if some real server is hijacking the
>> > traffic from director due to ARP problem in the real server.
>> > Or explain how exactly OPS stops to work, do you see other
>> > traffic for the VIP coming to director during such problem?
>> >
>> No possibility, I configured VIP on lo of realserver.
>> for IP in $VIP; do
>>     ip addr add $IP/32 dev $VIP_NIC brd $IP
>> done
>
> Setting these flags on "lo" is useless but
> "all" values should do the job, so ARP problem is
> solved.
>
>> sysctl -q -w net.ipv4.conf.lo.arp_ignore=1
>> sysctl -q -w net.ipv4.conf.lo.arp_announce=2
>> sysctl -q -w net.ipv4.conf.all.arp_ignore=1
>> sysctl -q -w net.ipv4.conf.all.arp_announce=2
>>
>> > - Build ipvsadm with 'make HAVE_NL=0' to check if Conns=0 problem
>> > in --stats output is netlink related. This builds ipvsadm without
>> > netlink support but use this binary only to see stats, not
>> > for configuration.
>> >
>> > - show output from 'cat /proc/net/ip_vs_stats_percpu' to see
>> > the kernel's stats and rates. Note that these stats are not
>> > zeroed while stats in /proc/net/ip_vs_stats are zeroed.
>>
>> Always changing.
>
> Even when OPS does not work?
>
>> vs3 ~ # cat /proc/net/ip_vs_stats_percpu
>>        Total Incoming Outgoing         Incoming         Outgoing
>> CPU    Conns  Packets  Packets            Bytes            Bytes
>>   0 8F11751F 70455AB5        0      10AA672610D                0
>>   1 1A780554 1A780554        0        E2AB71BCA                0
>>   2        0        0        0                0                0
>>   3   BF0E0B   BF0E0B        0         4B7E409C                0
>>   4 244BAF54 244BAF54        0       2224071265                0
>>   5 2360B25C 2360B25B        0       1715A45DB3                0
>>   6        0        0        0                0                0
>>   7   E88FEF   E88FEF        0         6ECC3067                0
>>   8 1E2477AE 1E2477AE        0       12726CDE2E                0
>>   9 10BD4D97 10BD4D97        0        A35650024                0
>>   A  BE81916  BE81914        0        6D9FD6CEF                0
>>   B 4474D837 4474D836        0       3FCEC43B56                0
>>   C        0        0        0                0                0
>>   D        0        0        0                0                0
>>   E        0        0        0                0                0
>>   F        0        0        0                0                0
>>   ~ 721BAF1B 534F94AD        0      1B61556B50B                0
>>
>>      Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
>>        1120F    1120F        0           C1FEB1                0
>
> So, to summarize for the both cases when OPS
> works and when OPS does not work:
>
> - you check after every rule restoring that the ops is
> present in kernel rules: cat /proc/net/ip_vs

Sure, ops is always there.

> - in both cases traffic is received on director (no ARP
> problem): tcpdump -lnnn -i $INPUT_DEVICE -c 10 $VIP

Also sure.

> - cat /proc/net/ip_vs_stats_percpu in both cases shows
> that Conns for CPU "~" (Totals) are increasing and "Conns/s"
> rate is above 0. Help me to understand the Conns=0 and CPS=0
> values in ipvsadm, they are showing 0 in both cases,
> right?

Badly, Conns is a fixed number and Conns/s is zero.

vs3 ~ # cat /proc/net/ip_vs_stats_percpu
       Total Incoming Outgoing         Incoming         Outgoing
CPU    Conns  Packets  Packets            Bytes            Bytes
  0       12  3C3C98F        0        2CB498261                0
  1        0   54324B        0         4BEFFBB9                0
  2        0     50C2        0           1F37E8                0
  3        0        0        0                0                0
  4        0        0        0                0                0
  5        0    1BC7A        0          1635AF3                0
  6        0   31A7BE        0         1CDC43C7                0
  7        0   2B4E76        0         1BF498BE                0
  8        0    1D418        0           B86A6E                0
  9        0   5B49E8        0         54FD74D5                0
  A        0   75147D        0         410A95C0                0
  B        0        0        0                0                0
  C        0    BD570        0          49118C5                0
  D        0        0        0                0                0
  E        0   211948        0         138B63C5                0
  F        0   626075        0         402A470B                0
  ~       12  5D7664E        0        43FC611F8                0

     Conns/s   Pkts/s   Pkts/s          Bytes/s          Bytes/s
           0     EF93        0           AE2DD9                0

> - where do you see that OPS is not working? In
> ipvsadm -ln --stats/--rate ? Or packets do not
> reach real servers? Do you see that rates or stats
> for the real servers stop in ipvsadm output?

I'm sure OPS is not working, both from ipvsadm -ln --stats/--rate and
iftop -i eth0 -f "udp port 514" on real server. There's no ingress
traffic at all when InPPS/InBPS from --rate is 0, but OPS is set.

> May be we can enable debug for short time when
> OPS is not working:
>
> # Start debug for 10ms
> echo 20 > /proc/sys/net/ipv4/vs/debug_level
> usleep 10000
> # Stop debug
> echo 0 > /proc/sys/net/ipv4/vs/debug_level
>
> You can show me such debug. The main thing to
> understand is where in IPVS the traffic is lost, the
> debug will be helpful, it should be no more than one
> page per packet. I need debug for one packet, something
> that you see is repeated in logs. May be due to the
> destination trash mechanism something is not set properly
> after the ipvsadm -C && ipvsadm -R sequence.

sleep 1 does not help with `ipvsadm -C && sleep 1 && ipvsadm -R <
rules-with-ops`.

Debug log is attached.

bad-20130826-init.gz is produced by:

./ipvsadm -C
# Clear previous log
> /var/log/kern.log
sleep 3
# Start debug
echo 20 > /proc/sys/net/ipv4/vs/debug_level
./ipvsadm -R < /etc/keepalived/rules-with-ops
usleep 10000
# Stop debug
echo 0 > /proc/sys/net/ipv4/vs/debug_level

bad-20130826-running.gz and good-20130825-running.gz is produced by:

# Start debug for 10ms
echo 20 > /proc/sys/net/ipv4/vs/debug_level
usleep 10000
# Stop debug
echo 0 > /proc/sys/net/ipv4/vs/debug_level

good-20130825-running.gz is captured when OPS working.

I noticed that after long time of running (ops is configured but not
working), like 24 hours, restore the rules again, it may works
sometimes. But with newly started kernel, it's just too hard to get
ops working.

[-- Attachment #2: bad-20130826-init.gz --]
[-- Type: application/x-gzip, Size: 23592 bytes --]

[-- Attachment #3: bad-20130826-running.gz --]
[-- Type: application/x-gzip, Size: 9375 bytes --]

[-- Attachment #4: good-20130825-running.gz --]
[-- Type: application/x-gzip, Size: 11549 bytes --]

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-26 2:07 ` Drunkard Zhang
@ 2013-08-26 3:37 ` Drunkard Zhang
2013-08-26 9:55 ` Julian Anastasov
0 siblings, 1 reply; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-26 3:37 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

Good news, I finally found the crap source, it's keepalived. I tested
several times without keepalived in runlevel 3, after kernel boots I
add the ipvs service by hand:

./ipvsadm -C
# Clear previous log
> /var/log/kern.log
sleep 1
# Start debug
echo 20 > /proc/sys/net/ipv4/vs/debug_level
./ipvsadm -R < /etc/keepalived/rules-with-ops
usleep 30000
# Stop debug
echo 0 > /proc/sys/net/ipv4/vs/debug_level

Then add VIP manually, then do ARP announce manually:
vs3 ~/pkgs # ip a add 150.164.100.120/32 dev eno1
vs3 ~/pkgs # arp-sk -i eno1 -S 150.164.100.120:90:b1:1c:1a:59:46 -d 150.164.100.126

After these actions, traffic starts come in. and all ipvsadm checks
are fine, OPS is fine too. So I figured that maybe outdated libipvs in
keepalived broke the ipvs in kernel. I'll try to report this to
upstream.

On the other hand, ipvs didn't recovery from ipvsadm -C, rmmod ip_vs
&& ./ipvsadm -R < rules-with-ops is needed (I tested, reload ip_vs
module could make OPS work). So robustness of IPVS needs improvement.

Thanks for all your help and kindness :-)

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-26 3:37 ` Drunkard Zhang
@ 2013-08-26 9:55 ` Julian Anastasov
2013-08-27 3:20 ` Drunkard Zhang
0 siblings, 1 reply; 15+ messages in thread
From: Julian Anastasov @ 2013-08-26 9:55 UTC (permalink / raw)
To: Drunkard Zhang; +Cc: lvs-devel

Hello,

On Mon, 26 Aug 2013, Drunkard Zhang wrote:

> Good news, I finally found the crap source, it's keepalived. I tested
> several times without keepalived in runlevel 3, after kernel boots I
> add the ipvs service by hand:

OK, I was worried that my recent RCU changes broke
something in the WRR scheduler and the configuration process.

> ./ipvsadm -C
> # Clear previous log
> > /var/log/kern.log
> sleep 1
> # Start debug
> echo 20 > /proc/sys/net/ipv4/vs/debug_level
> ./ipvsadm -R < /etc/keepalived/rules-with-ops
> usleep 30000
> # Stop debug
> echo 0 > /proc/sys/net/ipv4/vs/debug_level
>
> Then add VIP manually, then do ARP announce manually:
> vs3 ~/pkgs # ip a add 150.164.100.120/32 dev eno1
> vs3 ~/pkgs # arp-sk -i eno1 -S 150.164.100.120:90:b1:1c:1a:59:46 -d
> 150.164.100.126
>
> After these actions, traffic starts come in. and all ipvsadm checks
> are fine, OPS is fine too. So I figured that maybe outdated libipvs in
> keepalived broke the ipvs in kernel. I'll try to report this to
> upstream.

OK, I have no more doubts. To summarize,
here is what I think happened:

- packet is scheduled while there is virtual service without
the --ops flag. The result is that an UDP connection is
created that expires after 5mins by default, if there are
no more packets.

- traffic is not stopped, it hits the connection and
restarts its timer. As result, this connection stays
forever and forwards traffic to single server.

- as single connection is used we see that the stats for
Conns and CPS rate do not move because we do not create
connections anymore, all traffic comes from single client
address and the scheduler is not called.

- there is one variation here: ipvsadm -C is called,
dests are moved to the trash list, new rules are
added but before the RCU grace period is expired.
In such case IP_VS_DEST_STATE_REMOVING is still set and
prevents the same dest to be reused when adding the
same dest parameters. In this case the connection will point
to unavailable dest for 5mins and the traffic that hits it
will not restart its timer. After 5mins the connection
will be removed and the first packet that comes
will use the --ops flag. There is a chance everything
to work. So, if new rules are added we have 2
situations:

1. rules reuse old dests and traffic goes to single server.
This happens if the new rules are added after at least
10ms (the RCU grace period, in fact), eg. with
usleep 10000 after ipvsadm -C. We have CPS=0 and
InPPS above 0 for single server.

2. rules allocate new dest and traffic is stopped
for 5mins. This will happen if rules are added
immediately after ipvsadm -C (while in RCU grace period).
After 5mins everything works.

- CPS 0 means we are reusing existing connection

- even if you replace the service or set --ops, the
existing connection is still used, even ipvsadm -C
can not remove it. There is only one chance: to set
expire_nodest_conn=1, to call ipvsadm -C and to wait
next packet to remove the connection. Then to add
all rules again but not before the connection is removed.

> On the other hand, ipvs didn't recovery from ipvsadm -C, rmmod ip_vs
> && ./ipvsadm -R < rules-with-ops is needed (I tested, reload ip_vs
> module could make OPS work). So robustness of IPVS needs improvement.

Some problem? May be you refer to the fact that
connections survive ipvsadm -C and that is what prevented
your traffic to be scheduled.

So, I see two problems here:

- tools do not set --ops, connection is created and is
reused from all packets from same client. The trick
to add --ops later can not work. Idea: drop traffic
before reaching IPVS (-j DROP) until --ops is applied,
by this way no connections should be created.

- no way to flush connections in IPVS without removing the
module because expire_nodest_conn works only when traffic is
received. I think, your above remark points here.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply [flat|nested] 15+ messages in thread
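Note on the recovery sequence described above (set
expire_nodest_conn=1, clear the rules, wait until the stale connection
is gone, then restore the rules): it can be written down as a short
command plan. The sketch below only prints the commands so they can be
reviewed first. The helper name ops_reset_plan is made up, and the
grep pattern assumes that "ipvsadm -L -c -n" prints the virtual
address as a space-delimited VIP:PORT field, which should be checked
locally.

```shell
# Hypothetical helper: prints (does not execute) the reset sequence.
#   $1: virtual service as VIP:PORT, as it appears in "ipvsadm -L -c -n"
#   $2: rules file that already carries the --ops flag
ops_reset_plan() {
    vip_port=$1
    rules=$2
    cat <<EOF
sysctl -q -w net.ipv4.vs.expire_nodest_conn=1
ipvsadm -C
while ipvsadm -L -c -n | grep -qF ' $vip_port '; do sleep 1; done
ipvsadm -R < $rules
EOF
}
```

The output can be inspected and then piped to a root shell, for
example "ops_reset_plan "$VIP:514" rules-with-ops | sh -x". As the
message above explains, the stale connection is removed only when the
next packet hits it, so the wait loop needs incoming traffic to
finish.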
* Re: ipvsadm: One-packet scheduling with UDP service is unstable
2013-08-26 9:55 ` Julian Anastasov
@ 2013-08-27 3:20 ` Drunkard Zhang
0 siblings, 0 replies; 15+ messages in thread
From: Drunkard Zhang @ 2013-08-27 3:20 UTC (permalink / raw)
To: Julian Anastasov; +Cc: lvs-devel

2013/8/26 Julian Anastasov <ja@ssi.bg>:
> On Mon, 26 Aug 2013, Drunkard Zhang wrote:
>
>> Good news, I finally found the crap source, it's keepalived. I tested
>> several times without keepalived in runlevel 3, after kernel boots I
>> add the ipvs service by hand:
>
> OK, I was worried that my recent RCU changes broke
> something in the WRR scheduler and the configuration process.
>
>> ./ipvsadm -C
>> # Clear previous log
>> > /var/log/kern.log
>> sleep 1
>> # Start debug
>> echo 20 > /proc/sys/net/ipv4/vs/debug_level
>> ./ipvsadm -R < /etc/keepalived/rules-with-ops
>> usleep 30000
>> # Stop debug
>> echo 0 > /proc/sys/net/ipv4/vs/debug_level
>>
>> Then add VIP manually, then do ARP announce manually:
>> vs3 ~/pkgs # ip a add 150.164.100.120/32 dev eno1
>> vs3 ~/pkgs # arp-sk -i eno1 -S 150.164.100.120:90:b1:1c:1a:59:46 -d
>> 150.164.100.126
>>
>> After these actions, traffic starts come in. and all ipvsadm checks
>> are fine, OPS is fine too. So I figured that maybe outdated libipvs in
>> keepalived broke the ipvs in kernel. I'll try to report this to
>> upstream.
>
> OK, I have no more doubts. To summarize,
> here is what I think happened:
>
> - packet is scheduled while there is virtual service without
> the --ops flag. The result is that an UDP connection is
> created that expires after 5mins by default, if there are
> no more packets.
>
> - traffic is not stopped, it hits the connection and
> restarts its timer. As result, this connection stays
> forever and forwards traffic to single server.

This explains why expire time from "ipvsadm -lcn" keeps at 5.00min.

> - as single connection is used we see that the stats for
> Conns and CPS rate do not move because we do not create
> connections anymore, all traffic comes from single client
> address and the scheduler is not called.
>
> - there is one variation here: ipvsadm -C is called,
> dests are moved to the trash list, new rules are
> added but before the RCU grace period is expired.
> In such case IP_VS_DEST_STATE_REMOVING is still set and
> prevents the same dest to be reused when adding the
> same dest parameters. In this case the connection will point
> to unavailable dest for 5mins and the traffic that hits it
> will not restart its timer. After 5mins the connection
> will be removed and the first packet that comes
> will use the --ops flag. There is a chance everything
> to work. So, if new rules are added we have 2
> situations:
>
> 1. rules reuse old dests and traffic goes to single server.
> This happens if the new rules are added after at least
> 10ms (the RCU grace period, in fact), eg. with
> usleep 10000 after ipvsadm -C. We have CPS=0 and
> InPPS above 0 for single server.
>
> 2. rules allocate new dest and traffic is stopped
> for 5mins. This will happen if rules are added
> immediately after ipvsadm -C (while in RCU grace period).
> After 5mins everything works.
>
> - CPS 0 means we are reusing existing connection
>
> - even if you replace the service or set --ops, the
> existing connection is still used, even ipvsadm -C
> can not remove it. There is only one chance: to set
> expire_nodest_conn=1, to call ipvsadm -C and to wait
> next packet to remove the connection. Then to add
> all rules again but not before the connection is removed.
>
>> On the other hand, ipvs didn't recovery from ipvsadm -C, rmmod ip_vs
>> && ./ipvsadm -R < rules-with-ops is needed (I tested, reload ip_vs
>> module could make OPS work). So robustness of IPVS needs improvement.
>
> Some problem? May be you refer to the fact that
> connections survive ipvsadm -C and that is what prevented
> your traffic to be scheduled.
>
> So, I see two problems here:
>
> - tools do not set --ops, connection is created and is
> reused from all packets from same client. The trick
> to add --ops later can not work. Idea: drop traffic
> before reaching IPVS (-j DROP) until --ops is applied,
> by this way no connections should be created.
>
> - no way to flush connections in IPVS without removing the
> module because expire_nodest_conn works only when traffic is
> received. I think, your above remark points here.

Again, thanks for your explanation, now I understand all these "weird"
things, it's all because of not supporting --ops by keepalived.

^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2013-08-28 10:47 UTC | newest]
Thread overview: 15+ messages
2013-08-20 15:06 ipvsadm: One-packet scheduling with UDP service is unstable Drunkard Zhang
2013-08-22 6:43 ` Julian Anastasov
2013-08-22 10:58 ` Drunkard Zhang
2013-08-22 14:14 ` Julian Anastasov
2013-08-22 23:24 ` Drunkard Zhang
[not found] ` <alpine.LFD.2.00.1308231708450.1852@ja.ssi.bg>
2013-08-24 11:14 ` Drunkard Zhang
2013-08-24 13:17 ` Julian Anastasov
2013-08-24 14:15 ` Drunkard Zhang
2013-08-24 14:43 ` Julian Anastasov
2013-08-25 6:07 ` Julian Anastasov
2013-08-28 10:47 ` Jesper Dangaard Brouer
2013-08-26 2:07 ` Drunkard Zhang
2013-08-26 3:37 ` Drunkard Zhang
2013-08-26 9:55 ` Julian Anastasov
2013-08-27 3:20 ` Drunkard Zhang