* Re: [PATCH v2 17/17] net: WireGuard secure network tunnel
From: kbuild test robot @ 2018-08-27 12:52 UTC
  To: Jason A. Donenfeld
  Cc: kbuild-all, linux-kernel, netdev, davem, Jason A. Donenfeld,
	Greg KH
In-Reply-To: <20180824213849.23647-18-Jason@zx2c4.com>

Hi Jason,

I love your patch! Yet there is something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.19-rc1 next-20180827]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/WireGuard-Secure-Network-Tunnel/20180827-073051
config: um-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=um 

All errors/warnings (new ones prefixed by >>):

   In file included from lib/zinc/chacha20/chacha20-x86_64-glue.h:8:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from <command-line>:0:0:
   At top level:
   lib/zinc/chacha20/chacha20-x86_64-glue.h:27:13: warning: 'chacha20_use_avx512vl' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx512vl __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~~~
   lib/zinc/chacha20/chacha20-x86_64-glue.h:26:13: warning: 'chacha20_use_avx512' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx512 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~
   lib/zinc/chacha20/chacha20-x86_64-glue.h:25:13: warning: 'chacha20_use_avx2' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx2 __ro_after_init;
                ^~~~~~~~~~~~~~~~~
   lib/zinc/chacha20/chacha20-x86_64-glue.h:24:13: warning: 'chacha20_use_ssse3' defined but not used [-Wunused-variable]
    static bool chacha20_use_ssse3 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~
--
   In file included from lib/zinc/poly1305/poly1305-x86_64-glue.h:8:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from <command-line>:0:0:
   At top level:
   lib/zinc/poly1305/poly1305-x86_64-glue.h:28:13: warning: 'poly1305_use_avx512' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx512 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~
   lib/zinc/poly1305/poly1305-x86_64-glue.h:27:13: warning: 'poly1305_use_avx2' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx2 __ro_after_init;
                ^~~~~~~~~~~~~~~~~
   lib/zinc/poly1305/poly1305-x86_64-glue.h:26:13: warning: 'poly1305_use_avx' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx __ro_after_init;
                ^~~~~~~~~~~~~~~~
--
   In file included from lib/zinc/curve25519/curve25519-x86_64-glue.h:7:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:49:41: error: 'NBUGINTS' undeclared here (not in a function)
    extern const char * const x86_bug_flags[NBUGINTS*32];
                                            ^~~~~~~~
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:196:24: error: 'X86_FEATURE_ALWAYS' undeclared (first use in this function)
           [always]   "i" (X86_FEATURE_ALWAYS),
                           ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   arch/x86/include/asm/cpufeature.h:196:24: note: each undeclared identifier is reported only once for each function it appears in
           [always]   "i" (X86_FEATURE_ALWAYS),
                           ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from lib/zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib/zinc/curve25519/curve25519-x86_64.h: In function 'inv_eltfp25519_1w_adx':
>> lib/zinc/curve25519/curve25519-x86_64.h:1543:2: error: implicit declaration of function 'memzero_explicit' [-Werror=implicit-function-declaration]
     memzero_explicit(&m, sizeof(m));
     ^~~~~~~~~~~~~~~~
   In file included from lib/zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib/zinc/curve25519/curve25519-x86_64.h: In function 'curve25519_adx':
>> lib/zinc/curve25519/curve25519-x86_64.h:1706:2: error: implicit declaration of function 'memcpy'; did you mean 'pte_copy'? [-Werror=implicit-function-declaration]
     memcpy(m.private, private_key, sizeof(m.private));
     ^~~~~~
     pte_copy
   In file included from <command-line>:0:0:
   lib/zinc/curve25519/curve25519-x86_64-glue.h: At top level:
>> lib/zinc/curve25519/curve25519-x86_64-glue.h:12:33: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__ro_after_init'
    static bool curve25519_use_bmi2 __ro_after_init;
                                    ^~~~~~~~~~~~~~~
   lib/zinc/curve25519/curve25519-x86_64-glue.h:13:32: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__ro_after_init'
    static bool curve25519_use_adx __ro_after_init;
                                   ^~~~~~~~~~~~~~~
>> lib/zinc/curve25519/curve25519-x86_64-glue.h:15:13: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'curve25519_fpu_init'
    void __init curve25519_fpu_init(void)
                ^~~~~~~~~~~~~~~~~~~
   In file included from <command-line>:0:0:
   lib/zinc/curve25519/curve25519-x86_64-glue.h: In function 'curve25519_arch':
>> lib/zinc/curve25519/curve25519-x86_64-glue.h:23:6: error: 'curve25519_use_adx' undeclared (first use in this function); did you mean 'curve25519_adx'?
     if (curve25519_use_adx) {
         ^~~~~~~~~~~~~~~~~~
         curve25519_adx
>> lib/zinc/curve25519/curve25519-x86_64-glue.h:26:13: error: 'curve25519_use_bmi2' undeclared (first use in this function); did you mean 'curve25519_use_adx'?
     } else if (curve25519_use_bmi2) {
                ^~~~~~~~~~~~~~~~~~~
                curve25519_use_adx
   lib/zinc/curve25519/curve25519-x86_64-glue.h: In function 'curve25519_base_arch':
   lib/zinc/curve25519/curve25519-x86_64-glue.h:35:6: error: 'curve25519_use_adx' undeclared (first use in this function); did you mean 'curve25519_adx'?
     if (curve25519_use_adx) {
         ^~~~~~~~~~~~~~~~~~
         curve25519_adx
   lib/zinc/curve25519/curve25519-x86_64-glue.h:38:13: error: 'curve25519_use_bmi2' undeclared (first use in this function); did you mean 'curve25519_use_adx'?
     } else if (curve25519_use_bmi2) {
                ^~~~~~~~~~~~~~~~~~~
                curve25519_use_adx
   In file included from arch/x86/include/asm/string.h:5:0,
                    from include/linux/string.h:20,
                    from lib/zinc/curve25519/curve25519.c:9:
   arch/x86/include/asm/string_64.h: At top level:
>> arch/x86/include/asm/string_64.h:32:14: error: conflicting types for 'memcpy'
    extern void *memcpy(void *to, const void *from, size_t len);
                 ^~~~~~
   In file included from lib/zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib/zinc/curve25519/curve25519-x86_64.h:1706:2: note: previous implicit declaration of 'memcpy' was here
     memcpy(m.private, private_key, sizeof(m.private));
     ^~~~~~
   In file included from lib/zinc/curve25519/curve25519.c:9:0:
>> include/linux/string.h:216:6: warning: conflicting types for 'memzero_explicit'
    void memzero_explicit(void *s, size_t count);
         ^~~~~~~~~~~~~~~~
   In file included from lib/zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib/zinc/curve25519/curve25519-x86_64.h:1543:2: note: previous implicit declaration of 'memzero_explicit' was here
     memzero_explicit(&m, sizeof(m));
     ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
--
   In file included from lib/zinc/blake2s/blake2s-x86_64-glue.h:7:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from <command-line>:0:0:
   At top level:
   lib/zinc/blake2s/blake2s-x86_64-glue.h:20:13: warning: 'blake2s_use_avx512' defined but not used [-Wunused-variable]
    static bool blake2s_use_avx512 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~
   lib/zinc/blake2s/blake2s-x86_64-glue.h:19:13: warning: 'blake2s_use_avx' defined but not used [-Wunused-variable]
    static bool blake2s_use_avx __ro_after_init;
                ^~~~~~~~~~~~~~~
--
   In file included from lib//zinc/chacha20/chacha20-x86_64-glue.h:8:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from <command-line>:0:0:
   At top level:
   lib//zinc/chacha20/chacha20-x86_64-glue.h:27:13: warning: 'chacha20_use_avx512vl' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx512vl __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~~~
   lib//zinc/chacha20/chacha20-x86_64-glue.h:26:13: warning: 'chacha20_use_avx512' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx512 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~
   lib//zinc/chacha20/chacha20-x86_64-glue.h:25:13: warning: 'chacha20_use_avx2' defined but not used [-Wunused-variable]
    static bool chacha20_use_avx2 __ro_after_init;
                ^~~~~~~~~~~~~~~~~
   lib//zinc/chacha20/chacha20-x86_64-glue.h:24:13: warning: 'chacha20_use_ssse3' defined but not used [-Wunused-variable]
    static bool chacha20_use_ssse3 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~
--
   In file included from lib//zinc/poly1305/poly1305-x86_64-glue.h:8:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from <command-line>:0:0:
   At top level:
   lib//zinc/poly1305/poly1305-x86_64-glue.h:28:13: warning: 'poly1305_use_avx512' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx512 __ro_after_init;
                ^~~~~~~~~~~~~~~~~~~
   lib//zinc/poly1305/poly1305-x86_64-glue.h:27:13: warning: 'poly1305_use_avx2' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx2 __ro_after_init;
                ^~~~~~~~~~~~~~~~~
   lib//zinc/poly1305/poly1305-x86_64-glue.h:26:13: warning: 'poly1305_use_avx' defined but not used [-Wunused-variable]
    static bool poly1305_use_avx __ro_after_init;
                ^~~~~~~~~~~~~~~~
--
   In file included from lib//zinc/curve25519/curve25519-x86_64-glue.h:7:0,
                    from <command-line>:0:
>> arch/x86/include/asm/cpufeature.h:49:41: error: 'NBUGINTS' undeclared here (not in a function)
    extern const char * const x86_bug_flags[NBUGINTS*32];
                                            ^~~~~~~~
>> arch/x86/include/asm/cpufeature.h:134:34: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration
    extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
                                     ^~~~~~~~~~~
   In file included from include/linux/compiler_types.h:64:0,
                    from <command-line>:0:
   arch/x86/include/asm/cpufeature.h: In function '_static_cpu_has':
>> arch/x86/include/asm/cpufeature.h:196:24: error: 'X86_FEATURE_ALWAYS' undeclared (first use in this function)
           [always]   "i" (X86_FEATURE_ALWAYS),
                           ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   arch/x86/include/asm/cpufeature.h:196:24: note: each undeclared identifier is reported only once for each function it appears in
           [always]   "i" (X86_FEATURE_ALWAYS),
                           ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
>> arch/x86/include/asm/cpufeature.h:198:52: error: 'struct cpuinfo_um' has no member named 'x86_capability'
           [cap_byte] "m" (((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                                       ^
   include/linux/compiler-gcc.h:182:47: note: in definition of macro 'asm_volatile_goto'
    #define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
                                                  ^
   In file included from lib//zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib//zinc/curve25519/curve25519-x86_64.h: In function 'inv_eltfp25519_1w_adx':
   lib//zinc/curve25519/curve25519-x86_64.h:1543:2: error: implicit declaration of function 'memzero_explicit' [-Werror=implicit-function-declaration]
     memzero_explicit(&m, sizeof(m));
     ^~~~~~~~~~~~~~~~
   In file included from lib//zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib//zinc/curve25519/curve25519-x86_64.h: In function 'curve25519_adx':
   lib//zinc/curve25519/curve25519-x86_64.h:1706:2: error: implicit declaration of function 'memcpy'; did you mean 'pte_copy'? [-Werror=implicit-function-declaration]
     memcpy(m.private, private_key, sizeof(m.private));
     ^~~~~~
     pte_copy
   In file included from <command-line>:0:0:
   lib//zinc/curve25519/curve25519-x86_64-glue.h: At top level:
   lib//zinc/curve25519/curve25519-x86_64-glue.h:12:33: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__ro_after_init'
    static bool curve25519_use_bmi2 __ro_after_init;
                                    ^~~~~~~~~~~~~~~
   lib//zinc/curve25519/curve25519-x86_64-glue.h:13:32: error: expected '=', ',', ';', 'asm' or '__attribute__' before '__ro_after_init'
    static bool curve25519_use_adx __ro_after_init;
                                   ^~~~~~~~~~~~~~~
   lib//zinc/curve25519/curve25519-x86_64-glue.h:15:13: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'curve25519_fpu_init'
    void __init curve25519_fpu_init(void)
                ^~~~~~~~~~~~~~~~~~~
   In file included from <command-line>:0:0:
   lib//zinc/curve25519/curve25519-x86_64-glue.h: In function 'curve25519_arch':
   lib//zinc/curve25519/curve25519-x86_64-glue.h:23:6: error: 'curve25519_use_adx' undeclared (first use in this function); did you mean 'curve25519_adx'?
     if (curve25519_use_adx) {
         ^~~~~~~~~~~~~~~~~~
         curve25519_adx
   lib//zinc/curve25519/curve25519-x86_64-glue.h:26:13: error: 'curve25519_use_bmi2' undeclared (first use in this function); did you mean 'curve25519_use_adx'?
     } else if (curve25519_use_bmi2) {
                ^~~~~~~~~~~~~~~~~~~
                curve25519_use_adx
   lib//zinc/curve25519/curve25519-x86_64-glue.h: In function 'curve25519_base_arch':
   lib//zinc/curve25519/curve25519-x86_64-glue.h:35:6: error: 'curve25519_use_adx' undeclared (first use in this function); did you mean 'curve25519_adx'?
     if (curve25519_use_adx) {
         ^~~~~~~~~~~~~~~~~~
         curve25519_adx
   lib//zinc/curve25519/curve25519-x86_64-glue.h:38:13: error: 'curve25519_use_bmi2' undeclared (first use in this function); did you mean 'curve25519_use_adx'?
     } else if (curve25519_use_bmi2) {
                ^~~~~~~~~~~~~~~~~~~
                curve25519_use_adx
   In file included from arch/x86/include/asm/string.h:5:0,
                    from include/linux/string.h:20,
                    from lib//zinc/curve25519/curve25519.c:9:
   arch/x86/include/asm/string_64.h: At top level:
>> arch/x86/include/asm/string_64.h:32:14: error: conflicting types for 'memcpy'
    extern void *memcpy(void *to, const void *from, size_t len);
                 ^~~~~~
   In file included from lib//zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib//zinc/curve25519/curve25519-x86_64.h:1706:2: note: previous implicit declaration of 'memcpy' was here
     memcpy(m.private, private_key, sizeof(m.private));
     ^~~~~~
   In file included from lib//zinc/curve25519/curve25519.c:9:0:
>> include/linux/string.h:216:6: warning: conflicting types for 'memzero_explicit'
    void memzero_explicit(void *s, size_t count);
         ^~~~~~~~~~~~~~~~
   In file included from lib//zinc/curve25519/curve25519-x86_64-glue.h:10:0,
                    from <command-line>:0:
   lib//zinc/curve25519/curve25519-x86_64.h:1543:2: note: previous implicit declaration of 'memzero_explicit' was here
     memzero_explicit(&m, sizeof(m));
     ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
..

vim +/memzero_explicit +1543 lib/zinc/curve25519/curve25519-x86_64.h

468c57c7 Jason A. Donenfeld 2018-08-24  1498  
468c57c7 Jason A. Donenfeld 2018-08-24  1499  static void inv_eltfp25519_1w_adx(u64 *const c, const u64 *const a)
468c57c7 Jason A. Donenfeld 2018-08-24  1500  {
468c57c7 Jason A. Donenfeld 2018-08-24  1501  	struct {
468c57c7 Jason A. Donenfeld 2018-08-24  1502  		eltfp25519_1w_buffer buffer;
468c57c7 Jason A. Donenfeld 2018-08-24  1503  		eltfp25519_1w x0, x1, x2;
468c57c7 Jason A. Donenfeld 2018-08-24  1504  	} __aligned(32) m;
468c57c7 Jason A. Donenfeld 2018-08-24  1505  	u64 *T[4];
468c57c7 Jason A. Donenfeld 2018-08-24  1506  
468c57c7 Jason A. Donenfeld 2018-08-24  1507  	T[0] = m.x0;
468c57c7 Jason A. Donenfeld 2018-08-24  1508  	T[1] = c; /* x^(-1) */
468c57c7 Jason A. Donenfeld 2018-08-24  1509  	T[2] = m.x1;
468c57c7 Jason A. Donenfeld 2018-08-24  1510  	T[3] = m.x2;
468c57c7 Jason A. Donenfeld 2018-08-24  1511  
468c57c7 Jason A. Donenfeld 2018-08-24  1512  	copy_eltfp25519_1w(T[1], a);
468c57c7 Jason A. Donenfeld 2018-08-24  1513  	sqrn_eltfp25519_1w_adx(T[1], 1);
468c57c7 Jason A. Donenfeld 2018-08-24  1514  	copy_eltfp25519_1w(T[2], T[1]);
468c57c7 Jason A. Donenfeld 2018-08-24  1515  	sqrn_eltfp25519_1w_adx(T[2], 2);
468c57c7 Jason A. Donenfeld 2018-08-24  1516  	mul_eltfp25519_1w_adx(T[0], a, T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1517  	mul_eltfp25519_1w_adx(T[1], T[1], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1518  	copy_eltfp25519_1w(T[2], T[1]);
468c57c7 Jason A. Donenfeld 2018-08-24  1519  	sqrn_eltfp25519_1w_adx(T[2], 1);
468c57c7 Jason A. Donenfeld 2018-08-24  1520  	mul_eltfp25519_1w_adx(T[0], T[0], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1521  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1522  	sqrn_eltfp25519_1w_adx(T[2], 5);
468c57c7 Jason A. Donenfeld 2018-08-24  1523  	mul_eltfp25519_1w_adx(T[0], T[0], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1524  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1525  	sqrn_eltfp25519_1w_adx(T[2], 10);
468c57c7 Jason A. Donenfeld 2018-08-24  1526  	mul_eltfp25519_1w_adx(T[2], T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1527  	copy_eltfp25519_1w(T[3], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1528  	sqrn_eltfp25519_1w_adx(T[3], 20);
468c57c7 Jason A. Donenfeld 2018-08-24  1529  	mul_eltfp25519_1w_adx(T[3], T[3], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1530  	sqrn_eltfp25519_1w_adx(T[3], 10);
468c57c7 Jason A. Donenfeld 2018-08-24  1531  	mul_eltfp25519_1w_adx(T[3], T[3], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1532  	copy_eltfp25519_1w(T[0], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1533  	sqrn_eltfp25519_1w_adx(T[0], 50);
468c57c7 Jason A. Donenfeld 2018-08-24  1534  	mul_eltfp25519_1w_adx(T[0], T[0], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1535  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1536  	sqrn_eltfp25519_1w_adx(T[2], 100);
468c57c7 Jason A. Donenfeld 2018-08-24  1537  	mul_eltfp25519_1w_adx(T[2], T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1538  	sqrn_eltfp25519_1w_adx(T[2], 50);
468c57c7 Jason A. Donenfeld 2018-08-24  1539  	mul_eltfp25519_1w_adx(T[2], T[2], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1540  	sqrn_eltfp25519_1w_adx(T[2], 5);
468c57c7 Jason A. Donenfeld 2018-08-24  1541  	mul_eltfp25519_1w_adx(T[1], T[1], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1542  
468c57c7 Jason A. Donenfeld 2018-08-24 @1543  	memzero_explicit(&m, sizeof(m));
468c57c7 Jason A. Donenfeld 2018-08-24  1544  }
468c57c7 Jason A. Donenfeld 2018-08-24  1545  
468c57c7 Jason A. Donenfeld 2018-08-24  1546  static void inv_eltfp25519_1w_bmi2(u64 *const c, const u64 *const a)
468c57c7 Jason A. Donenfeld 2018-08-24  1547  {
468c57c7 Jason A. Donenfeld 2018-08-24  1548  	struct {
468c57c7 Jason A. Donenfeld 2018-08-24  1549  		eltfp25519_1w_buffer buffer;
468c57c7 Jason A. Donenfeld 2018-08-24  1550  		eltfp25519_1w x0, x1, x2;
468c57c7 Jason A. Donenfeld 2018-08-24  1551  	} __aligned(32) m;
468c57c7 Jason A. Donenfeld 2018-08-24  1552  	u64 *T[5];
468c57c7 Jason A. Donenfeld 2018-08-24  1553  
468c57c7 Jason A. Donenfeld 2018-08-24  1554  	T[0] = m.x0;
468c57c7 Jason A. Donenfeld 2018-08-24  1555  	T[1] = c; /* x^(-1) */
468c57c7 Jason A. Donenfeld 2018-08-24  1556  	T[2] = m.x1;
468c57c7 Jason A. Donenfeld 2018-08-24  1557  	T[3] = m.x2;
468c57c7 Jason A. Donenfeld 2018-08-24  1558  
468c57c7 Jason A. Donenfeld 2018-08-24  1559  	copy_eltfp25519_1w(T[1], a);
468c57c7 Jason A. Donenfeld 2018-08-24  1560  	sqrn_eltfp25519_1w_bmi2(T[1], 1);
468c57c7 Jason A. Donenfeld 2018-08-24  1561  	copy_eltfp25519_1w(T[2], T[1]);
468c57c7 Jason A. Donenfeld 2018-08-24  1562  	sqrn_eltfp25519_1w_bmi2(T[2], 2);
468c57c7 Jason A. Donenfeld 2018-08-24  1563  	mul_eltfp25519_1w_bmi2(T[0], a, T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1564  	mul_eltfp25519_1w_bmi2(T[1], T[1], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1565  	copy_eltfp25519_1w(T[2], T[1]);
468c57c7 Jason A. Donenfeld 2018-08-24  1566  	sqrn_eltfp25519_1w_bmi2(T[2], 1);
468c57c7 Jason A. Donenfeld 2018-08-24  1567  	mul_eltfp25519_1w_bmi2(T[0], T[0], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1568  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1569  	sqrn_eltfp25519_1w_bmi2(T[2], 5);
468c57c7 Jason A. Donenfeld 2018-08-24  1570  	mul_eltfp25519_1w_bmi2(T[0], T[0], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1571  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1572  	sqrn_eltfp25519_1w_bmi2(T[2], 10);
468c57c7 Jason A. Donenfeld 2018-08-24  1573  	mul_eltfp25519_1w_bmi2(T[2], T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1574  	copy_eltfp25519_1w(T[3], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1575  	sqrn_eltfp25519_1w_bmi2(T[3], 20);
468c57c7 Jason A. Donenfeld 2018-08-24  1576  	mul_eltfp25519_1w_bmi2(T[3], T[3], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1577  	sqrn_eltfp25519_1w_bmi2(T[3], 10);
468c57c7 Jason A. Donenfeld 2018-08-24  1578  	mul_eltfp25519_1w_bmi2(T[3], T[3], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1579  	copy_eltfp25519_1w(T[0], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1580  	sqrn_eltfp25519_1w_bmi2(T[0], 50);
468c57c7 Jason A. Donenfeld 2018-08-24  1581  	mul_eltfp25519_1w_bmi2(T[0], T[0], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1582  	copy_eltfp25519_1w(T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1583  	sqrn_eltfp25519_1w_bmi2(T[2], 100);
468c57c7 Jason A. Donenfeld 2018-08-24  1584  	mul_eltfp25519_1w_bmi2(T[2], T[2], T[0]);
468c57c7 Jason A. Donenfeld 2018-08-24  1585  	sqrn_eltfp25519_1w_bmi2(T[2], 50);
468c57c7 Jason A. Donenfeld 2018-08-24  1586  	mul_eltfp25519_1w_bmi2(T[2], T[2], T[3]);
468c57c7 Jason A. Donenfeld 2018-08-24  1587  	sqrn_eltfp25519_1w_bmi2(T[2], 5);
468c57c7 Jason A. Donenfeld 2018-08-24  1588  	mul_eltfp25519_1w_bmi2(T[1], T[1], T[2]);
468c57c7 Jason A. Donenfeld 2018-08-24  1589  
468c57c7 Jason A. Donenfeld 2018-08-24  1590  	memzero_explicit(&m, sizeof(m));
468c57c7 Jason A. Donenfeld 2018-08-24  1591  }
468c57c7 Jason A. Donenfeld 2018-08-24  1592  
468c57c7 Jason A. Donenfeld 2018-08-24  1593  /* Given c, a 256-bit number, fred_eltfp25519_1w updates c
468c57c7 Jason A. Donenfeld 2018-08-24  1594   * with a number such that 0 <= C < 2**255-19.
468c57c7 Jason A. Donenfeld 2018-08-24  1595   */
468c57c7 Jason A. Donenfeld 2018-08-24  1596  static __always_inline void fred_eltfp25519_1w(u64 *const c)
468c57c7 Jason A. Donenfeld 2018-08-24  1597  {
468c57c7 Jason A. Donenfeld 2018-08-24  1598  	u64 tmp0 = 38, tmp1 = 19;
468c57c7 Jason A. Donenfeld 2018-08-24  1599  	asm volatile(
468c57c7 Jason A. Donenfeld 2018-08-24  1600  		"btrq   $63,    %3 ;" /* Put bit 255 in carry flag and clear */
468c57c7 Jason A. Donenfeld 2018-08-24  1601  		"cmovncl %k5,   %k4 ;" /* c[255] ? 38 : 19 */
468c57c7 Jason A. Donenfeld 2018-08-24  1602  
468c57c7 Jason A. Donenfeld 2018-08-24  1603  		/* Add either 19 or 38 to c */
468c57c7 Jason A. Donenfeld 2018-08-24  1604  		"addq    %4,   %0 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1605  		"adcq    $0,   %1 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1606  		"adcq    $0,   %2 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1607  		"adcq    $0,   %3 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1608  
468c57c7 Jason A. Donenfeld 2018-08-24  1609  		/* Test for bit 255 again; only triggered on overflow modulo 2^255-19 */
468c57c7 Jason A. Donenfeld 2018-08-24  1610  		"movl    $0,  %k4 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1611  		"cmovnsl %k5,  %k4 ;" /* c[255] ? 0 : 19 */
468c57c7 Jason A. Donenfeld 2018-08-24  1612  		"btrq   $63,   %3 ;" /* Clear bit 255 */
468c57c7 Jason A. Donenfeld 2018-08-24  1613  
468c57c7 Jason A. Donenfeld 2018-08-24  1614  		/* Subtract 19 if necessary */
468c57c7 Jason A. Donenfeld 2018-08-24  1615  		"subq    %4,   %0 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1616  		"sbbq    $0,   %1 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1617  		"sbbq    $0,   %2 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1618  		"sbbq    $0,   %3 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1619  
468c57c7 Jason A. Donenfeld 2018-08-24  1620  		: "+r"(c[0]), "+r"(c[1]), "+r"(c[2]), "+r"(c[3]), "+r"(tmp0), "+r"(tmp1)
468c57c7 Jason A. Donenfeld 2018-08-24  1621  		:
468c57c7 Jason A. Donenfeld 2018-08-24  1622  		: "memory", "cc");
468c57c7 Jason A. Donenfeld 2018-08-24  1623  }
468c57c7 Jason A. Donenfeld 2018-08-24  1624  
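
For readers following the inline assembly in fred_eltfp25519_1w above, here is a rough portable C sketch of the same reduction steps: clear bit 255 and add 19 or 38, then test bit 255 again and subtract 19 only if no overflow occurred. The name fred_portable and the limb layout (four little-endian 64-bit words) are assumptions made for illustration; this is not part of the patch, and a compiler is not guaranteed to emit branch-free code for it.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: reduce the 256-bit value in
 * c[0..3] (least significant limb first) into the range [0, 2^255 - 19). */
static void fred_portable(uint64_t c[4])
{
	const uint64_t low63 = 0x7fffffffffffffffULL;
	uint64_t top, delta, carry, borrow, next;
	int i;

	/* Take bit 255 out of the value; it is worth 19 modulo 2^255 - 19. */
	top = c[3] >> 63;
	c[3] &= low63;
	delta = 19 + 19 * top;		/* 38 if bit 255 was set, else 19 */

	/* Add 19 or 38 and propagate the carry through the limbs. */
	c[0] += delta;
	carry = c[0] < delta;
	for (i = 1; i < 4; i++) {
		c[i] += carry;
		carry = c[i] < carry;
	}

	/* Bit 255 set now means the value was >= 2^255 - 19: keep the
	 * added 19 and drop the bit. Otherwise take the 19 back out. */
	top = c[3] >> 63;
	c[3] &= low63;
	delta = 19 & (top - 1);		/* 0 if bit 255 is set, else 19 */

	borrow = c[0] < delta;
	c[0] -= delta;
	for (i = 1; i < 4; i++) {
		next = c[i] < borrow;
		c[i] -= borrow;
		borrow = next;
	}
}
```

Under these assumptions, feeding in the modulus 2^255 - 19 itself yields zero, and feeding in 2^255 yields 19.
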
468c57c7 Jason A. Donenfeld 2018-08-24  1625  static __always_inline void cswap(u8 bit, u64 *const px, u64 *const py)
468c57c7 Jason A. Donenfeld 2018-08-24  1626  {
468c57c7 Jason A. Donenfeld 2018-08-24  1627  	u64 temp;
468c57c7 Jason A. Donenfeld 2018-08-24  1628  	asm volatile(
468c57c7 Jason A. Donenfeld 2018-08-24  1629  		"test %9, %9 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1630  		"movq %0, %8 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1631  		"cmovnzq %4, %0 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1632  		"cmovnzq %8, %4 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1633  		"movq %1, %8 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1634  		"cmovnzq %5, %1 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1635  		"cmovnzq %8, %5 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1636  		"movq %2, %8 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1637  		"cmovnzq %6, %2 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1638  		"cmovnzq %8, %6 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1639  		"movq %3, %8 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1640  		"cmovnzq %7, %3 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1641  		"cmovnzq %8, %7 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1642  		: "+r"(px[0]), "+r"(px[1]), "+r"(px[2]), "+r"(px[3]),
468c57c7 Jason A. Donenfeld 2018-08-24  1643  		  "+r"(py[0]), "+r"(py[1]), "+r"(py[2]), "+r"(py[3]),
468c57c7 Jason A. Donenfeld 2018-08-24  1644  		  "=r"(temp)
468c57c7 Jason A. Donenfeld 2018-08-24  1645  		: "r"(bit)
468c57c7 Jason A. Donenfeld 2018-08-24  1646  		: "cc"
468c57c7 Jason A. Donenfeld 2018-08-24  1647  	);
468c57c7 Jason A. Donenfeld 2018-08-24  1648  }
468c57c7 Jason A. Donenfeld 2018-08-24  1649  
468c57c7 Jason A. Donenfeld 2018-08-24  1650  static __always_inline void cselect(u8 bit, u64 *const px, const u64 *const py)
468c57c7 Jason A. Donenfeld 2018-08-24  1651  {
468c57c7 Jason A. Donenfeld 2018-08-24  1652  	asm volatile(
468c57c7 Jason A. Donenfeld 2018-08-24  1653  		"test %4, %4 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1654  		"cmovnzq %5, %0 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1655  		"cmovnzq %6, %1 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1656  		"cmovnzq %7, %2 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1657  		"cmovnzq %8, %3 ;"
468c57c7 Jason A. Donenfeld 2018-08-24  1658  		: "+r"(px[0]), "+r"(px[1]), "+r"(px[2]), "+r"(px[3])
468c57c7 Jason A. Donenfeld 2018-08-24  1659  		: "r"(bit), "rm"(py[0]), "rm"(py[1]), "rm"(py[2]), "rm"(py[3])
468c57c7 Jason A. Donenfeld 2018-08-24  1660  		: "cc"
468c57c7 Jason A. Donenfeld 2018-08-24  1661  	);
468c57c7 Jason A. Donenfeld 2018-08-24  1662  }
468c57c7 Jason A. Donenfeld 2018-08-24  1663  
468c57c7 Jason A. Donenfeld 2018-08-24  1664  static __always_inline void clamp_secret(u8 secret[CURVE25519_POINT_SIZE])
468c57c7 Jason A. Donenfeld 2018-08-24  1665  {
468c57c7 Jason A. Donenfeld 2018-08-24  1666  	secret[0] &= 248;
468c57c7 Jason A. Donenfeld 2018-08-24  1667  	secret[31] &= 127;
468c57c7 Jason A. Donenfeld 2018-08-24  1668  	secret[31] |= 64;
468c57c7 Jason A. Donenfeld 2018-08-24  1669  }
468c57c7 Jason A. Donenfeld 2018-08-24  1670  
468c57c7 Jason A. Donenfeld 2018-08-24  1671  static void curve25519_adx(u8 shared[CURVE25519_POINT_SIZE], const u8 private_key[CURVE25519_POINT_SIZE], const u8 session_key[CURVE25519_POINT_SIZE])
468c57c7 Jason A. Donenfeld 2018-08-24  1672  {
468c57c7 Jason A. Donenfeld 2018-08-24  1673  	struct {
468c57c7 Jason A. Donenfeld 2018-08-24  1674  		u64 buffer[4 * NUM_WORDS_ELTFP25519];
468c57c7 Jason A. Donenfeld 2018-08-24  1675  		u64 coordinates[4 * NUM_WORDS_ELTFP25519];
468c57c7 Jason A. Donenfeld 2018-08-24  1676  		u64 workspace[6 * NUM_WORDS_ELTFP25519];
468c57c7 Jason A. Donenfeld 2018-08-24  1677  		u8 session[CURVE25519_POINT_SIZE];
468c57c7 Jason A. Donenfeld 2018-08-24  1678  		u8 private[CURVE25519_POINT_SIZE];
468c57c7 Jason A. Donenfeld 2018-08-24  1679  	} __aligned(32) m;
468c57c7 Jason A. Donenfeld 2018-08-24  1680  
468c57c7 Jason A. Donenfeld 2018-08-24  1681  	int i = 0, j = 0;
468c57c7 Jason A. Donenfeld 2018-08-24  1682  	u64 prev = 0;
468c57c7 Jason A. Donenfeld 2018-08-24  1683  	u64 *const X1 = (u64 *)m.session;
468c57c7 Jason A. Donenfeld 2018-08-24  1684  	u64 *const key = (u64 *)m.private;
468c57c7 Jason A. Donenfeld 2018-08-24  1685  	u64 *const Px = m.coordinates + 0;
468c57c7 Jason A. Donenfeld 2018-08-24  1686  	u64 *const Pz = m.coordinates + 4;
468c57c7 Jason A. Donenfeld 2018-08-24  1687  	u64 *const Qx = m.coordinates + 8;
468c57c7 Jason A. Donenfeld 2018-08-24  1688  	u64 *const Qz = m.coordinates + 12;
468c57c7 Jason A. Donenfeld 2018-08-24  1689  	u64 *const X2 = Qx;
468c57c7 Jason A. Donenfeld 2018-08-24  1690  	u64 *const Z2 = Qz;
468c57c7 Jason A. Donenfeld 2018-08-24  1691  	u64 *const X3 = Px;
468c57c7 Jason A. Donenfeld 2018-08-24  1692  	u64 *const Z3 = Pz;
468c57c7 Jason A. Donenfeld 2018-08-24  1693  	u64 *const X2Z2 = Qx;
468c57c7 Jason A. Donenfeld 2018-08-24  1694  	u64 *const X3Z3 = Px;
468c57c7 Jason A. Donenfeld 2018-08-24  1695  
468c57c7 Jason A. Donenfeld 2018-08-24  1696  	u64 *const A = m.workspace + 0;
468c57c7 Jason A. Donenfeld 2018-08-24  1697  	u64 *const B = m.workspace + 4;
468c57c7 Jason A. Donenfeld 2018-08-24  1698  	u64 *const D = m.workspace + 8;
468c57c7 Jason A. Donenfeld 2018-08-24  1699  	u64 *const C = m.workspace + 12;
468c57c7 Jason A. Donenfeld 2018-08-24  1700  	u64 *const DA = m.workspace + 16;
468c57c7 Jason A. Donenfeld 2018-08-24  1701  	u64 *const CB = m.workspace + 20;
468c57c7 Jason A. Donenfeld 2018-08-24  1702  	u64 *const AB = A;
468c57c7 Jason A. Donenfeld 2018-08-24  1703  	u64 *const DC = D;
468c57c7 Jason A. Donenfeld 2018-08-24  1704  	u64 *const DACB = DA;
468c57c7 Jason A. Donenfeld 2018-08-24  1705  
468c57c7 Jason A. Donenfeld 2018-08-24 @1706  	memcpy(m.private, private_key, sizeof(m.private));
468c57c7 Jason A. Donenfeld 2018-08-24  1707  	memcpy(m.session, session_key, sizeof(m.session));
468c57c7 Jason A. Donenfeld 2018-08-24  1708  
468c57c7 Jason A. Donenfeld 2018-08-24  1709  	clamp_secret(m.private);
468c57c7 Jason A. Donenfeld 2018-08-24  1710  
468c57c7 Jason A. Donenfeld 2018-08-24  1711  	/* As in the draft:
468c57c7 Jason A. Donenfeld 2018-08-24  1712  	 * When receiving such an array, implementations of curve25519
468c57c7 Jason A. Donenfeld 2018-08-24  1713  	 * MUST mask the most-significant bit in the final byte. This
468c57c7 Jason A. Donenfeld 2018-08-24  1714  	 * is done to preserve compatibility with point formats which
468c57c7 Jason A. Donenfeld 2018-08-24  1715  	 * reserve the sign bit for use in other protocols and to
468c57c7 Jason A. Donenfeld 2018-08-24  1716  	 * increase resistance to implementation fingerprinting
468c57c7 Jason A. Donenfeld 2018-08-24  1717  	 */
468c57c7 Jason A. Donenfeld 2018-08-24  1718  	m.session[CURVE25519_POINT_SIZE - 1] &= (1 << (255 % 8)) - 1;
468c57c7 Jason A. Donenfeld 2018-08-24  1719  
468c57c7 Jason A. Donenfeld 2018-08-24  1720  	copy_eltfp25519_1w(Px, X1);
468c57c7 Jason A. Donenfeld 2018-08-24  1721  	setzero_eltfp25519_1w(Pz);
468c57c7 Jason A. Donenfeld 2018-08-24  1722  	setzero_eltfp25519_1w(Qx);
468c57c7 Jason A. Donenfeld 2018-08-24  1723  	setzero_eltfp25519_1w(Qz);
468c57c7 Jason A. Donenfeld 2018-08-24  1724  
468c57c7 Jason A. Donenfeld 2018-08-24  1725  	Pz[0] = 1;
468c57c7 Jason A. Donenfeld 2018-08-24  1726  	Qx[0] = 1;
468c57c7 Jason A. Donenfeld 2018-08-24  1727  
468c57c7 Jason A. Donenfeld 2018-08-24  1728  	/* main-loop */
468c57c7 Jason A. Donenfeld 2018-08-24  1729  	prev = 0;
468c57c7 Jason A. Donenfeld 2018-08-24  1730  	j = 62;
468c57c7 Jason A. Donenfeld 2018-08-24  1731  	for (i = 3; i >= 0; --i) {
468c57c7 Jason A. Donenfeld 2018-08-24  1732  		while (j >= 0) {
468c57c7 Jason A. Donenfeld 2018-08-24  1733  			u64 bit = (key[i] >> j) & 0x1;
468c57c7 Jason A. Donenfeld 2018-08-24  1734  			u64 swap = bit ^ prev;
468c57c7 Jason A. Donenfeld 2018-08-24  1735  			prev = bit;
468c57c7 Jason A. Donenfeld 2018-08-24  1736  
468c57c7 Jason A. Donenfeld 2018-08-24  1737  			add_eltfp25519_1w_adx(A, X2, Z2);	/* A = (X2+Z2) */
468c57c7 Jason A. Donenfeld 2018-08-24  1738  			sub_eltfp25519_1w(B, X2, Z2);		/* B = (X2-Z2) */
468c57c7 Jason A. Donenfeld 2018-08-24  1739  			add_eltfp25519_1w_adx(C, X3, Z3);	/* C = (X3+Z3) */
468c57c7 Jason A. Donenfeld 2018-08-24  1740  			sub_eltfp25519_1w(D, X3, Z3);		/* D = (X3-Z3) */
468c57c7 Jason A. Donenfeld 2018-08-24  1741  			mul_eltfp25519_2w_adx(DACB, AB, DC);	/* [DA|CB] = [A|B]*[D|C] */
468c57c7 Jason A. Donenfeld 2018-08-24  1742  
468c57c7 Jason A. Donenfeld 2018-08-24  1743  			cselect(swap, A, C);
468c57c7 Jason A. Donenfeld 2018-08-24  1744  			cselect(swap, B, D);
468c57c7 Jason A. Donenfeld 2018-08-24  1745  
468c57c7 Jason A. Donenfeld 2018-08-24  1746  			sqr_eltfp25519_2w_adx(AB);		/* [AA|BB] = [A^2|B^2] */
468c57c7 Jason A. Donenfeld 2018-08-24  1747  			add_eltfp25519_1w_adx(X3, DA, CB);	/* X3 = (DA+CB) */
468c57c7 Jason A. Donenfeld 2018-08-24  1748  			sub_eltfp25519_1w(Z3, DA, CB);		/* Z3 = (DA-CB) */
468c57c7 Jason A. Donenfeld 2018-08-24  1749  			sqr_eltfp25519_2w_adx(X3Z3);		/* [X3|Z3] = [(DA+CB)|(DA-CB)]^2 */
468c57c7 Jason A. Donenfeld 2018-08-24  1750  
468c57c7 Jason A. Donenfeld 2018-08-24  1751  			copy_eltfp25519_1w(X2, B);		/* X2 = B^2 */
468c57c7 Jason A. Donenfeld 2018-08-24  1752  			sub_eltfp25519_1w(Z2, A, B);		/* Z2 = E = AA-BB */
468c57c7 Jason A. Donenfeld 2018-08-24  1753  
468c57c7 Jason A. Donenfeld 2018-08-24  1754  			mul_a24_eltfp25519_1w(B, Z2);		/* B = a24*E */
468c57c7 Jason A. Donenfeld 2018-08-24  1755  			add_eltfp25519_1w_adx(B, B, X2);	/* B = a24*E+B */
468c57c7 Jason A. Donenfeld 2018-08-24  1756  			mul_eltfp25519_2w_adx(X2Z2, X2Z2, AB);	/* [X2|Z2] = [B|E]*[A|a24*E+B] */
468c57c7 Jason A. Donenfeld 2018-08-24  1757  			mul_eltfp25519_1w_adx(Z3, Z3, X1);	/* Z3 = Z3*X1 */
468c57c7 Jason A. Donenfeld 2018-08-24  1758  			--j;
468c57c7 Jason A. Donenfeld 2018-08-24  1759  		}
468c57c7 Jason A. Donenfeld 2018-08-24  1760  		j = 63;
468c57c7 Jason A. Donenfeld 2018-08-24  1761  	}
468c57c7 Jason A. Donenfeld 2018-08-24  1762  
468c57c7 Jason A. Donenfeld 2018-08-24  1763  	inv_eltfp25519_1w_adx(A, Qz);
468c57c7 Jason A. Donenfeld 2018-08-24  1764  	mul_eltfp25519_1w_adx((u64 *)shared, Qx, A);
468c57c7 Jason A. Donenfeld 2018-08-24  1765  	fred_eltfp25519_1w((u64 *)shared);
468c57c7 Jason A. Donenfeld 2018-08-24  1766  
468c57c7 Jason A. Donenfeld 2018-08-24  1767  	memzero_explicit(&m, sizeof(m));
468c57c7 Jason A. Donenfeld 2018-08-24  1768  }
468c57c7 Jason A. Donenfeld 2018-08-24  1769  
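
For readers following the listing above: cswap() and cselect() use cmov so that
the choice depends on the secret bit without a data-dependent branch. A minimal
portable sketch of the same idea with an arithmetic mask is shown below. The
names cswap_portable() and cselect_portable() are hypothetical and are not part
of the patch; a compiler is free to turn this C back into branches, which is
presumably why the patch uses inline assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable counterpart of cswap(); not part of the patch. */
static void cswap_portable(uint8_t bit, uint64_t px[4], uint64_t py[4])
{
	/* All-ones when bit is non-zero, all-zeros otherwise. */
	uint64_t mask = (uint64_t)0 - (uint64_t)(bit != 0);
	size_t i;

	for (i = 0; i < 4; ++i) {
		/* t is px[i] ^ py[i] when swapping, 0 when not. */
		uint64_t t = mask & (px[i] ^ py[i]);

		px[i] ^= t;
		py[i] ^= t;
	}
}

/* Hypothetical portable counterpart of cselect(): px = bit ? py : px. */
static void cselect_portable(uint8_t bit, uint64_t px[4], const uint64_t py[4])
{
	uint64_t mask = (uint64_t)0 - (uint64_t)(bit != 0);
	size_t i;

	for (i = 0; i < 4; ++i)
		px[i] ^= mask & (px[i] ^ py[i]);
}
```

As in the assembly version, any non-zero bit selects the swap, and both limbs
are always read and written regardless of the bit's value.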

:::::: The code at line 1543 was first introduced by commit
:::::: 468c57c74ac7091c9c04ab2acccf68fe300cd9bc zinc: Curve25519 x86_64 implementation

:::::: TO: Jason A. Donenfeld <Jason@zx2c4.com>
:::::: CC: 0day robot <lkp@intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 20207 bytes --]

^ permalink raw reply

* any reason for "!!netif_carrier_ok" and "!!netif_dormant" in net-sysfs.c?
From: Robert P. J. Day @ 2018-08-27  8:55 UTC (permalink / raw)
  To: Linux kernel netdev mailing list


  another pedantic oddity -- is there a reason for these two double
negations in net/core/net-sysfs.c?

static ssize_t carrier_show(struct device *dev,
                            struct device_attribute *attr, char *buf)
{
        struct net_device *netdev = to_net_dev(dev);

        if (netif_running(netdev))
                return sprintf(buf, fmt_dec, !!netif_carrier_ok(netdev));

...

static ssize_t dormant_show(struct device *dev,
                            struct device_attribute *attr, char *buf)
{
        struct net_device *netdev = to_net_dev(dev);

        if (netif_running(netdev))
                return sprintf(buf, fmt_dec, !!netif_dormant(netdev));

  i understand the normal rationale for !! in assuring a final boolean
value of precisely either 0 or 1 (here for the sake of printing), but
given that those two routines are declared with a return value of
"bool" in netdevice.h, i don't see any way that they can return
anything *other* than 0 or 1. i realize it can't possibly hurt, but
whenever i see this construct, i normally assume there's a *reason*
for it, but i can't see what it's doing in those two places.
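
A small sketch of the point above, assuming bool is C99 _Bool (as it is in the
kernel's types): conversion to bool already normalizes any non-zero value to 1,
so !! only changes the result when the function returns a plain int. The helper
names here are hypothetical and are not taken from netdevice.h.

```c
#include <stdbool.h>

/* Hypothetical helper returning bool from a flag test. */
static bool flag_is_set(unsigned long state, unsigned int bit_nr)
{
	/* Conversion to bool yields 1 for any non-zero value, 0 otherwise. */
	return state & (1UL << bit_nr);
}

/* Same test, but returning int: the raw masked value leaks out. */
static int flag_is_set_int(unsigned long state, unsigned int bit_nr)
{
	return (int)(state & (1UL << bit_nr));
}
```

With state 0x4 and bit 2, the bool variant evaluates to 1 with or without !!,
while the int variant evaluates to 4 and needs !! to become 1.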

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                  http://crashcourse.ca/dokuwiki

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================

^ permalink raw reply

* Re: [PATCH 3/3] dt-bindings: can: rcar_can: Add r8a774a1 support
From: Geert Uytterhoeven @ 2018-08-27 12:40 UTC (permalink / raw)
  To: Fabrizio Castro
  Cc: Wolfgang Grandegger, Marc Kleine-Budde, Rob Herring, Mark Rutland,
	Sergei Shtylyov, David S. Miller, linux-can, netdev,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Simon Horman, Geert Uytterhoeven, Chris Paterson, Biju Das,
	Linux-Renesas, Linux Kernel Mailing List
In-Reply-To: <1535029653-7418-4-git-send-email-fabrizio.castro@bp.renesas.com>

Hi Fabrizio,

On Thu, Aug 23, 2018 at 3:08 PM Fabrizio Castro
<fabrizio.castro@bp.renesas.com> wrote:
> Document RZ/G2M (r8a774a1) SoC specific bindings and RZ/G2
> generic bindings.
>
> Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> Signed-off-by: Chris Paterson <Chris.Paterson2@renesas.com>
> Reviewed-by: Biju Das <biju.das@bp.renesas.com>

Thanks for your patch!

> --- a/Documentation/devicetree/bindings/net/can/rcar_can.txt
> +++ b/Documentation/devicetree/bindings/net/can/rcar_can.txt
> @@ -4,6 +4,7 @@ Renesas R-Car CAN controller Device Tree Bindings
>  Required properties:
>  - compatible: "renesas,can-r8a7743" if CAN controller is a part of R8A7743 SoC.
>               "renesas,can-r8a7745" if CAN controller is a part of R8A7745 SoC.
> +             "renesas,can-r8a774a1" if CAN controller is a part of R8A774A1 SoC.

Looks good to me.

>               "renesas,can-r8a7778" if CAN controller is a part of R8A7778 SoC.
>               "renesas,can-r8a7779" if CAN controller is a part of R8A7779 SoC.
>               "renesas,can-r8a7790" if CAN controller is a part of R8A7790 SoC.
> @@ -17,6 +18,7 @@ Required properties:
>               "renesas,rcar-gen2-can" for a generic R-Car Gen2 or RZ/G1
>               compatible device.
>               "renesas,rcar-gen3-can" for a generic R-Car Gen3 compatible device.
> +             "renesas,rzg-gen2-can" for a generic RZ/G2 compatible device.

AFAIK, the actual CAN module in RZ/G2M is fully compatible with the CAN
module in R-Car Gen3 SoCs. The lack of clkp2 is merely an integration
difference: RZ/G2 SoCs do not have the CANFD module, and their CPG block
doesn't provide the CANFD clock (so the CAN device node in DT cannot refer
to that clock anyway).

Hence I don't think there's a need to introduce a "renesas,rzg-gen2-can"
compatible value.

>               When compatible with the generic version, nodes must list the
>               SoC-specific version corresponding to the platform first
>               followed by the generic version.
> @@ -24,7 +26,9 @@ Required properties:
>  - reg: physical base address and size of the R-Car CAN register map.
>  - interrupts: interrupt specifier for the sole interrupt.
>  - clocks: phandles and clock specifiers for 3 CAN clock inputs.

You still have "3" here. Perhaps
"Must contain a phandle and clock-specifier pair for each entry in
clock-names."?

> -- clock-names: 3 clock input name strings: "clkp1", "clkp2", "can_clk".
> +- clock-names: 2 clock input name strings for RZ/G2: "clkp1", "can_clk".
> +              3 clock input name strings for every other SoC: "clkp1", "clkp2",
> +              "can_clk".

OK.

> @@ -41,8 +45,9 @@ using the below properties:
>  Optional properties:
>  - renesas,can-clock-select: R-Car CAN Clock Source Select. Valid values are:
>                             <0x0> (default) : Peripheral clock (clkp1)
> -                           <0x1> : Peripheral clock (clkp2)
> -                           <0x3> : Externally input clock
> +                           <0x1> : Peripheral clock (clkp2) (not supported by
> +                                   RZ/G2 devices)
> +                           <0x3> : External input clock

I already expressed my feelings about this property in my reply to the first
patch ;-)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] net: sched: Fix memory exposure from short TCA_U32_SEL
From: Jamal Hadi Salim @ 2018-08-27 11:57 UTC (permalink / raw)
  To: Al Viro, Kees Cook
  Cc: LKML, Cong Wang, Jiri Pirko, David S. Miller, Network Development
In-Reply-To: <20180826225749.GY6515@ZenIV.linux.org.uk>

On 2018-08-26 6:57 p.m., Al Viro wrote:
> On Sun, Aug 26, 2018 at 06:32:37PM +0100, Al Viro wrote:
> 
>> As far as I can tell, the solution is
> [snip long and painful reasoning]
>> pointers, and not in provably opaque fashion.  Theoretically, the three tcf_...
>> inlines above need another look; fortunately, they don't use ->next at all, not to
>> mention not being used anywhere outside of net/sched/*.c
>>
>> 	The 80 lines above prove that we only need to grep net/sched/*.c for
>> tcf_proto_ops method calls.  And only because we don't have (thank $DEITY)
>> anything that could deconstruct types - as soon as some bastard grows means
>> to say "type of the second argument of the function pointed to by p", this
>> kind of analysis, painful as it is, goes out of window.  Even as it is,
>> do you really like the idea of newbies trying to get through the exercises
>> like the one above?
> 
> BTW, would there be any problem if we took the definitions of tcf_proto and
> tcf_proto_ops to e.g. net/sched/tcf_proto.h (along with the three inlines in
> pkt_cls.h), left forwards in sch_generic.h and added includes of "tcf_proto.h"
> where needed in net/sched/*.c?
> 

I can't think of any challenges. Cong/Jiri? Would it require development-time
classifiers/actions/qdiscs to sit in that directory (I suspect you
don't want them in include/net)?
BTW, the idea of improving grep-ability of the code by prefixing the
ops appropriately makes sense, i.e. we should have ops->cls_init,
ops->act_init etc.

cheers,
jamal

> That would make tcf_proto/tcf_proto_ops opaque outside of net/sched, reducing
> the exposure of internals.  Something like a diff below (against net/master,
> builds clean, ought to result in identical binary):
> 

^ permalink raw reply

* Re: confusing comment, explanation of @IFF_RUNNING in if.h
From: Robert P. J. Day @ 2018-08-27  8:04 UTC (permalink / raw)
  To: Oliver Hartkopp
  Cc: Stephen Hemminger, Andrew Lunn, Linux kernel netdev mailing list
In-Reply-To: <5c3004dc-e877-c142-60ac-91f4623e5153@hartkopp.net>

On Mon, 27 Aug 2018, Oliver Hartkopp wrote:

> "released upon production" means usually: Oh, we put that driver in
> a tar-ball on a CD that's shipped with the product and which will
> get no further visibility nor (security) maintenance.
>
> Robert, please tell your manager that creating a driver is no rocket
> science and also brings no "customer differentiation" which needs to
> be covered under NDA.
>
> Posting drivers and bringing them into mainline Linux heavily increases
> the quality due to the review process and all the people that are
> willing to help you to get better. At the end your driver gets
> long-term maintenance and other people can benefit from it - as your
> boss is getting benefit from using Linux right now.
>
> When something is "released upon production" it will not be in a
> quality that it could go into the kernel - and no one will have the
> time/money/ambition to spend effort on it then. You have just
> produced one of the numerous dead out-of-tree drivers. That would be
> just sad.

  i make these arguments on a regular basis with all of my clients
but, as a contractor, i have little influence. but i will continue to
make them.

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                  http://crashcourse.ca/dokuwiki

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================

^ permalink raw reply

* Re: [PATCH v2 17/17] net: WireGuard secure network tunnel
From: kbuild test robot @ 2018-08-27 11:13 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: kbuild-all, linux-kernel, netdev, davem, Jason A. Donenfeld,
	Greg KH
In-Reply-To: <20180824213849.23647-18-Jason@zx2c4.com>

[-- Attachment #1: Type: text/plain, Size: 2613 bytes --]

Hi Jason,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.19-rc1 next-20180827]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jason-A-Donenfeld/WireGuard-Secure-Network-Tunnel/20180827-073051
config: arm-raumfeld_defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=arm 

All errors (new ones prefixed by >>):

   In file included from <command-line>:0:0:
   lib/zinc/chacha20/chacha20-arm-glue.h: In function 'chacha20_arch':
>> lib/zinc/chacha20/chacha20-arm-glue.h:36:3: error: implicit declaration of function 'chacha20_neon'; did you mean 'chacha20_arch'? [-Werror=implicit-function-declaration]
      chacha20_neon(dst, src, len, key, counter);
      ^~~~~~~~~~~~~
      chacha20_arch
   cc1: some warnings being treated as errors

vim +36 lib/zinc/chacha20/chacha20-arm-glue.h

959d9378 Jason A. Donenfeld 2018-08-24  26  
959d9378 Jason A. Donenfeld 2018-08-24  27  static inline bool chacha20_arch(u8 *dst, const u8 *src, const size_t len, const u32 key[8], const u32 counter[4], simd_context_t simd_context)
959d9378 Jason A. Donenfeld 2018-08-24  28  {
959d9378 Jason A. Donenfeld 2018-08-24  29  	if (simd_context != HAVE_FULL_SIMD
959d9378 Jason A. Donenfeld 2018-08-24  30  #if defined(ARM_USE_NEON)
959d9378 Jason A. Donenfeld 2018-08-24  31  		|| !chacha20_use_neon
959d9378 Jason A. Donenfeld 2018-08-24  32  #endif
959d9378 Jason A. Donenfeld 2018-08-24  33  	)
959d9378 Jason A. Donenfeld 2018-08-24  34  		chacha20_arm(dst, src, len, key, counter);
959d9378 Jason A. Donenfeld 2018-08-24  35  	else
959d9378 Jason A. Donenfeld 2018-08-24 @36  		chacha20_neon(dst, src, len, key, counter);
959d9378 Jason A. Donenfeld 2018-08-24  37  	return true;
959d9378 Jason A. Donenfeld 2018-08-24  38  }
959d9378 Jason A. Donenfeld 2018-08-24  39  
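
One plausible reading of the error above is that the else branch calls
chacha20_neon() unconditionally while its declaration is only visible when
ARM_USE_NEON is defined. The sketch below shows the general pattern of keeping
both the check and the call under the same preprocessor guard. All names
(HYPOTHETICAL_USE_NEON, cipher_neon, cipher_scalar, cipher_arch) are stand-ins
invented for illustration; this is not the zinc code and not necessarily how
the patch was fixed.

```c
#include <stdbool.h>

enum cipher_path { PATH_SCALAR, PATH_NEON };

/* Always available fallback. */
static enum cipher_path cipher_scalar(void)
{
	return PATH_SCALAR;
}

#if defined(HYPOTHETICAL_USE_NEON)
/* Only declared when the NEON build option is set. */
static bool use_neon = true;

static enum cipher_path cipher_neon(void)
{
	return PATH_NEON;
}
#endif

static enum cipher_path cipher_arch(bool have_simd)
{
#if defined(HYPOTHETICAL_USE_NEON)
	/* The call sits under the same guard as the declaration. */
	if (have_simd && use_neon)
		return cipher_neon();
#else
	(void)have_simd;
#endif
	return cipher_scalar();
}
```

Built without HYPOTHETICAL_USE_NEON, cipher_arch() never references the NEON
routine, so the compiler has no undeclared function to complain about.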

:::::: The code at line 36 was first introduced by commit
:::::: 959d93782f7ebf927cceeeb5ba86331211abfcd8 zinc: ChaCha20 ARM and ARM64 implementations

:::::: TO: Jason A. Donenfeld <Jason@zx2c4.com>
:::::: CC: 0day robot <lkp@intel.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 21645 bytes --]

^ permalink raw reply

* Re: confusing comment, explanation of @IFF_RUNNING in if.h
From: Oliver Hartkopp @ 2018-08-27  7:11 UTC (permalink / raw)
  To: Robert P. J. Day, Stephen Hemminger
  Cc: Andrew Lunn, Linux kernel netdev mailing list
In-Reply-To: <alpine.LFD.2.21.1808270220020.11581@localhost.localdomain>



On 08/27/2018 08:20 AM, Robert P. J. Day wrote:
> On Sun, 26 Aug 2018, Stephen Hemminger wrote:
> 
>> On Sun, 26 Aug 2018 15:20:24 -0400 (EDT)
>> "Robert P. J. Day" <rpjday@crashcourse.ca> wrote:
>>
>>> On Sun, 26 Aug 2018, Andrew Lunn wrote:
>>>
>>>>>    i ask since, in my testing, when the interface should have been
>>>>> up, the attribute file "operstate" for that interface showed
>>>>> "unknown", and i wondered how worried i should be about that.
>>>>
>>>> Hi Robert
>>>>
>>>> You should probably post the driver for review. A well written
>>>> driver should not even need to care about any of this. phylib and
>>>> the netdev driver code does all the work. It only gets interesting
>>>> when you don't have a PHY, e.g. a stacked device, like bonding, or a
>>>> virtual device like tun/tap.
>>>
>>>    i wish, but i'm on contract, and proprietary, and NDA and all that.
>>> so i am reduced to crawling through the code, trying to figure out
>>> what is misconfigured that is causing all this grief.
>>>
>>> rday
>>>
>>
>> So you expect FOSS developers to help you with proprietary licensed
>> driver. Good Luck with that.
> 
>    sorry, i'm sure this will all be released upon production, just not
> while it's in the midst of development.

"released upon production" means usually:
Oh, we put that driver in a tar-ball on a CD that's shipped with the 
product and which will get no further visibility nor (security) maintenance.

Robert, please tell your manager that creating a driver is no rocket 
science and also brings no "customer differentiation" which needs to be 
covered under NDA.

Posting drivers and bringing them into mainline Linux heavily increases the 
quality due to the review process and all the people that are willing to 
help you to get better. At the end your driver gets long-term 
maintenance and other people can benefit from it - as your boss is 
getting benefit from using Linux right now.

When something is "released upon production" it will not be in a quality 
that it could go into the kernel - and no one will have the 
time/money/ambition to spend effort on it then. You have just produced 
one of the numerous dead out-of-tree drivers. That would be just sad.

Best regards,
Oliver

^ permalink raw reply

* Re: [PATCH] r8169: don't use MSI-X on RTL8106e
From: Jian-Hong Pan @ 2018-08-27 10:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Heiner Kallweit, Thomas Gleixner, David Miller,
	Realtek linux nic maintainers, netdev, Linux Kernel,
	Linux Upstreaming Team, linux-pci, marc.zyngier, hch
In-Reply-To: <20180823133805.GE154536@bhelgaas-glaptop.roam.corp.google.com>

2018-08-23 21:38 GMT+08:00 Bjorn Helgaas <helgaas@kernel.org>:
> On Thu, Aug 23, 2018 at 06:46:28PM +0800, Jian-Hong Pan wrote:
>> > On 22.08.2018 13:44, Thomas Gleixner wrote:
>> >> Can you please do the following:
>>
>> Tested on ASUS X441AUR equipped with RTL8106e.
>> This is the laptop whose ethernet does not come back after resume, if
>> it does not fallback to MSI.
>> ...
>
>> dev@endless:~$ sudo lspci -xnnvvs 02:00.0
>> ...
>> 00: ec 10 36 81 07 04 10 00 07 00 00 02 10 00 00 00
>> 10: 01 e0 00 00 00 00 00 00 04 00 10 ef 00 00 00 00
>> 20: 0c 00 00 e0 00 00 00 00 00 00 00 00 43 10 0f 20
>> 30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
>>
>> After comparing, there is no difference between before suspend and
>> after resume.
>
> It'd be better to compare the hex data directly and ignore the lspci
> decoding, since lspci doesn't decode everything.  You only dumped the
> first 0x40 bytes of config space, and all capabilities, including the
> MSI and MSI-X capabilities, are past that:
>
>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>> Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>> Vector table: BAR=4 offset=00000000
>> PBA: BAR=4 offset=00000800
>
> In addition, some of the MSI-X information for this device is in BAR
> 4.  "lspci -xxx" will dump all config space, and you can use a tool
> like http://cmp.felk.cvut.cz/~pisa/linux/rdwrmem.c or
> https://github.com/billfarrow/pcimem to dump the BAR contents.

Tested again on ASUS X441AUR equipped with RTL8106e, without falling back to MSI.
Used lspci and https://github.com/billfarrow/pcimem

Here is the status before suspend:

dev@endless:~$ sudo lspci -xxxs 02:00.0
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller (rev 07)
00: ec 10 36 81 07 04 10 00 07 00 00 02 10 00 00 00
10: 01 e0 00 00 00 00 00 00 04 00 10 ef 00 00 00 00
20: 0c 00 00 e0 00 00 00 00 00 00 00 00 43 10 0f 20
30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
40: 01 50 c3 ff 08 00 00 00 00 00 00 00 00 00 00 00
50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 b0 02 02 c0 8d 90 05 10 20 10 00 11 7c 47 00
80: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 1f 08 0c 00 00 04 00 00 02 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 d0 03 80 04 00 00 00 04 08 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

dev@endless:~$ sudo ~/pcimem/pcimem
/sys/devices/pci0000\:00/0000\:00\:1c.4/0000\:02\:00.0/resource4 0
b*16384
[sudo] password for dev:
/sys/devices/pci0000:00/0000:00:1c.4/0000:02:00.0/resource4 opened.
Target offset is 0x0, page size is 4096
mmap(0, 16384, 0x3, 0x1, 3, 0x0)
PCI Memory mapped to address 0x7f15186d1000.
0x0000: 0x38
0x0001: 0x03
0x0002: 0xE0
0x0003: 0xFE
0x0004: 0x00
...
0x0010: 0x41
0x0011: 0x72
.
.
.
0x003C: 0x01
0x003D: 0x00
...
0x1000: 0x38
0x1001: 0x03
.
.
.

After resume:

dev@endless:~$ sudo lspci -xxxs 02:00.0
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller (rev 07)
00: ec 10 36 81 07 04 10 00 07 00 00 02 10 00 00 00
10: 01 e0 00 00 00 00 00 00 04 00 10 ef 00 00 00 00
20: 0c 00 00 e0 00 00 00 00 00 00 00 00 43 10 0f 20
30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
40: 01 50 c3 ff 08 00 00 00 00 00 00 00 00 00 00 00
50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 b0 02 02 c0 8d 90 05 10 20 10 00 11 7c 47 00
80: 42 01 11 10 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 1f 08 0c 00 00 04 00 00 02 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 11 d0 03 80 04 00 00 00 04 08 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

dev@endless:~$ sudo ~/pcimem/pcimem
/sys/devices/pci0000\:00/0000\:00\:1c.4/0000\:02\:00.0/resource4 0
b*16384
/sys/devices/pci0000:00/0000:00:1c.4/0000:02:00.0/resource4 opened.
Target offset is 0x0, page size is 4096
mmap(0, 16384, 0x3, 0x1, 3, 0x0)
PCI Memory mapped to address 0x7f8d68dd5000.
0x0000: 0xFF
...


The config space is the same, but the values in BAR 4 are weird after resume:
they all become 0xFF.
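
For anyone repeating this comparison on a long dump, a minimal sketch of a
check for the "all 0xFF" condition follows. The helper name dump_is_all_ones()
is hypothetical and is not part of pcimem or lspci. Reads that return all ones
often mean the device is not responding to memory accesses at all, but that
interpretation is an assumption here, not something the dumps above prove.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if every byte of a dumped region reads 0xFF. */
static bool dump_is_all_ones(const uint8_t *buf, size_t len)
{
	size_t i;

	/* An empty dump carries no evidence either way. */
	if (len == 0)
		return false;

	for (i = 0; i < len; ++i)
		if (buf[i] != 0xFF)
			return false;

	return true;
}
```

Fed the first bytes of the pre-suspend dump (0x38, 0x03, 0xE0, 0xFE, 0x00) it
returns false; fed a buffer of 0xFF bytes it returns true.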

Regards,
Jian-Hong Pan

^ permalink raw reply

* Re: followup: what's responsible for setting netdev->operstate to IF_OPER_DOWN?
From: Robert P. J. Day @ 2018-08-27  6:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Linux kernel netdev mailing list
In-Reply-To: <20180826135006.157d1bc2@xeon-e3>

On Sun, 26 Aug 2018, Stephen Hemminger wrote:

> On Sun, 26 Aug 2018 11:14:33 -0400 (EDT)
> "Robert P. J. Day" <rpjday@crashcourse.ca> wrote:
>
> >   apologies for the constant pleas for assistance, but i think i'm
> > zeroing in on the problem that started all this. recap: custom
> > FPGA-based linux box with multiple ports, where the current symptom is
> > that there is no userspace notification when someone simply unplugs
> > one of the ports ("ifconfig" shows that interface still RUNNING).
> >
> >   as i read it, an active ethernet interface should be both UP (the
> > administrative state) and RUNNING (the RFC 2863-defined operational
> > state). if i unplug, i've verified on a standard net port on my laptop
> > that the interface is still UP, but no longer RUNNING, which makes
> > perfect sense. i plug back in, interface starts RUNNING again. so
> > where's the problem?
> >
> >   i can see that whether ifconfig shows an interface RUNNING is
> > defined in net/core/dev.c:
> >
> >   unsigned int dev_get_flags(const struct net_device *dev)
> >   {
> >         unsigned int flags;
> >
> >         flags = (dev->flags & ~(IFF_PROMISC |
> >                                 IFF_ALLMULTI |
> >                                 IFF_RUNNING |
> >                                 IFF_LOWER_UP |
> >                                 IFF_DORMANT)) |
> >                 (dev->gflags & (IFF_PROMISC |
> >                                 IFF_ALLMULTI));
> >
> >         if (netif_running(dev)) {
> >                 if (netif_oper_up(dev))
> >                         flags |= IFF_RUNNING;  <---- THERE
> >                 if (netif_carrier_ok(dev))
> >                         flags |= IFF_LOWER_UP;
> >                 if (netif_dormant(dev))
> >                         flags |= IFF_DORMANT;
> >         }
> >
> >         return flags;
> >   }
> >
> > where netif_oper_up() is defined as:
> >
> >   static inline bool netif_oper_up(const struct net_device *dev)
> >   {
> >         return (dev->operstate == IF_OPER_UP ||
> >                 dev->operstate == IF_OPER_UNKNOWN /* backward compat */);
> >   }
> >
> > so i am simply assuming that the underlying problem is that,
> > somewhere down below, the unplugging of a port is somehow not setting
> > dev->operstate to its proper value of IF_OPER_DOWN.
> >
> >   that would clearly explain everything, and i'm about to dig even
> > further to see where the event of unplugging a port *should* be
> > recognized, but does this sound like a reasonable diagnosis? there
> > have been other problems with the programming of the FPGA, so it would
> > surprise absolutely no one to learn that this aspect was
> > misprogrammed.
> >
> > rday
> >
>
> There is no reason drivers should ever muck with flags directly.
> You probably are looking for netif_detach

  i assume you mean netif_device_detach; i'll check into that.
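
  to illustrate the diagnosis, here is a tiny userspace model of the
logic quoted above. it is not kernel code: struct model_dev,
model_get_flags() and model_carrier_off() are made-up names, and
model_carrier_off() assumes (as i understand the link watch code) that a
netif_carrier_off() call from the driver eventually moves operstate to
IF_OPER_DOWN.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag and operstate values as defined in include/uapi/linux/if.h. */
#define IFF_UP		0x1
#define IFF_RUNNING	0x40
#define IFF_LOWER_UP	0x10000
#define IFF_DORMANT	0x20000

enum {
	IF_OPER_UNKNOWN,
	IF_OPER_NOTPRESENT,
	IF_OPER_DOWN,
	IF_OPER_LOWERLAYERDOWN,
	IF_OPER_TESTING,
	IF_OPER_DORMANT,
	IF_OPER_UP,
};

/* Hypothetical stand-in for the few bits of struct net_device used here. */
struct model_dev {
	bool admin_up;		/* interface opened ("ifconfig ... up") */
	bool carrier_ok;
	bool dormant;
	int operstate;
};

/* Same decision structure as the dev_get_flags() excerpt above. */
static unsigned int model_get_flags(const struct model_dev *dev)
{
	unsigned int flags;

	assert(dev);
	flags = dev->admin_up ? IFF_UP : 0;
	if (dev->admin_up) {
		if (dev->operstate == IF_OPER_UP ||
		    dev->operstate == IF_OPER_UNKNOWN /* backward compat */)
			flags |= IFF_RUNNING;
		if (dev->carrier_ok)
			flags |= IFF_LOWER_UP;
		if (dev->dormant)
			flags |= IFF_DORMANT;
	}
	return flags;
}

/* Assumed effect of the driver reporting link loss (simplified). */
static void model_carrier_off(struct model_dev *dev)
{
	dev->carrier_ok = false;
	dev->operstate = IF_OPER_DOWN;
}

/* Flags seen by userspace after a cable pull, for both kinds of driver. */
static unsigned int flags_after_unplug(bool driver_reports_carrier)
{
	struct model_dev dev = {
		.admin_up = true,
		.carrier_ok = true,
		.dormant = false,
		.operstate = IF_OPER_UNKNOWN,
	};

	if (driver_reports_carrier)
		model_carrier_off(&dev);
	return model_get_flags(&dev);
}
```

  in this model, a driver that never reports carrier loss leaves
operstate at IF_OPER_UNKNOWN, so RUNNING stays set after the unplug,
which is the symptom described above.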

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                  http://crashcourse.ca/dokuwiki

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================

^ permalink raw reply

* Re: confusing comment, explanation of @IFF_RUNNING in if.h
From: Robert P. J. Day @ 2018-08-27  6:20 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Lunn, Linux kernel netdev mailing list
In-Reply-To: <20180826135144.11fd9a5f@xeon-e3>

On Sun, 26 Aug 2018, Stephen Hemminger wrote:

> On Sun, 26 Aug 2018 15:20:24 -0400 (EDT)
> "Robert P. J. Day" <rpjday@crashcourse.ca> wrote:
>
> > On Sun, 26 Aug 2018, Andrew Lunn wrote:
> >
> > > >   i ask since, in my testing, when the interface should have been
> > > > up, the attribute file "operstate" for that interface showed
> > > > "unknown", and i wondered how worried i should be about that.
> > >
> > > Hi Robert
> > >
> > > You should probably post the driver for review. A well written
> > > driver should not even need to care about any of this. phylib and
> > > the netdev driver code does all the work. It only gets interesting
> > > when you don't have a PHY, e.g. a stacked device, like bonding, or a
> > > virtual device like tun/tap.
> >
> >   i wish, but i'm on contract, and proprietary, and NDA and all that.
> > so i am reduced to crawling through the code, trying to figure out
> > what is misconfigured that is causing all this grief.
> >
> > rday
> >
>
> So you expect FOSS developers to help you with proprietary licensed
> driver. Good Luck with that.

  sorry, i'm sure this will all be released upon production, just not
while it's in the midst of development.

rday

-- 

========================================================================
Robert P. J. Day                                 Ottawa, Ontario, CANADA
                  http://crashcourse.ca/dokuwiki

Twitter:                                       http://twitter.com/rpjday
LinkedIn:                               http://ca.linkedin.com/in/rpjday
========================================================================

^ permalink raw reply

* Re: [PATCH v2 2/2] can: rcar: use SPDX identifier for Renesas drivers
From: Marc Kleine-Budde @ 2018-08-27 10:00 UTC (permalink / raw)
  To: Wolfram Sang, linux-renesas-soc
  Cc: Kuninori Morimoto, Wolfgang Grandegger, David S. Miller,
	linux-can, netdev, linux-kernel
In-Reply-To: <20180823133456.4748-3-wsa+renesas@sang-engineering.com>


[-- Attachment #1.1: Type: text/plain, Size: 610 bytes --]

On 08/23/2018 03:34 PM, Wolfram Sang wrote:
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Applied to linux-can-next. Please add a patch description to the patch.
My $UPSTREAM doesn't like empty patch descriptions :) I've shamelessly
used Fabio Estevam patch description from his flexcan SPDX patch.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: [PATCH v2 01/29] nvmem: add support for cell lookups
From: Boris Brezillon @ 2018-08-27  9:00 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andrew Lunn, linux-doc, Sekhar Nori, Bartosz Golaszewski,
	Srinivas Kandagatla, linux-i2c, Mauro Carvalho Chehab,
	Rob Herring, Florian Fainelli, Kevin Hilman, Richard Weinberger,
	Russell King, Marek Vasut, Paolo Abeni, Dan Carpenter,
	Grygorii Strashko, David Lechner, Arnd Bergmann,
	Sven Van Asbroeck, "ope
In-Reply-To: <CAMRc=Men-MPk5DGshWcVEc0v=gH2WSpx1j-CawOeydwp59tejw@mail.gmail.com>

On Mon, 27 Aug 2018 10:56:29 +0200
Bartosz Golaszewski <brgl@bgdev.pl> wrote:

> 2018-08-25 8:27 GMT+02:00 Boris Brezillon <boris.brezillon@bootlin.com>:
> > On Fri, 24 Aug 2018 17:27:40 +0200
> > Andrew Lunn <andrew@lunn.ch> wrote:
> >  
> >> On Fri, Aug 24, 2018 at 05:08:48PM +0200, Boris Brezillon wrote:  
> >> > Hi Bartosz,
> >> >
> >> > On Fri, 10 Aug 2018 10:04:58 +0200
> >> > Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >> >  
> >> > > +struct nvmem_cell_lookup {
> >> > > + struct nvmem_cell_info  info;
> >> > > + struct list_head        list;
> >> > > + const char              *nvmem_name;
> >> > > +};  
> >> >
> >> > Hm, maybe I don't get it right, but this looks suspicious. Usually the
> >> > consumer lookup table is here to attach device specific names to
> >> > external resources.
> >> >
> >> > So what I'd expect here is:
> >> >
> >> > struct nvmem_cell_lookup {
> >> >     /* The nvmem device name. */
> >> >     const char *nvmem_name;
> >> >
> >> >     /* The nvmem cell name */
> >> >     const char *nvmem_cell_name;
> >> >
> >> >     /*
> >> >      * The local resource name. Basically what you have in the
> >> >      * nvmem-cell-names prop.
> >> >      */
> >> >     const char *conid;
> >> > };
> >> >
> >> > struct nvmem_cell_lookup_table {
> >> >     struct list_head list;
> >> >
> >> >     /* ID of the consumer device. */
> >> >     const char *devid;
> >> >
> >> >     /* Array of cell lookup entries. */
> >> >     unsigned int ncells;
> >> >     const struct nvmem_cell_lookup *cells;
> >> > };
> >> >
> >> > Looks like your nvmem_cell_lookup is more something used to attach cells
> >> > to an nvmem device, which is NVMEM provider's responsibility not the
> >> > consumer one.  
> >>
> >> Hi Boris
> >>
> >> There are cases where there is not a clear provider/consumer split. I
> >> have an x86 platform, with a few at24 EEPROMs on it. It uses an off
> >> the shelf Komtron module, placed on a custom carrier board. One of the
> >> EEPROMs contains the hardware variant information. Once i know the
> >> variant, i need to instantiate other I2C, SPI, MDIO devices, all using
> >> platform devices, since this is x86, no DT available.
> >>
> >> So the first thing my x86 platform device does is instantiate the
> >> first i2c device for the AT24. Once the EEPROM pops into existence, i
> >> need to add nvmem cells onto it. So at that point, the x86 platform
> >> driver is playing the provider role. Once the cells are added, i can
> >> then use nvmem consumer interfaces to get the contents of the cell,
> >> run a checksum, and instantiate the other devices.
> >>
> >> I wish the embedded world was all DT, but the reality is that it is
> >> not :-(  
> >
> > Actually, I'm not questioning the need for this feature (being able to
> > attach NVMEM cells to an NVMEM device on a platform that does not use
> > DT). What I'm saying is that this functionality is provider related,
> > not consumer related. Also, I wonder if defining such NVMEM cells
> > shouldn't go through the provider driver instead of being passed
> > directly to the NVMEM layer, because nvmem_config already has fields
> > to pass cells at registration time, plus, the name of the NVMEM cell
> > device is sometimes created dynamically and can be hard to guess at
> > platform_device registration time.
> >  
> 
> In my use case the provider is at24 EEPROM driver. This is where the
> nvmem_config lives but I can't imagine a correct and clean way of
> passing this cell config to the driver from board files without using
> new ugly fields in platform_data which this very series is trying to
> remove. This is why this cell config should live in machine code.

Okay.

> 
> > I also think non-DT consumers will need a way to reference existing
> > NVMEM cells, but this consumer-oriented nvmem cell lookup table should
> > look like the gpio or pwm lookup table (basically what I proposed in my
> > previous email).  
> 
> How about introducing two new interfaces to nvmem: one for defining
> nvmem cells from machine code and the second for connecting these
> cells with devices?

Yes, that's basically what I was suggesting: move what you've done in
nvmem-provider.h (maybe rename some of the structs to make it clear
that this is about defining cells not referencing existing ones), and
add a new consumer interface (based on what other subsystems do) in
nvmem-consumer.h.

This way you have both things clearly separated, and if a driver is
both a consumer and a provider you'll just have to include both headers.
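
To make the consumer side concrete, here is a rough userspace sketch of
how a lookup through the tables I proposed earlier could work. It is only
an illustration: the list_head is replaced by a plain array,
nvmem_find_cell() is a hypothetical helper, and the device and cell names
in the example table are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct nvmem_cell_lookup {
	const char *nvmem_name;		/* the nvmem device name */
	const char *nvmem_cell_name;	/* the nvmem cell name */
	const char *conid;		/* the local (consumer) resource name */
};

struct nvmem_cell_lookup_table {
	const char *devid;		/* ID of the consumer device */
	unsigned int ncells;
	const struct nvmem_cell_lookup *cells;
};

/* Hypothetical helper: resolve (devid, conid) to a lookup entry. */
static const struct nvmem_cell_lookup *
nvmem_find_cell(const struct nvmem_cell_lookup_table *tables, size_t ntables,
		const char *devid, const char *conid)
{
	size_t t;
	unsigned int c;

	assert(tables && devid && conid);
	for (t = 0; t < ntables; t++) {
		if (strcmp(tables[t].devid, devid) != 0)
			continue;
		for (c = 0; c < tables[t].ncells; c++) {
			if (strcmp(tables[t].cells[c].conid, conid) == 0)
				return &tables[t].cells[c];
		}
	}
	return NULL;
}

/* Invented example data, as machine code might declare it. */
static const struct nvmem_cell_lookup example_cells[] = {
	{ .nvmem_name = "at24-0", .nvmem_cell_name = "hw-variant",
	  .conid = "variant" },
	{ .nvmem_name = "at24-0", .nvmem_cell_name = "mac-address",
	  .conid = "mac" },
};

static const struct nvmem_cell_lookup_table example_tables[] = {
	{ .devid = "board-setup.0", .ncells = 2, .cells = example_cells },
};
```

The consumer only knows its own device ID and the local name ("variant");
the table tells the core which cell of which nvmem device that maps to,
the same way the gpio and pwm lookup tables do.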

Regards,

Boris

^ permalink raw reply

* Re: [PATCH v2 01/29] nvmem: add support for cell lookups
From: Bartosz Golaszewski @ 2018-08-27  8:56 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Andrew Lunn, linux-doc, Sekhar Nori, Bartosz Golaszewski,
	Srinivas Kandagatla, linux-i2c, Mauro Carvalho Chehab,
	Rob Herring, Florian Fainelli, Kevin Hilman, Richard Weinberger,
	Russell King, Marek Vasut, Paolo Abeni, Dan Carpenter,
	Grygorii Strashko, David Lechner, Arnd Bergmann,
	Sven Van Asbroeck, "ope
In-Reply-To: <20180825082722.567e8c9a@bbrezillon>

2018-08-25 8:27 GMT+02:00 Boris Brezillon <boris.brezillon@bootlin.com>:
> On Fri, 24 Aug 2018 17:27:40 +0200
> Andrew Lunn <andrew@lunn.ch> wrote:
>
>> On Fri, Aug 24, 2018 at 05:08:48PM +0200, Boris Brezillon wrote:
>> > Hi Bartosz,
>> >
>> > On Fri, 10 Aug 2018 10:04:58 +0200
>> > Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>> >
>> > > +struct nvmem_cell_lookup {
>> > > + struct nvmem_cell_info  info;
>> > > + struct list_head        list;
>> > > + const char              *nvmem_name;
>> > > +};
>> >
>> > Hm, maybe I don't get it right, but this looks suspicious. Usually the
>> > consumer lookup table is here to attach device specific names to
>> > external resources.
>> >
>> > So what I'd expect here is:
>> >
>> > struct nvmem_cell_lookup {
>> >     /* The nvmem device name. */
>> >     const char *nvmem_name;
>> >
>> >     /* The nvmem cell name */
>> >     const char *nvmem_cell_name;
>> >
>> >     /*
>> >      * The local resource name. Basically what you have in the
>> >      * nvmem-cell-names prop.
>> >      */
>> >     const char *conid;
>> > };
>> >
>> > struct nvmem_cell_lookup_table {
>> >     struct list_head list;
>> >
>> >     /* ID of the consumer device. */
>> >     const char *devid;
>> >
>> >     /* Array of cell lookup entries. */
>> >     unsigned int ncells;
>> >     const struct nvmem_cell_lookup *cells;
>> > };
>> >
>> > Looks like your nvmem_cell_lookup is more something used to attach cells
>> > to an nvmem device, which is NVMEM provider's responsibility not the
>> > consumer one.
>>
>> Hi Boris
>>
>> There are cases where there is not a clear provider/consumer split. I
>> have an x86 platform, with a few at24 EEPROMs on it. It uses an off
>> the shelf Komtron module, placed on a custom carrier board. One of the
>> EEPROMs contains the hardware variant information. Once i know the
>> variant, i need to instantiate other I2C, SPI, MDIO devices, all using
>> platform devices, since this is x86, no DT available.
>>
>> So the first thing my x86 platform device does is instantiate the
>> first i2c device for the AT24. Once the EEPROM pops into existence, i
>> need to add nvmem cells onto it. So at that point, the x86 platform
>> driver is playing the provider role. Once the cells are added, i can
>> then use nvmem consumer interfaces to get the contents of the cell,
>> run a checksum, and instantiate the other devices.
>>
>> I wish the embedded world was all DT, but the reality is that it is
>> not :-(
>
> Actually, I'm not questioning the need for this feature (being able to
> attach NVMEM cells to an NVMEM device on a platform that does not use
> DT). What I'm saying is that this functionality is provider related,
> not consumer related. Also, I wonder if defining such NVMEM cells
> shouldn't go through the provider driver instead of being passed
> directly to the NVMEM layer, because nvmem_config already has fields
> to pass cells at registration time, plus, the name of the NVMEM cell
> device is sometimes created dynamically and can be hard to guess at
> platform_device registration time.
>

In my use case the provider is at24 EEPROM driver. This is where the
nvmem_config lives but I can't imagine a correct and clean way of
passing this cell config to the driver from board files without using
new ugly fields in platform_data which this very series is trying to
remove. This is why this cell config should live in machine code.

> I also think non-DT consumers will need a way to reference existing
> NVMEM cells, but this consumer-oriented nvmem cell lookup table should
> look like the gpio or pwm lookup table (basically what I proposed in my
> previous email).

How about introducing two new interfaces to nvmem: one for defining
nvmem cells from machine code and the second for connecting these
cells with devices?

Best regards,
Bart

^ permalink raw reply

* Re: oops with ip6_rt_cache_alloc
From: Yonghong Song @ 2018-08-27  4:57 UTC (permalink / raw)
  To: David Ahern, netdev, Alexei Starovoitov, Martin Lau, Dave Jones
In-Reply-To: <2314c9c2-27ab-c470-5e8a-4e28e53810b2@gmail.com>



On 8/24/18 4:04 PM, David Ahern wrote:
> On 8/24/18 4:26 PM, Yonghong Song wrote:
>> Hi,
>>
>> We got a kernel oops with the following stack trace:
>>
>> CPU: 24 PID: 0 Comm: swapper/24 Not tainted
>> 4.16.0-10_fbk1_1183_g7e4ee4c8171c #10
>> "Hardware name: Quanta Leopard-DDR3/Leopard-DDR3, BIOS F06_3A16.DDR3
>> 11/19/2015"
>> RIP: 0010:ip6_rt_get_dev_rcu+0x6/0x60
>> RSP: 0018:ffff88046fb03c78 EFLAGS: 00010286
>> RAX: 0000000040000003 RBX: ffff88035a6c1500 RCX: ffffffff81ec5dc0
>> RDX: ffff88033192a090 RSI: ffff88033192a0a0 RDI: 0000000000000000
> 
> RDI = 0 means the rt passed to ip6_rt_get_dev_rcu is NULL. I believe
> that can't happen prior to the fib6_info changes. After the fib6_info
> changes, it means the 'from' is NULL and that is not expected.
> 
> ...
> 
>> Our internal experiments showed that an early version of 4.16 works fine
>> and after backporting some ipv6 route related changes and the above
>> problem showed up.
> 
> Can you run the test on 4.18?

We will give it a try with 4.18. Thanks.

^ permalink raw reply

* Re: [RFC v3 net-next 3/5] ebpf: fix bpf_msg_pull_data
From: Tushar Dave @ 2018-08-27  4:45 UTC (permalink / raw)
  To: John Fastabend, ast, daniel, davem, sowmini.varadhan,
	santosh.shilimkar, jakub.kicinski, quentin.monnet, jiong.wang,
	sandipan, kafai, rdna, yhs, netdev
In-Reply-To: <e3e03edf-9771-7660-b27c-fc28be55c644@gmail.com>



On 08/24/2018 06:02 PM, John Fastabend wrote:
> On 08/17/2018 04:08 PM, Tushar Dave wrote:
>> Like sockmap (sk_msg), socksg also deals with struct scatterlist
>> therefore socksg programs can use existing bpf helper bpf_msg_pull_data
>> to access packet data contained in struct scatterlist. While doing some
>> preliminary testing, a couple of issues were found with
>> bpf_msg_pull_data that are fixed in this patch.
>>
>> Also, there cannot be more than MAX_SKB_FRAGS entries in sg_data
>> therefore any checks for sg entry more than MAX_SKB_FRAGS in
>> bpf_msg_pull_data() are removed.
> 
> In sockmap the scatterlist is used as a ring so the MAX_SKB_FRAGS
> check is needed to keep searching through the ring when sg_start
> is non-zero.

Okay.
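
To make the ring point concrete for myself, here is a small userspace
sketch of the search loop. It is not the kernel code: RING_SIZE stands in
for MAX_SKB_FRAGS (the value 17 is an assumption for illustration), and
ring_find, struct ring_pos and example_len are hypothetical names. It
keeps the wrap at the end of the array and uses the corrected order from
the hunk below, where offset is only advanced after the range check.

```c
#include <assert.h>

#define RING_SIZE 17	/* stand-in for MAX_SKB_FRAGS, value assumed */

/* Hypothetical result: entry index (-1 if not found) and bytes before it. */
struct ring_pos {
	int index;
	unsigned int offset;
};

/*
 * Find the ring entry that contains logical byte 'start'.
 * Used entries run from sg_start up to (not including) sg_end and may
 * wrap around the end of the array.
 */
static struct ring_pos ring_find(const unsigned int *len, int sg_start,
				 int sg_end, unsigned int start)
{
	struct ring_pos pos = { -1, 0 };
	int i = sg_start;

	assert(sg_start >= 0 && sg_start < RING_SIZE);
	assert(sg_end >= 0 && sg_end < RING_SIZE);

	do {
		if (start < pos.offset + len[i]) {
			pos.index = i;
			return pos;
		}
		pos.offset += len[i];	/* advance only after the check */
		i++;
		if (i == RING_SIZE)	/* wrap: the array is used as a ring */
			i = 0;
	} while (i != sg_end);

	return pos;	/* index stays -1: start is out of range */
}

/* Example ring occupying entries 15, 16, 0 and 1 (sg_start=15, sg_end=2). */
static const unsigned int example_len[RING_SIZE] = {
	[15] = 100, [16] = 200, [0] = 300, [1] = 400,
};
```

With sg_start = 15 and sg_end = 2, a loop condition of "i <= sg_end"
would stop after the first entry, while the wrapping loop still reaches
entries 0 and 1.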

> 
>>
>> Besides that, I also ran into issues while put_page() is invoked.
>> e.g.
>> [ 450.568723] BUG: Bad page state in process swapper/10 pfn:2021540
>> [ 450.575632] page:ffffea0080855000 count:0 mapcount:0
>> mapping:ffff88103d006840 index:0xffff882021540000 compound_mapcount: 0
>> [ 450.588069] flags: 0x6fffff80008100(slab|head)
>> [ 450.593033] raw: 006fffff80008100 dead000000000100 dead000000000200
>> ffff88103d006840
>> [ 450.601683] raw: ffff882021540000 0000000080080007 00000000ffffffff
>> 0000000000000000
>> [ 450.610337] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
>> [ 450.617530] bad because of flags: 0x100(slab)
>>
>> To avoid above issue, currently put_page() is disabled in this patch
>> temporarily. I am working on alternatives so that page allocated via
>> slab (in this case) can be freed without any issue.
>>
>> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
>> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
>> ---
>>   net/core/filter.c | 61 +++++++++++++++++++++++++++++--------------------------
>>   1 file changed, 32 insertions(+), 29 deletions(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index e427c8e..cc52baa 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -2316,7 +2316,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   BPF_CALL_4(bpf_msg_pull_data,
>>   	   struct sk_msg_buff *, msg, u32, start, u32, end, u64, flags)
>>   {
>> -	unsigned int len = 0, offset = 0, copy = 0;
>> +	unsigned int len = 0, offset = 0, copy = 0, off = 0;
>>   	struct scatterlist *sg = msg->sg_data;
>>   	int first_sg, last_sg, i, shift;
>>   	unsigned char *p, *to, *from;
>> @@ -2330,22 +2330,28 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   	i = msg->sg_start;
>>   	do {
>>   		len = sg[i].length;
>> -		offset += len;
>>   		if (start < offset + len)
>>   			break;
>> +		offset += len;
> 
> This looks like a generic fix unrelated to this series.
> Can you send that as a bugfix?

Okay.

> 
>>   		i++;
>> -		if (i == MAX_SKB_FRAGS)
>> -			i = 0;
>> -	} while (i != msg->sg_end);
>> +	} while (i <= msg->sg_end);
>>   
> 
> As noted above the MAX_SKB_FRAGS check is needed because
> sg_start can be non-zero and sg_end < sg_start. In these
> cases we need to search the entries at the start of the
> array (being used as a ring).

Yup!

> 
>> +	/* return error if start is out of range */
>>   	if (unlikely(start >= offset + len))
>>   		return -EINVAL;
>>   
>> -	if (!msg->sg_copy[i] && bytes <= len)
>> -		goto out;
>> +	/* return error if i is last entry in sglist and end is out of range */
>> +	if (msg->sg_copy[i] && end > offset + len)
>> +		return -EINVAL;
>>   
>>   	first_sg = i;
>>   
>> +	/* if i is not last entry in sg list and end (i.e start + bytes) is
>> +	 * within this sg[i] then goto out and calculate data and data_end
>> +	 */
>> +	if (!msg->sg_copy[i] && end <= offset + len)
>> +		goto out;
>> +
>>   	/* At this point we need to linearize multiple scatterlist
>>   	 * elements or a single shared page. Either way we need to
>>   	 * copy into a linear buffer exclusively owned by BPF. Then
>> @@ -2359,11 +2365,14 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   	do {
>>   		copy += sg[i].length;
>>   		i++;
>> -		if (i == MAX_SKB_FRAGS)
>> -			i = 0;
> 
> same as above, need to keep.

Yup!

> 
>> -		if (bytes < copy)
>> +		if (end < copy)
>>   			break;
>> -	} while (i != msg->sg_end);
>> +	} while (i <= msg->sg_end);
>> +
>> +	/* return error if i is last entry in sglist and end is out of range */
>> +	if (i > msg->sg_end && end > offset + copy)
>> +		return -EINVAL;
>> +
>>   	last_sg = i;
>>   
>>   	if (unlikely(copy < end - start))
>> @@ -2373,23 +2382,25 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   	if (unlikely(!page))
>>   		return -ENOMEM;
>>   	p = page_address(page);
>> -	offset = 0;
>>   
>>   	i = first_sg;
>>   	do {
>>   		from = sg_virt(&sg[i]);
>>   		len = sg[i].length;
>> -		to = p + offset;
>> +		to = p + off;
> 
> Not really sure if the change from offset->off is needed. Looks
> like it just makes a bigger diff.

We need both 'offset' and 'off' because they are used for different
calculations.

'offset' is used to calculate 'msg->data',
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset

'off', on the other hand, is the write position in the new page while
we linearize the sg entries.
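
A minimal userspace sketch of the two roles, under simplifying
assumptions: struct seg, linearize() and example_byte() are made-up names,
and plain byte arrays stand in for the scatterlist pages. 'offset' is the
number of bytes in front of the first linearized entry, and 'off' is the
copy cursor inside the new buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	const char *data;
	size_t len;
};

/*
 * Copy segs[first..last) into buf and return a pointer to logical byte
 * 'start' of the whole message.
 *
 * offset: bytes that precede segs[first] in the message (not copied)
 * off:    write position inside buf while copying
 */
static const char *linearize(const struct seg *segs, size_t first,
			     size_t last, size_t offset, size_t start,
			     char *buf, size_t bufsz)
{
	size_t off = 0;
	size_t i;

	assert(start >= offset);
	for (i = first; i < last; i++) {
		assert(off + segs[i].len <= bufsz);
		memcpy(buf + off, segs[i].data, segs[i].len);
		off += segs[i].len;
	}
	assert(start - offset < off);

	/* Same shape as: msg->data = sg_virt(&sg[first_sg]) + start - offset */
	return buf + (start - offset);
}

/* Example: message "abcdefghijkl" in three entries; linearize the last two. */
static char example_byte(size_t start)
{
	static const struct seg segs[] = {
		{ "abcd", 4 }, { "efgh", 4 }, { "ijkl", 4 },
	};
	char buf[8];

	return *linearize(segs, 1, 3, segs[0].len, start, buf, sizeof(buf));
}
```

If one variable were reused for both, the copy loop would overwrite the
value that the final data pointer calculation still needs.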

> 
>>   
>>   		memcpy(to, from, len);
>> -		offset += len;
>> +		off += len;
>>   		sg[i].length = 0;
>> -		put_page(sg_page(&sg[i]));
>> +		/* if original page is allocated via slab then put_page
>> +		 * causes error BUG: Bad page state in process. So temporarily
>> +		 * disabled put_page.
>> +		 * Todo: fix it
>> +		 */
>> +		//put_page(sg_page(&sg[i]));

As I said in the commit message, put_page() causes the error "BUG: Bad
page state in process ..." when used for RDS.
Any clue? Have you seen something like this with sockmap?


>>   
>>   		i++;
>> -		if (i == MAX_SKB_FRAGS)
>> -			i = 0;
>> -	} while (i != last_sg);
>> +	} while (i < last_sg);
>>   
>>   	sg[first_sg].length = copy;
>>   	sg_set_page(&sg[first_sg], page, copy, 0);
>> @@ -2406,12 +2417,8 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   	do {
>>   		int move_from;
>>   
>> -		if (i + shift >= MAX_SKB_FRAGS)
>> -			move_from = i + shift - MAX_SKB_FRAGS;
>> -		else
>> -			move_from = i + shift;
>> -
> 
> Need to keep same as above.
yup!

> 
>> -		if (move_from == msg->sg_end)
>> +		move_from = i + shift;> +		if (move_from > msg->sg_end)
>>   			break;
>>   
>>   		sg[i] = sg[move_from];
>> @@ -2420,14 +2427,10 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   		sg[move_from].offset = 0;
>>   
>>   		i++;
>> -		if (i == MAX_SKB_FRAGS)
>> -			i = 0;
>>   	} while (1);
>>   	msg->sg_end -= shift;
>> -	if (msg->sg_end < 0)
>> -		msg->sg_end += MAX_SKB_FRAGS;
>>   out:
>> -	msg->data = sg_virt(&sg[i]) + start - offset;
>> +	msg->data = sg_virt(&sg[first_sg]) + start - offset;
>>   	msg->data_end = msg->data + bytes;
>>   
>>   	return 0;
>>
> 
> Thanks,
> John
> 

Thanks.
-Tushar

^ permalink raw reply

* Re: [PATCH] net: sched: Fix memory exposure from short TCA_U32_SEL
From: Julia Lawall @ 2018-08-27  4:41 UTC (permalink / raw)
  To: Al Viro
  Cc: Joe Perches, Kees Cook, LKML, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Network Development
In-Reply-To: <20180827040423.GB6515@ZenIV.linux.org.uk>



On Mon, 27 Aug 2018, Al Viro wrote:

> On Sun, Aug 26, 2018 at 11:35:17PM -0400, Julia Lawall wrote:
>
> > * x = \(kmalloc\|kzalloc\|devm_kmalloc\|devm_kzalloc\)(...)
>
> I can name several you've missed right off the top of my head -
> vmalloc, kvmalloc, kmem_cache_alloc, kmem_cache_zalloc, variants
> with _trace slapped on, and that is not to mention the things like
> get_free_page or

OK, maybe for a given type the set of functions would be smaller.

>
> void *my_k3wl_alloc(u64 n) // 'cause all artificial limits suck, that's why
> {
> 	lots and lots of home-grown stats collection
> 	some tracepoints thrown in just for fun
> 	return kmalloc(n);
> }
>
> (and no, I'm not implying that net/sched folks had done anything of that
> sort; I have seen that and worse in drivers, though)
>
> > The * at the beginning of the line means to highlight what you are looking
> > for, which is done by making a diff in which the highlighted line
> > appears to be removed.
>
> Umm...  Does that cover return, BTW?  Or something like
> 	T *barf;
> 	extern void foo(T *p);
> 	foo(kmalloc(sizeof(*barf)));

It only covers the pattern that is shown, ie an assignment.  For this,
another pattern would be needed.  It would be necessary to match first the
call that one is concerned with and then go find the function definition
or prototype to find the type of the associated parameter.  It is possible
to count the offset of the kmalloc call in the argument list and then get
the type at the corresponding offset in the parameter list of the function
declaration or prototype.

>
>
> > The limitation is the ability to figure out the type of x.  If it is a
> > local variable, Coccinelle should have no problem.  If it is a structure
> > field, it may be necessary to provide command line arguments like
> >
> > --all-includes --include-headers-for-types
> >
> > --all-includes means to try to find all include files that are mentioned
> > in the .c file.  The next stronger option is --recursive-includes, which
> > means include what all of the mentioned files include as well,
> > recursively.  This tends to cause a major performance hit, because a lot
> > of code is being parsed.  --include-headers-for-types helps a bit with
> > that, as it only considers the header files when computing type
> > information, and not when applying the rules.
> >
> > With respect to ifdefs around variable declarations and structure field
> > declaration, in these cases Coccinelle considers that it cannot make the
> > ifdef have an if-like control flow, and so it considers the #ifdef, #else
> > and #endif to be comments.  Thus it takes into account only the last type
> > provided for a given variable.
>
> [snip]
>
> What about several variants of structure definition?  Because ifdefs around
> includes do occur in the wild...

Such ifdefs would be ignored completely.  I suspect that only the last
definition of the structure would be taken into account.

julia

^ permalink raw reply

* [PATCH v2 iproute2-next 3/3] q_netem: slotting with non-uniform distribution
From: Yousuk Seung @ 2018-08-27  2:42 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, David Ahern, Michael McLennan, Priyaranjan Jha,
	Yousuk Seung, Neal Cardwell, Dave Taht
In-Reply-To: <20180827024230.246445-1-ysseung@google.com>

Extend slotting with support for non-uniform distributions. This is
similar to netem's non-uniform distribution delay feature.

Syntax:
   slot distribution DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \
      [bytes MAX_BYTES]

The syntax and use of the distribution table is the same as in the
non-uniform distribution delay feature. A file DISTRIBUTION must be
present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by
NETEM_DIST_SCALE. A random value x is selected from the table and it
takes DELAY + ( x * JITTER ) as delay. Correlation between values is not
supported.
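
The delay computation described above can be sketched as follows. This is
a userspace illustration only: slot_delay_ns() is a hypothetical helper,
it assumes the table entry is divided by NETEM_DIST_SCALE before it
scales JITTER, and the exact rounding done in the kernel may differ.

```c
#include <assert.h>
#include <stdint.h>

#define NETEM_DIST_SCALE 8192	/* as in linux/pkt_sched.h */

/*
 * delay_ns:  DELAY in nanoseconds
 * jitter_ns: JITTER in nanoseconds (must be positive)
 * x:         one entry picked from the distribution table, scaled by
 *            NETEM_DIST_SCALE
 */
static int64_t slot_delay_ns(int64_t delay_ns, int64_t jitter_ns, int16_t x)
{
	assert(jitter_ns > 0);
	return delay_ns + ((int64_t)x * jitter_ns) / NETEM_DIST_SCALE;
}
```

For example, with DELAY = 800us and JITTER = 100us, a table entry of
8192 gives a 900us slot delay and an entry of -8192 gives 700us.
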

Examples:
  Normal distribution delay with mean = 800us and stdev = 100us.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us

  Optionally set the max slot size in bytes and/or packets.
  > tc qdisc add dev eth0 root netem slot distribution normal \
    800us 100us bytes 64k packets 42

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
---
 man/man8/tc-netem.8 | 20 ++++++++----
 tc/q_netem.c        | 77 +++++++++++++++++++++++++++++++++++++--------
 2 files changed, 78 insertions(+), 19 deletions(-)

diff --git a/man/man8/tc-netem.8 b/man/man8/tc-netem.8
index 8d485b026751..111109cf042f 100644
--- a/man/man8/tc-netem.8
+++ b/man/man8/tc-netem.8
@@ -53,9 +53,13 @@ NetEm \- Network Emulator
 .IR RATE " [ " PACKETOVERHEAD " [ " CELLSIZE " [ " CELLOVERHEAD " ]]]]"
 
 .IR SLOT " := "
-.BR slot
-.IR MIN_DELAY " [ " MAX_DELAY " ] ["
-.BR packets
+.BR slot " { "
+.IR MIN_DELAY " [ " MAX_DELAY " ] |"
+.br
+.RB "               " distribution " { "uniform " | " normal " | " pareto " | " paretonormal " | "
+.IR FILE " } " DELAY " " JITTER " } "
+.br
+.RB "             [ " packets
 .IR PACKETS " ] [ "
 .BR bytes
 .IR BYTES " ]"
@@ -172,9 +176,13 @@ an artificial packet compression (bursts). Another influence factor are network
 adapter buffers which can also add artificial delay.
 
 .SS slot
-defer delivering accumulated packets to within a slot, with each available slot
-configured with a minimum delay to acquire, and an optional maximum delay.  Slot
-delays can be specified in nanoseconds, microseconds, milliseconds or seconds
+defer delivering accumulated packets to within a slot. Each available slot can be
+configured with a minimum delay to acquire, and an optional maximum delay.
+Alternatively it can be configured with the distribution similar to
+.BR distribution
+for
+.BR delay
+option. Slot delays can be specified in nanoseconds, microseconds, milliseconds or seconds
 (e.g. 800us). Values for the optional parameters
 .I BYTES
 will limit the number of bytes delivered per slot, and/or
diff --git a/tc/q_netem.c b/tc/q_netem.c
index 53a7a1056f5d..5bfdfcd5478c 100644
--- a/tc/q_netem.c
+++ b/tc/q_netem.c
@@ -43,7 +43,9 @@ static void explain(void)
 "                 [ rate RATE [PACKETOVERHEAD] [CELLSIZE] [CELLOVERHEAD]]\n" \
 "                 [ slot MIN_DELAY [MAX_DELAY] [packets MAX_PACKETS]" \
 " [bytes MAX_BYTES]]\n" \
-		);
+"                 [ slot distribution" \
+" {uniform|normal|pareto|paretonormal|custom} DELAY JITTER" \
+" [packets MAX_PACKETS] [bytes MAX_BYTES]]\n");
 }
 
 static void explain1(const char *arg)
@@ -159,6 +161,7 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			   struct nlmsghdr *n, const char *dev)
 {
 	int dist_size = 0;
+	int slot_dist_size = 0;
 	struct rtattr *tail;
 	struct tc_netem_qopt opt = { .limit = 1000 };
 	struct tc_netem_corr cor = {};
@@ -169,6 +172,7 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 	struct tc_netem_rate rate = {};
 	struct tc_netem_slot slot = {};
 	__s16 *dist_data = NULL;
+	__s16 *slot_dist_data = NULL;
 	__u16 loss_type = NETEM_LOSS_UNSPEC;
 	int present[__TCA_NETEM_MAX] = {};
 	__u64 rate64 = 0;
@@ -417,21 +421,55 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 				}
 			}
 		} else if (matches(*argv, "slot") == 0) {
-			NEXT_ARG();
-			present[TCA_NETEM_SLOT] = 1;
-			if (get_time64(&slot.min_delay, *argv)) {
-				explain1("slot min_delay");
-				return -1;
-			}
 			if (NEXT_IS_NUMBER()) {
 				NEXT_ARG();
-				if (get_time64(&slot.max_delay, *argv) ||
-				    slot.max_delay < slot.min_delay) {
-					explain1("slot max_delay");
+				present[TCA_NETEM_SLOT] = 1;
+				if (get_time64(&slot.min_delay, *argv)) {
+					explain1("slot min_delay");
 					return -1;
 				}
+				if (NEXT_IS_NUMBER()) {
+					NEXT_ARG();
+					if (get_time64(&slot.max_delay, *argv) ||
+					    slot.max_delay < slot.min_delay) {
+						explain1("slot max_delay");
+						return -1;
+					}
+				} else {
+					slot.max_delay = slot.min_delay;
+				}
 			} else {
-				slot.max_delay = slot.min_delay;
+				NEXT_ARG();
+				if (strcmp(*argv, "distribution") == 0) {
+					present[TCA_NETEM_SLOT] = 1;
+					NEXT_ARG();
+					slot_dist_data = calloc(sizeof(slot_dist_data[0]), MAX_DIST);
+					if (!slot_dist_data)
+						return -1;
+					slot_dist_size = get_distribution(*argv, slot_dist_data, MAX_DIST);
+					if (slot_dist_size <= 0) {
+						free(slot_dist_data);
+						return -1;
+					}
+					NEXT_ARG();
+					if (get_time64(&slot.dist_delay, *argv)) {
+						explain1("slot delay");
+						return -1;
+					}
+					NEXT_ARG();
+					if (get_time64(&slot.dist_jitter, *argv)) {
+						explain1("slot jitter");
+						return -1;
+					}
+					if (slot.dist_jitter <= 0) {
+						fprintf(stderr, "Non-positive jitter\n");
+						return -1;
+					}
+				} else {
+					fprintf(stderr, "Unknown slot parameter: %s\n",
+						*argv);
+					return -1;
+				}
 			}
 			if (NEXT_ARG_OK() &&
 			    matches(*(argv+1), "packets") == 0) {
@@ -559,6 +597,14 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 			return -1;
 		free(dist_data);
 	}
+
+	if (slot_dist_data) {
+		if (addattr_l(n, MAX_DIST * sizeof(slot_dist_data[0]),
+			      TCA_NETEM_SLOT_DIST,
+			      slot_dist_data, slot_dist_size * sizeof(slot_dist_data[0])) < 0)
+			return -1;
+		free(slot_dist_data);
+	}
 	tail->rta_len = (void *) NLMSG_TAIL(n) - (void *) tail;
 	return 0;
 }
@@ -713,8 +759,13 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 	}
 
 	if (slot) {
-		fprintf(f, " slot %s", sprint_time64(slot->min_delay, b1));
-		fprintf(f, " %s", sprint_time64(slot->max_delay, b1));
+		if (slot->dist_jitter > 0) {
+			fprintf(f, " slot distribution %s", sprint_time64(slot->dist_delay, b1));
+			fprintf(f, " %s", sprint_time64(slot->dist_jitter, b1));
+		} else {
+			fprintf(f, " slot %s", sprint_time64(slot->min_delay, b1));
+			fprintf(f, " %s", sprint_time64(slot->max_delay, b1));
+		}
 		if(slot->max_packets)
 			fprintf(f, " packets %d", slot->max_packets);
 		if(slot->max_bytes)
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog

^ permalink raw reply related

* [PATCH v2 iproute2-next 2/3] q_netem: support delivering packets in delayed time slots
From: Yousuk Seung @ 2018-08-27  2:42 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, David Ahern, Michael McLennan, Priyaranjan Jha,
	Dave Taht, Yousuk Seung, Neal Cardwell
In-Reply-To: <20180827024230.246445-1-ysseung@google.com>

From: Dave Taht <dave.taht@gmail.com>

Slotting is a crude approximation of the behaviors of shared media such
as cable, wifi, and LTE, which gather up a bunch of packets within a
varying delay window and deliver them, relative to that, nearly all at
once.

It works within the existing loss, duplication, jitter and delay
parameters of netem. Some amount of inherent latency must be specified,
regardless.

The new "slot" parameter specifies a minimum and maximum delay between
transmission attempts.

The "bytes" and "packets" parameters can be used to limit the amount of
information transferred per slot.

Examples of use:

tc qdisc add dev eth0 root netem delay 200us \
        slot 800us 10ms bytes 64k packets 42

A more correct example, using stacked netem instances and a packet limit
to emulate a tail drop wifi queue with slots and variable packet
delivery, with a 200Mbit isochronous underlying rate, and 20ms path
delay:

tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
         limit 10000
tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
         slot 800us 10ms bytes 64k packets 42 limit 512
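
As a rough illustration of the slot behavior described above, the
following C sketch shows one way a slot could be opened and charged: a
delay is drawn uniformly between the minimum and maximum, and the packet
and byte budgets are reset. The names slot_cfg, slot_state, slot_open and
slot_charge are hypothetical and are not taken from this patch or from
the kernel. Treating a limit of 0 as "no limit" is also an assumption of
the sketch.

```c
#include <stdint.h>

/* Hypothetical user-space mirror of the slot parameters (times in ns). */
struct slot_cfg {
	int64_t min_delay;
	int64_t max_delay;
	int32_t max_packets;	/* assumption: 0 means no packet limit */
	int32_t max_bytes;	/* assumption: 0 means no byte limit */
};

struct slot_state {
	int64_t slot_next;	/* time at which the next slot opens */
	int32_t packets_left;
	int32_t bytes_left;
};

/*
 * Open a new slot: scale a 32 bit random value into
 * [min_delay, max_delay) and reset the per-slot budgets.
 * Assumes max_delay - min_delay is below 2^32 ns so that the
 * 64 bit product cannot overflow.
 */
static void slot_open(const struct slot_cfg *cfg, struct slot_state *st,
		      int64_t now, uint32_t rnd)
{
	uint64_t range = (uint64_t)(cfg->max_delay - cfg->min_delay);
	int64_t delay = cfg->min_delay + (int64_t)(((uint64_t)rnd * range) >> 32);

	st->slot_next = now + delay;
	st->packets_left = cfg->max_packets ? cfg->max_packets : INT32_MAX;
	st->bytes_left = cfg->max_bytes ? cfg->max_bytes : INT32_MAX;
}

/* Charge one packet against the slot; returns 1 once the slot is used up. */
static int slot_charge(struct slot_state *st, int32_t len)
{
	st->packets_left--;
	st->bytes_left -= len;
	return st->packets_left <= 0 || st->bytes_left <= 0;
}
```

With the parameters of the first example (800us, 10ms, 42 packets, and
64k bytes taken here as 65536), slot_open() places the next slot between
800us and 10ms after "now", and slot_charge() reports the slot as used up
once either budget is exhausted.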

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 man/man8/tc-netem.8 | 32 ++++++++++++++++++++++-
 tc/q_netem.c        | 64 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/man/man8/tc-netem.8 b/man/man8/tc-netem.8
index f2cd86b6ed8a..8d485b026751 100644
--- a/man/man8/tc-netem.8
+++ b/man/man8/tc-netem.8
@@ -8,7 +8,8 @@ NetEm \- Network Emulator
 .I OPTIONS
 
 .IR OPTIONS " := [ " LIMIT " ] [ " DELAY " ] [ " LOSS \
-" ] [ " CORRUPT " ] [ " DUPLICATION " ] [ " REORDERING " ][ " RATE " ]"
+" ] [ " CORRUPT " ] [ " DUPLICATION " ] [ " REORDERING " ] [ " RATE \
+" ] [ " SLOT " ]"
 
 .IR LIMIT " := "
 .B limit
@@ -51,6 +52,14 @@ NetEm \- Network Emulator
 .B rate
 .IR RATE " [ " PACKETOVERHEAD " [ " CELLSIZE " [ " CELLOVERHEAD " ]]]]"
 
+.IR SLOT " := "
+.BR slot
+.IR MIN_DELAY " [ " MAX_DELAY " ] ["
+.BR packets
+.IR PACKETS " ] [ "
+.BR bytes
+.IR BYTES " ]"
+
 
 .SH DESCRIPTION
 NetEm is an enhancement of the Linux traffic control facilities
@@ -162,6 +171,27 @@ granularity avoid a perfect shaping at a specific level. This will show up in
 an artificial packet compression (bursts). Another influence factor are network
 adapter buffers which can also add artificial delay.
 
+.SS slot
+defer delivering accumulated packets to within a slot, with each available slot
+configured with a minimum delay to acquire, and an optional maximum delay.  Slot
+delays can be specified in nanoseconds, microseconds, milliseconds or seconds
+(e.g. 800us). Values for the optional parameters
+.I BYTES
+will limit the number of bytes delivered per slot, and/or
+.I PACKETS
+will limit the number of packets delivered per slot.
+
+These slot options can provide a crude approximation of bursty MACs such as
+DOCSIS, WiFi, and LTE.
+
+Note that slotting is limited by several factors: the kernel clock granularity,
+as with a rate, so attempts to deliver many packets within a slot will be
+smeared by the timer resolution, and the underlying native bandwidth.
+
+It is possible to combine slotting with a rate, which produces complex behaviors
+in which either the rate, or the slot limits on bytes or packets per slot, governs
+the actual delivered rate.
+
 .SH LIMITATIONS
 The main known limitation of Netem are related to timer granularity, since
 Linux is not a real-time operating system.
diff --git a/tc/q_netem.c b/tc/q_netem.c
index 9f9a9b3df255..53a7a1056f5d 100644
--- a/tc/q_netem.c
+++ b/tc/q_netem.c
@@ -40,7 +40,10 @@ static void explain(void)
 "                 [ loss gemodel PERCENT [R [1-H [1-K]]]\n" \
 "                 [ ecn ]\n" \
 "                 [ reorder PRECENT [CORRELATION] [ gap DISTANCE ]]\n" \
-"                 [ rate RATE [PACKETOVERHEAD] [CELLSIZE] [CELLOVERHEAD]]\n");
+"                 [ rate RATE [PACKETOVERHEAD] [CELLSIZE] [CELLOVERHEAD]]\n" \
+"                 [ slot MIN_DELAY [MAX_DELAY] [packets MAX_PACKETS]" \
+" [bytes MAX_BYTES]]\n" \
+		);
 }
 
 static void explain1(const char *arg)
@@ -164,6 +167,7 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 	struct tc_netem_gimodel gimodel;
 	struct tc_netem_gemodel gemodel;
 	struct tc_netem_rate rate = {};
+	struct tc_netem_slot slot = {};
 	__s16 *dist_data = NULL;
 	__u16 loss_type = NETEM_LOSS_UNSPEC;
 	int present[__TCA_NETEM_MAX] = {};
@@ -412,6 +416,45 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 					return -1;
 				}
 			}
+		} else if (matches(*argv, "slot") == 0) {
+			NEXT_ARG();
+			present[TCA_NETEM_SLOT] = 1;
+			if (get_time64(&slot.min_delay, *argv)) {
+				explain1("slot min_delay");
+				return -1;
+			}
+			if (NEXT_IS_NUMBER()) {
+				NEXT_ARG();
+				if (get_time64(&slot.max_delay, *argv) ||
+				    slot.max_delay < slot.min_delay) {
+					explain1("slot max_delay");
+					return -1;
+				}
+			} else {
+				slot.max_delay = slot.min_delay;
+			}
+			if (NEXT_ARG_OK() &&
+			    matches(*(argv+1), "packets") == 0) {
+				NEXT_ARG();
+				if (!NEXT_ARG_OK() ||
+				    get_s32(&slot.max_packets, *(argv+1), 0)) {
+					explain1("slot packets");
+					return -1;
+				}
+				NEXT_ARG();
+			}
+			if (NEXT_ARG_OK() &&
+			    matches(*(argv+1), "bytes") == 0) {
+				unsigned int max_bytes;
+				NEXT_ARG();
+				if (!NEXT_ARG_OK() ||
+				    get_size(&max_bytes, *(argv+1))) {
+					explain1("slot bytes");
+					return -1;
+				}
+				slot.max_bytes = (int) max_bytes;
+				NEXT_ARG();
+			}
 		} else if (strcmp(*argv, "help") == 0) {
 			explain();
 			return -1;
@@ -472,6 +515,10 @@ static int netem_parse_opt(struct qdisc_util *qu, int argc, char **argv,
 	    addattr_l(n, 1024, TCA_NETEM_CORRUPT, &corrupt, sizeof(corrupt)) < 0)
 		return -1;
 
+	if (present[TCA_NETEM_SLOT] &&
+	    addattr_l(n, 1024, TCA_NETEM_SLOT, &slot, sizeof(slot)) < 0)
+		return -1;
+
 	if (loss_type != NETEM_LOSS_UNSPEC) {
 		struct rtattr *start;
 
@@ -526,6 +573,7 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 	int *ecn = NULL;
 	struct tc_netem_qopt qopt;
 	const struct tc_netem_rate *rate = NULL;
+	const struct tc_netem_slot *slot = NULL;
 	int len;
 	__u64 rate64 = 0;
 
@@ -586,6 +634,11 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 				return -1;
 			rate64 = rta_getattr_u64(tb[TCA_NETEM_RATE64]);
 		}
+		if (tb[TCA_NETEM_SLOT]) {
+			if (RTA_PAYLOAD(tb[TCA_NETEM_SLOT]) < sizeof(*slot))
+				return -1;
+			slot = RTA_DATA(tb[TCA_NETEM_SLOT]);
+		}
 	}
 
 	fprintf(f, "limit %d", qopt.limit);
@@ -659,6 +712,15 @@ static int netem_print_opt(struct qdisc_util *qu, FILE *f, struct rtattr *opt)
 			fprintf(f, " celloverhead %d", rate->cell_overhead);
 	}
 
+	if (slot) {
+		fprintf(f, " slot %s", sprint_time64(slot->min_delay, b1));
+		fprintf(f, " %s", sprint_time64(slot->max_delay, b1));
+		if(slot->max_packets)
+			fprintf(f, " packets %d", slot->max_packets);
+		if(slot->max_bytes)
+			fprintf(f, " bytes %d", slot->max_bytes);
+	}
+
 	if (ecn)
 		fprintf(f, " ecn ");
 
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog

^ permalink raw reply related

* [PATCH v2 iproute2-next 1/3] tc: support conversions to or from 64 bit nanosecond-based time
From: Yousuk Seung @ 2018-08-27  2:42 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, David Ahern, Michael McLennan, Priyaranjan Jha,
	Dave Taht, Yousuk Seung, Neal Cardwell
In-Reply-To: <20180827024230.246445-1-ysseung@google.com>

From: Dave Taht <dave.taht@gmail.com>

Using a 32 bit field to represent time in nanoseconds results in a
maximum value of about 4.3 seconds, which is well below many observed
delays in WiFi and LTE, and barely in the ballpark for a trip past the
Earth's moon, Luna.

Using 64 bit time fields in nanoseconds allows us to simulate
network diameters of several hundred light-years. However, only
conversions to and from ns, us, ms, and seconds are provided.
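
The ranges quoted above can be checked with a few lines of C. This is
only a worked calculation, not part of the patch. The helper names are
made up for the illustration, and a year is taken as 365.25 days.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Largest span, in seconds, that fits an unsigned 32 bit ns counter. */
static double max_u32_ns_in_seconds(void)
{
	return (double)UINT32_MAX / NSEC_PER_SEC;
}

/* Largest span, in years, that fits a signed 64 bit ns counter. */
static double max_s64_ns_in_years(void)
{
	const double sec_per_year = 365.25 * 24 * 3600;

	return (double)INT64_MAX / NSEC_PER_SEC / sec_per_year;
}
```

The first helper gives roughly 4.29 seconds and the second roughly 292
years, which is where the figures in this message come from.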

The iproute2 64 bit api uses signed values for time. Being able to
represent positive or negative time allows us to calculate +/- deltas
between, for example, the CLOCK_TAI and CLOCK_REALTIME clocks.

Time related utility functions in tc_util.c are moved to lib/utils.c.
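
For readers who want to try the unit handling outside of iproute2, here
is a minimal stand-alone sketch of the same idea: parse a number, then
scale it to nanoseconds according to its suffix. parse_time_ns() and its
unit table are hypothetical names for this illustration. It follows the
behavior of get_time64() in the patch below as far as suffixes and bare
numbers go, but it is not the patch's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical stand-alone parser; returns 0 on success, -1 on error. */
static int parse_time_ns(int64_t *out, const char *str)
{
	static const struct {
		const char *suffix;
		double scale;	/* nanoseconds per unit */
	} units[] = {
		{ "s", 1e9 }, { "sec", 1e9 }, { "secs", 1e9 },
		{ "ms", 1e6 }, { "msec", 1e6 }, { "msecs", 1e6 },
		{ "us", 1e3 }, { "usec", 1e3 }, { "usecs", 1e3 },
		{ "ns", 1.0 }, { "nsec", 1.0 }, { "nsecs", 1.0 },
	};
	char *end;
	double v = strtod(str, &end);
	size_t i;

	if (end == str)
		return -1;	/* no number at all */
	if (*end == '\0') {
		*out = (int64_t)v;	/* bare number: already nanoseconds */
		return 0;
	}
	for (i = 0; i < sizeof(units) / sizeof(units[0]); i++) {
		if (strcasecmp(end, units[i].suffix) == 0) {
			*out = (int64_t)(v * units[i].scale);
			return 0;
		}
	}
	return -1;	/* unknown suffix */
}
```

Because the result is signed, a negative input such as "-20ms" is
accepted and yields a negative delta, which matches the motivation given
above for using signed 64 bit values.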

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 include/utils.h   |  12 ++++++
 lib/utils.c       | 104 ++++++++++++++++++++++++++++++++++++++++++++++
 tc/tc_cbq.c       |   1 +
 tc/tc_core.c      |   1 +
 tc/tc_core.h      |   2 -
 tc/tc_estimator.c |   1 +
 tc/tc_util.c      |  46 --------------------
 tc/tc_util.h      |   3 --
 8 files changed, 119 insertions(+), 51 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 8cb4349e8a89..eba67b6ecf44 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -46,6 +46,11 @@ void incomplete_command(void) __attribute__((noreturn));
 #define NEXT_ARG_FWD() do { argv++; argc--; } while(0)
 #define PREV_ARG() do { argv--; argc++; } while(0)
 
+#define TIME_UNITS_PER_SEC	1000000
+#define NSEC_PER_USEC 1000
+#define NSEC_PER_MSEC 1000000
+#define NSEC_PER_SEC 1000000000LL
+
 typedef struct
 {
 	__u16 flags;
@@ -310,4 +315,11 @@ size_t strlcat(char *dst, const char *src, size_t size);
 
 void drop_cap(void);
 
+int get_time(unsigned int *time, const char *str);
+int get_time64(__s64 *time, const char *str);
+void print_time(char *buf, int len, __u32 time);
+void print_time64(char *buf, int len, __s64 time);
+char *sprint_time(__u32 time, char *buf);
+char *sprint_time64(__s64 time, char *buf);
+
 #endif /* __UTILS_H__ */
diff --git a/lib/utils.c b/lib/utils.c
index 02ce67721915..34ec4ab12646 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -1633,3 +1633,107 @@ void drop_cap(void)
 	}
 #endif
 }
+
+int get_time(unsigned int *time, const char *str)
+{
+	double t;
+	char *p;
+
+	t = strtod(str, &p);
+	if (p == str)
+		return -1;
+
+	if (*p) {
+		if (strcasecmp(p, "s") == 0 || strcasecmp(p, "sec") == 0 ||
+		    strcasecmp(p, "secs") == 0)
+			t *= TIME_UNITS_PER_SEC;
+		else if (strcasecmp(p, "ms") == 0 || strcasecmp(p, "msec") == 0 ||
+			 strcasecmp(p, "msecs") == 0)
+			t *= TIME_UNITS_PER_SEC/1000;
+		else if (strcasecmp(p, "us") == 0 || strcasecmp(p, "usec") == 0 ||
+			 strcasecmp(p, "usecs") == 0)
+			t *= TIME_UNITS_PER_SEC/1000000;
+		else
+			return -1;
+	}
+
+	*time = t;
+	return 0;
+}
+
+
+void print_time(char *buf, int len, __u32 time)
+{
+	double tmp = time;
+
+	if (tmp >= TIME_UNITS_PER_SEC)
+		snprintf(buf, len, "%.1fs", tmp/TIME_UNITS_PER_SEC);
+	else if (tmp >= TIME_UNITS_PER_SEC/1000)
+		snprintf(buf, len, "%.1fms", tmp/(TIME_UNITS_PER_SEC/1000));
+	else
+		snprintf(buf, len, "%uus", time);
+}
+
+char *sprint_time(__u32 time, char *buf)
+{
+	print_time(buf, SPRINT_BSIZE-1, time);
+	return buf;
+}
+
+/* 64 bit times are represented internally in nanoseconds */
+int get_time64(__s64 *time, const char *str)
+{
+	double nsec;
+	char *p;
+
+	nsec = strtod(str, &p);
+	if (p == str)
+		return -1;
+
+	if (*p) {
+		if (strcasecmp(p, "s") == 0 ||
+		    strcasecmp(p, "sec") == 0 ||
+		    strcasecmp(p, "secs") == 0)
+			nsec *= NSEC_PER_SEC;
+		else if (strcasecmp(p, "ms") == 0 ||
+			 strcasecmp(p, "msec") == 0 ||
+			 strcasecmp(p, "msecs") == 0)
+			nsec *= NSEC_PER_MSEC;
+		else if (strcasecmp(p, "us") == 0 ||
+			 strcasecmp(p, "usec") == 0 ||
+			 strcasecmp(p, "usecs") == 0)
+			nsec *= NSEC_PER_USEC;
+		else if (strcasecmp(p, "ns") == 0 ||
+			 strcasecmp(p, "nsec") == 0 ||
+			 strcasecmp(p, "nsecs") == 0)
+			nsec *= 1;
+		else
+			return -1;
+	}
+
+	*time = nsec;
+	return 0;
+}
+
+void print_time64(char *buf, int len, __s64 time)
+{
+	double nsec = time;
+
+	if (time >= NSEC_PER_SEC)
+		snprintf(buf, len, "%.3fs", nsec/NSEC_PER_SEC);
+	else if (time >= NSEC_PER_MSEC)
+		snprintf(buf, len, "%.3fms", nsec/NSEC_PER_MSEC);
+	else if (time >= NSEC_PER_USEC)
+		snprintf(buf, len, "%.3fus", nsec/NSEC_PER_USEC);
+	else
+		snprintf(buf, len, "%lldns", time);
+}
+
+char *sprint_time64(__s64 time, char *buf)
+{
+	print_time64(buf, SPRINT_BSIZE-1, time);
+	return buf;
+}
+
+
+
diff --git a/tc/tc_cbq.c b/tc/tc_cbq.c
index 4cd584a91a26..c811456b1627 100644
--- a/tc/tc_cbq.c
+++ b/tc/tc_cbq.c
@@ -20,6 +20,7 @@
 #include <arpa/inet.h>
 #include <string.h>
 
+#include "utils.h"
 #include "tc_core.h"
 #include "tc_cbq.h"
 
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 1bde4d51e5dc..8eb11223eb9d 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -21,6 +21,7 @@
 #include <arpa/inet.h>
 #include <string.h>
 
+#include "utils.h"
 #include "tc_core.h"
 #include <linux/atm.h>
 
diff --git a/tc/tc_core.h b/tc/tc_core.h
index 1dfa9a4f773b..bd4a99f0d8dd 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -5,8 +5,6 @@
 #include <asm/types.h>
 #include <linux/pkt_sched.h>
 
-#define TIME_UNITS_PER_SEC	1000000
-
 enum link_layer {
 	LINKLAYER_UNSPEC,
 	LINKLAYER_ETHERNET,
diff --git a/tc/tc_estimator.c b/tc/tc_estimator.c
index e4edfc7e98d9..f494b7caa44e 100644
--- a/tc/tc_estimator.c
+++ b/tc/tc_estimator.c
@@ -20,6 +20,7 @@
 #include <arpa/inet.h>
 #include <string.h>
 
+#include "utils.h"
 #include "tc_core.h"
 
 int tc_setup_estimator(unsigned int A, unsigned int time_const, struct tc_estimator *est)
diff --git a/tc/tc_util.c b/tc/tc_util.c
index d7578528a31b..cafbe49f3ec8 100644
--- a/tc/tc_util.c
+++ b/tc/tc_util.c
@@ -334,52 +334,6 @@ char *sprint_rate(__u64 rate, char *buf)
 	return buf;
 }
 
-int get_time(unsigned int *time, const char *str)
-{
-	double t;
-	char *p;
-
-	t = strtod(str, &p);
-	if (p == str)
-		return -1;
-
-	if (*p) {
-		if (strcasecmp(p, "s") == 0 || strcasecmp(p, "sec") == 0 ||
-		    strcasecmp(p, "secs") == 0)
-			t *= TIME_UNITS_PER_SEC;
-		else if (strcasecmp(p, "ms") == 0 || strcasecmp(p, "msec") == 0 ||
-			 strcasecmp(p, "msecs") == 0)
-			t *= TIME_UNITS_PER_SEC/1000;
-		else if (strcasecmp(p, "us") == 0 || strcasecmp(p, "usec") == 0 ||
-			 strcasecmp(p, "usecs") == 0)
-			t *= TIME_UNITS_PER_SEC/1000000;
-		else
-			return -1;
-	}
-
-	*time = t;
-	return 0;
-}
-
-
-void print_time(char *buf, int len, __u32 time)
-{
-	double tmp = time;
-
-	if (tmp >= TIME_UNITS_PER_SEC)
-		snprintf(buf, len, "%.1fs", tmp/TIME_UNITS_PER_SEC);
-	else if (tmp >= TIME_UNITS_PER_SEC/1000)
-		snprintf(buf, len, "%.1fms", tmp/(TIME_UNITS_PER_SEC/1000));
-	else
-		snprintf(buf, len, "%uus", time);
-}
-
-char *sprint_time(__u32 time, char *buf)
-{
-	print_time(buf, SPRINT_BSIZE-1, time);
-	return buf;
-}
-
 char *sprint_ticks(__u32 ticks, char *buf)
 {
 	return sprint_time(tc_core_tick2time(ticks), buf);
diff --git a/tc/tc_util.h b/tc/tc_util.h
index 6632c4f9c528..76fd986d6e4c 100644
--- a/tc/tc_util.h
+++ b/tc/tc_util.h
@@ -81,13 +81,11 @@ int get_rate64(__u64 *rate, const char *str);
 int get_percent_rate64(__u64 *rate, const char *str, const char *dev);
 int get_size(unsigned int *size, const char *str);
 int get_size_and_cell(unsigned int *size, int *cell_log, char *str);
-int get_time(unsigned int *time, const char *str);
 int get_linklayer(unsigned int *val, const char *arg);
 
 void print_rate(char *buf, int len, __u64 rate);
 void print_size(char *buf, int len, __u32 size);
 void print_qdisc_handle(char *buf, int len, __u32 h);
-void print_time(char *buf, int len, __u32 time);
 void print_linklayer(char *buf, int len, unsigned int linklayer);
 void print_devname(enum output_type type, int ifindex);
 
@@ -95,7 +93,6 @@ char *sprint_rate(__u64 rate, char *buf);
 char *sprint_size(__u32 size, char *buf);
 char *sprint_qdisc_handle(__u32 h, char *buf);
 char *sprint_tc_classid(__u32 h, char *buf);
-char *sprint_time(__u32 time, char *buf);
 char *sprint_ticks(__u32 ticks, char *buf);
 char *sprint_linklayer(unsigned int linklayer, char *buf);
 
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog

^ permalink raw reply related

* [PATCH v2 iproute2-next 0/3] support delivering packets in
From: Yousuk Seung @ 2018-08-27  2:42 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, David Ahern, Michael McLennan, Priyaranjan Jha,
	Yousuk Seung

This series adds support for the new "slot" netem parameter for
slotting. Slotting is an approximation of shared media that gather up
packets within a varying delay window before delivering them nearly at
once.

Dave Taht (2):
  tc: support conversions to or from 64 bit nanosecond-based time
  q_netem: support delivering packets in delayed time slots

Yousuk Seung (1):
  q_netem: slotting with non-uniform distribution

 include/utils.h     |  12 +++++
 lib/utils.c         | 104 +++++++++++++++++++++++++++++++++++++++
 man/man8/tc-netem.8 |  40 ++++++++++++++-
 tc/q_netem.c        | 115 +++++++++++++++++++++++++++++++++++++++++++-
 tc/tc_cbq.c         |   1 +
 tc/tc_core.c        |   1 +
 tc/tc_core.h        |   2 -
 tc/tc_estimator.c   |   1 +
 tc/tc_util.c        |  46 ------------------
 tc/tc_util.h        |   3 --
 10 files changed, 272 insertions(+), 53 deletions(-)

-- 
2.19.0.rc0.228.g281dcd1b4d0-goog

^ permalink raw reply

* [PATCH RFC net-next] net/fib: Poptrie based FIB lookup
From: Md. Islam @ 2018-08-27  2:28 UTC (permalink / raw)
  To: Netdev, David Miller, David Ahern, Eric Dumazet, Alexey Kuznetsov,
	Stephen Hemminger, makita.toshiaki, panda, yasuhiro.ohara,
	john.fastabend, alexei.starovoitov

This patch implements Poptrie [1] based FIB lookup. It exhibits pretty
impressive lookup performance compared to LC-trie. This poptrie
implementation, however, deviates somewhat from the original
implementation [2]. I tested this patch rigorously with several
FIB tables containing half a million routes and got the same results as
the LC-trie based fib_lookup().

Poptrie is intended to work in conjunction with LC-trie (not replace
it). It is primarily designed to overcome many issues of TCAM based
routers [1]. The paper [1] shows that Poptrie can achieve very impressive
lookup performance on a CPU. This patch will mainly be used by XDP
forwarding.
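
The core trick in Poptrie is that each node keeps a 64 bit vector with
one bit per possible 6 bit chunk value, and stores only the children
that exist in a densely packed array. The position of a child is the
number of set bits below its own bit. The sketch below illustrates that
indexing in user space. pt_node, chunk6 and child_slot are hypothetical
names, and __builtin_popcountll is assumed to be available (GCC or
Clang); the kernel code in the patch uses hweight64() for the same
purpose.

```c
#include <stdint.h>

/* Hypothetical node: one bit per 6 bit chunk value, children packed densely. */
struct pt_node {
	uint64_t nodevec;		/* bit i set: a child exists for chunk i */
	const struct pt_node *children;	/* popcount(nodevec) entries */
};

/* Extract the 6 bit chunk that starts `offset` bits from the MSB of key. */
static unsigned int chunk6(uint32_t key, unsigned int offset)
{
	if (offset <= 26)
		return (key >> (26 - offset)) & 63;
	/* Fewer than 6 bits remain: left-align them inside the chunk. */
	return (key << (offset - 26)) & 63;
}

/* Position of the chunk's child in the packed array, or -1 if absent. */
static int child_slot(const struct pt_node *node, unsigned int chunk)
{
	uint64_t bit = 1ULL << chunk;

	if (!(node->nodevec & bit))
		return -1;
	/* Count the children stored before this one. */
	return __builtin_popcountll(node->nodevec & (bit - 1));
}
```

For example, with children present for chunks 3, 10 and 40, the child for
chunk 10 is stored at position 1, since exactly one set bit lies below
it. A 32 bit IPv4 key is consumed in strides at offsets 0, 6, 12, 18, 24
and 30, with the last stride carrying only two key bits.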

1. Asai, Hirochika, and Yasuhiro Ohara. "Poptrie: A compressed trie
with population count for fast and scalable software IP routing table
lookup." ACM SIGCOMM Computer Communication Review. 2015.

2. https://github.com/pixos/poptrie

>From c5e05ea66b06eb9313749bc8969b4c2798fcf96a Mon Sep 17 00:00:00 2001
From: tamimcse <tamim@csebuet.org>
Date: Sun, 26 Aug 2018 21:12:38 -0400
Subject: [PATCH] Implemented Poptrie

Signed-off-by: tamimcse <tamim@csebuet.org>
---
 include/net/ip_fib.h   |  40 +++++++
 net/ipv4/Makefile      |   2 +-
 net/ipv4/fib_poptrie.c | 295 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/fib_trie.c    |   3 +
 4 files changed, 339 insertions(+), 1 deletion(-)
 create mode 100644 net/ipv4/fib_poptrie.c

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 81d0f21..c4374a1 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -197,6 +197,37 @@ struct fib_entry_notifier_info {
     u32 tb_id;
 };

+/*Maximum number of next-hops*/
+#define NEXT_HOP_MAX 255
+
+struct next_hops {
+    struct net_device    *netdev_arr[NEXT_HOP_MAX];
+    /*Total number of next-hops*/
+    u8 count;
+};
+
+struct poptrie_node {
+    u64 vector;
+    u64 leafvec;
+    u64 nodevec;
+    struct poptrie_node *chield_nodes;
+    u8 *leaves;
+    u8 *prefixes;
+};
+
+struct poptrie {
+    char    def_nh;
+    struct next_hops    nhs;
+    struct poptrie_node *root;
+    spinlock_t            lock;
+};
+
+void poptrie_insert(struct poptrie *pt, u32 key,
+        u8 prefix_len, struct net_device *dev);
+void poptrie_lookup(struct poptrie *pt, __be32 dest,
+        struct net_device **dev);
+
+
 struct fib_nh_notifier_info {
     struct fib_notifier_info info; /* must be first */
     struct fib_nh *fib_nh;
@@ -219,6 +250,7 @@ struct fib_table {
     int            tb_num_default;
     struct rcu_head        rcu;
     unsigned long         *tb_data;
+    struct poptrie    pt;
     unsigned long        __data[0];
 };

@@ -268,6 +300,14 @@ static inline int fib_lookup(struct net *net, const struct flowi4 *flp,
     rcu_read_lock();

     tb = fib_get_table(net, RT_TABLE_MAIN);
+
+    /*Testing poptrie_lookup*/
+    if (tb && tb->pt.root) {
+        struct net_device *dev;
+
+        poptrie_lookup(&tb->pt, flp->daddr, &dev);
+    }
+
     if (tb)
         err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);

diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index b379520..b1246d2 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -14,7 +14,7 @@ obj-y     := route.o inetpeer.o protocol.o \
          udp_offload.o arp.o icmp.o devinet.o af_inet.o igmp.o \
          fib_frontend.o fib_semantics.o fib_trie.o fib_notifier.o \
          inet_fragment.o ping.o ip_tunnel_core.o gre_offload.o \
-         metrics.o
+         metrics.o fib_poptrie.o

 obj-$(CONFIG_NET_IP_TUNNEL) += ip_tunnel.o
 obj-$(CONFIG_SYSCTL) += sysctl_net_ipv4.o
diff --git a/net/ipv4/fib_poptrie.c b/net/ipv4/fib_poptrie.c
new file mode 100644
index 0000000..b3a88ab
--- /dev/null
+++ b/net/ipv4/fib_poptrie.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation; either version
+ *   2 of the License, or (at your option) any later version.
+ *
+ * Author: MD Iftakharul Islam (Tamim) <mislam4@kent.edu>.
+ *
+ * Asai, Hirochika, and Yasuhiro Ohara. "Poptrie: A compressed trie
+ * with population count for fast and scalable software IP routing
+ * table lookup." ACM SIGCOMM Computer Communication Review. 2015.
+ *
+ */
+
+#include <net/ip_fib.h>
+
+/*Get next-hop index from next-hop*/
+static u8 get_fib_index(struct next_hops *nhs, struct net_device *dev)
+{
+    u8 i;
+
+    for (i = 0; i < nhs->count; i++) {
+        if (nhs->netdev_arr[i] == dev)
+            return i;
+    }
+    nhs->netdev_arr[nhs->count++] = dev;
+    return nhs->count - 1;
+}
+
+/*Converts next-hop index into actual next-hop*/
+static struct net_device *get_fib(struct next_hops *nhs, u8 fib_index)
+{
+    return nhs->netdev_arr[fib_index];
+}
+
+/*Extracts 6 bits from key starting from offset*/
+static inline u32 extract(u32 key, int offset)
+{
+    if (likely(offset < 26))
+        return (key >> (26 - offset)) & 63;
+    else
+        return (key << 4) & 63;
+}
+
+/*Set FIB index and prefix length to a leaf*/
+static void set_fib_index(struct poptrie_node *node,
+        unsigned long leaf_index, char fib_index, char prefix_len)
+{
+    node->leaves[leaf_index] = fib_index;
+    node->prefixes[leaf_index] = prefix_len;
+}
+
+/*Insert a leaf at index*/
+static bool insert_leaf(struct poptrie_node *node,
+        char index, char fib_index, char prefix_len)
+{
+    int i, j;
+    char *leaves;
+    char *prefixes;
+    int size = (int)hweight64(node->leafvec);
+
+    if (index > size) {
+        pr_err("Index needs to be smaller or equal to size");
+        return false;
+    }
+
+    leaves = kcalloc(size + 1, sizeof(*leaves), GFP_ATOMIC);
+    prefixes = kcalloc(size + 1, sizeof(*prefixes), GFP_ATOMIC);
+
+    for (i = 0, j = 0; i < (size + 1); i++) {
+        if (i == index) {
+            leaves[i] = fib_index;
+            prefixes[i] = prefix_len;
+        } else {
+            leaves[i] = node->leaves[j];
+            prefixes[i] = node->prefixes[j];
+            j++;
+        }
+    }
+
+    kfree(node->leaves);
+    kfree(node->prefixes);
+    node->leaves = leaves;
+    node->prefixes = prefixes;
+    return true;
+}
+
+/*Insert a new node at index*/
+static void insert_chield_node(struct poptrie_node *node,
+        char index)
+{
+    int i, j;
+    struct poptrie_node *arr;
+    int arr_size  = (int)hweight64(node->nodevec);
+
+    arr = kcalloc(arr_size + 1, sizeof(*arr), GFP_ATOMIC);
+    for (i = 0, j = 0; i < (arr_size + 1); i++) {
+        if (i != index && j < arr_size)
+            arr[i] = node->chield_nodes[j++];
+    }
+
+    kfree(node->chield_nodes);
+    node->chield_nodes = arr;
+}
+
+void poptrie_insert(struct poptrie *pt, u32 key,
+        u8 prefix_len, struct net_device *dev)
+{
+    int offset, i;
+    u32 index;
+    u8 consecutive_leafs;
+    u64 bitmap;
+    u64 bitmap_hp;
+    int arr_size;
+    unsigned long chield_index;
+    unsigned long leaf_index, prev_leaf_index;
+    unsigned long index_hp;
+    struct poptrie_node *node;
+    u8 prev_fib_index, prev_prefix_len;
+    u8 fib_index = get_fib_index(&pt->nhs, dev);
+
+    spin_lock(&pt->lock);
+
+    if (!pt->root)
+        pt->root = kzalloc(sizeof(*pt->root), GFP_ATOMIC);
+
+    /* Default route */
+    if (prefix_len == 0) {
+        pt->def_nh = fib_index;
+        goto finish;
+    }
+
+    /*Iterate through the nodes*/
+    offset = 0;
+    node = pt->root;
+    while (prefix_len > (offset + 6)) {
+        index = extract(key, offset);
+        bitmap = 1ULL << index;
+        chield_index = hweight64(node->nodevec & (bitmap - 1));
+
+        /*No node for this index, so need to insert a node*/
+        if (!(node->nodevec & bitmap)) {
+            insert_chield_node(node, chield_index);
+            node->nodevec |= bitmap;
+        }
+        node = &node->chield_nodes[chield_index];
+        offset += 6;
+    }
+
+    /*Now need to insert a leaf*/
+
+    index = extract(key, offset);
+    bitmap = 1ULL << index;
+    consecutive_leafs = 1 << (offset + 6 - prefix_len);
+
+    if (node->vector & bitmap && node->leafvec & bitmap) {
+        /*A leaf already exists for this index, so update the existing leaf*/
+        leaf_index = hweight64(node->leafvec & (bitmap - 1));
+        arr_size = (int)hweight64(node->leafvec);
+        if (leaf_index >= arr_size)
+            goto error;
+        /*Ignore the prefix*/
+        if (node->prefixes[leaf_index] > prefix_len) {
+            goto finish;
+        } else if (node->prefixes[leaf_index] == prefix_len) {
+            set_fib_index(node, leaf_index, fib_index, prefix_len);
+        } else {
+            /*hole punching*/
+            bitmap_hp = bitmap << consecutive_leafs;
+            if (!(node->leafvec & bitmap_hp)) {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                if (node->prefixes[index_hp] <= prefix_len) {
+                    insert_leaf(node, index_hp, fib_index, prefix_len);
+                    node->leafvec |= bitmap_hp;
+                }
+
+                for (i = leaf_index; i < index_hp ; i++) {
+                    if (node->prefixes[i] <= prefix_len)
+                        set_fib_index(node, i, fib_index, prefix_len);
+                }
+            } else {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                for (i = leaf_index; i <= index_hp ; i++) {
+                    if (node->prefixes[i] <= prefix_len)
+                        set_fib_index(node, i, fib_index, prefix_len);
+                }
+            }
+        }
+    } else if (!(node->vector & bitmap)) {
+        /*No leaf for this index, so need to insert a leaf*/
+        leaf_index = hweight64(node->leafvec & (bitmap - 1));
+        insert_leaf(node, leaf_index, fib_index, prefix_len);
+        node->leafvec |= bitmap;
+    } else if (node->vector & bitmap && !(node->leafvec & bitmap)) {
+        /*There is a leaf for this index created by another
+         *  prefix with smaller length
+         */
+        prev_leaf_index = hweight64(node->leafvec & (bitmap - 1)) - 1;
+        arr_size = (int)hweight64(node->leafvec);
+        if (prev_leaf_index >= arr_size)
+            goto error;
+        if (node->prefixes[prev_leaf_index] <= prefix_len) {
+            insert_leaf(node, prev_leaf_index + 1, fib_index, prefix_len);
+            node->leafvec |= bitmap;
+        }
+
+        /*hole punching*/
+        prev_fib_index = node->leaves[prev_leaf_index];
+        prev_prefix_len = node->prefixes[prev_leaf_index];
+
+        bitmap_hp = bitmap << consecutive_leafs;
+        if (!(node->leafvec & bitmap_hp)) {
+            index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+            if (node->prefixes[index_hp] <= prefix_len) {
+                if (prev_leaf_index < 0)
+                    goto error;
+                insert_leaf(node, index_hp + 1,
+                        prev_fib_index, prev_prefix_len);
+                node->leafvec |= bitmap_hp;
+            }
+        }
+
+        for (i = 2; i < consecutive_leafs; i++) {
+            bitmap_hp = bitmap << (i - 1);
+            if (node->leafvec & bitmap_hp) {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                insert_leaf(node, index_hp + 1,
+                        fib_index, prefix_len);
+                node->leafvec |= bitmap_hp;
+            }
+        }
+    }
+
+    if (consecutive_leafs > 1)
+        node->vector |= ((1ULL << consecutive_leafs) - 1) << index;
+    else
+        node->vector |= bitmap;
+
+    goto finish;
+
+error:
+    pr_err("Something is very wrong !!!!");
+finish:
+    spin_unlock(&pt->lock);
+}
+
+/*We assume that pt->root is not NULL*/
+void poptrie_lookup(struct poptrie *pt, __be32 dest, struct net_device **dev)
+{
+    register u32 index;
+    register u64 bitmap, bitmask;
+    register unsigned long leaf_index;
+    register unsigned long node_index;
+    register struct poptrie_node *node = pt->root;
+    register u8 fib_index = pt->def_nh;
+    register u8 carry = 0;
+    register u8 carry_bit = 2;
+
+    while (1) {
+        /*Extract 6 bits from dest */
+        if (likely(carry_bit != 8)) {
+            index = ((dest & 252) >> carry_bit) | carry;
+            carry = (dest & ((1 << carry_bit) - 1)) << (6 - carry_bit);
+            carry_bit = carry_bit + 2;
+            dest = dest >> 8;
+        } else {
+            index = carry;
+            carry = 0;
+            carry_bit = 2;
+        }
+
+        /*Create a bitmap based on the extracted value*/
+        bitmap = 1ULL << index;
+        bitmask = bitmap - 1;
+
+        /*Find corresponding leaf*/
+        if (likely(node->vector & bitmap)) {
+            leaf_index = hweight64(node->leafvec & bitmask);
+            if (!(node->leafvec & bitmap))
+                leaf_index--;
+            fib_index = node->leaves[leaf_index];
+        }
+
+        /*Find corresponding node*/
+        if (likely(node->nodevec & bitmap)) {
+            node_index = hweight64(node->nodevec & bitmask);
+            node = &node->chield_nodes[node_index];
+            continue;
+        }
+
+        *dev = get_fib(&pt->nhs, fib_index);
+        return;
+    }
+}
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 3dcffd3..0509a24 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1280,6 +1280,9 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
     if (err)
         goto out_fib_notif;

+    /*This should be done when Poptrie is enabled from CONFIG*/
+    poptrie_insert(&tb->pt, key, plen, fi->fib_dev);
+
     if (!plen)
         tb->tb_num_default++;

-- 
2.7.4


* [PATCH RFC net-next] net: Poptrie based FIB lookup
From: Md. Islam @ 2018-08-27  2:13 UTC (permalink / raw)
  To: Netdev, David Miller, David Ahern, Eric Dumazet, Alexey Kuznetsov,
	Stephen Hemminger, makita.toshiaki, panda, yasuhiro.ohara,
	Jesper Dangaard Brouer

This patch implements Poptrie [1] based FIB lookup. It exhibits pretty
impressive lookup performance compared to LC-trie. This Poptrie
implementation, however, deviates somewhat from the original
implementation [2]. I tested this patch rigorously with several
FIB tables containing half a million routes and got the same results as
the LC-trie based fib_lookup().

Poptrie is intended to work in conjunction with LC-trie (not replace
it). It is primarily designed to overcome many issues of TCAM-based
routers [1]. [1] shows that Poptrie can achieve very impressive
lookup performance on a CPU. This patch will mainly be used by XDP
forwarding.

1. Asai, Hirochika, and Yasuhiro Ohara. "Poptrie: A compressed trie
with population count for fast and scalable software IP routing table
lookup." ACM SIGCOMM Computer Communication Review. 2015.

2. https://github.com/pixos/poptrie
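
To make the popcount indexing idea from [1] concrete, here is a minimal
user-space sketch. It is not the code from this patch: the names
demo_node, demo_extract and demo_leaf_slot are hypothetical, the node is
reduced to a leaf vector only, and __builtin_popcountll (a GCC/Clang
builtin) stands in for the kernel's hweight64().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified node: one bit per 6-bit chunk value. */
struct demo_node {
	uint64_t leafvec;       /* bit i set => a leaf run starts at chunk i */
	const uint8_t *leaves;  /* packed next-hop indices, one per set bit */
};

/* Return the 6-bit chunk of key starting at bit offset (from the MSB). */
static uint32_t demo_extract(uint32_t key, int offset)
{
	assert(offset >= 0 && offset < 32 && offset % 6 == 0);
	if (offset < 26)
		return (key >> (26 - offset)) & 63;
	/* Only 2 bits remain at offset 30: pad them on the right. */
	return (key << 4) & 63;
}

/*
 * Map a chunk to its slot in the packed leaves array by counting the
 * set bits at or below the chunk. Returns -1 if no leaf run covers it.
 */
static int demo_leaf_slot(const struct demo_node *node, uint32_t chunk)
{
	uint64_t upto = node->leafvec & ((2ULL << chunk) - 1);

	return __builtin_popcountll(upto) - 1;
}
```

The point of the layout is that the leaves array only stores one entry
per run of identical next hops, and a single popcount turns a chunk
value into an array slot without any per-entry pointers.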

From c5e05ea66b06eb9313749bc8969b4c2798fcf96a Mon Sep 17 00:00:00 2001
From: tamimcse <tamim@csebuet.org>
Date: Sun, 26 Aug 2018 21:12:38 -0400
Subject: [PATCH] Implemented Poptrie

Signed-off-by: tamimcse <tamim@csebuet.org>
---
 include/net/ip_fib.h   |  40 +++++++
 net/ipv4/Makefile      |   2 +-
 net/ipv4/fib_poptrie.c | 295 +++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/fib_trie.c    |   3 +
 4 files changed, 339 insertions(+), 1 deletion(-)
 create mode 100644 net/ipv4/fib_poptrie.c

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 81d0f21..c4374a1 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -197,6 +197,37 @@ struct fib_entry_notifier_info {
     u32 tb_id;
 };

+/*Maximum number of next-hop*/
+#define NEXT_HOP_MAX 255
+
+struct next_hops {
+    struct net_device    *netdev_arr[NEXT_HOP_MAX];
+    /*Total number of next-hops*/
+    u8 count;
+};
+
+struct poptrie_node {
+    u64 vector;
+    u64 leafvec;
+    u64 nodevec;
+    struct poptrie_node *chield_nodes;
+    u8 *leaves;
+    u8 *prefixes;
+};
+
+struct poptrie {
+    char    def_nh;
+    struct next_hops    nhs;
+    struct poptrie_node *root;
+    spinlock_t            lock;
+};
+
+void poptrie_insert(struct poptrie *pt, u32 key,
+        u8 prefix_len, struct net_device *dev);
+void poptrie_lookup(struct poptrie *pt, __be32 dest,
+        struct net_device **dev);
+
+
 struct fib_nh_notifier_info {
     struct fib_notifier_info info; /* must be first */
     struct fib_nh *fib_nh;
@@ -219,6 +250,7 @@ struct fib_table {
     int            tb_num_default;
     struct rcu_head        rcu;
     unsigned long         *tb_data;
+    struct poptrie    pt;
     unsigned long        __data[0];
 };

@@ -268,6 +300,14 @@ static inline int fib_lookup(struct net *net, const struct flowi4 *flp,
     rcu_read_lock();

     tb = fib_get_table(net, RT_TABLE_MAIN);
+
+    /*Testing poptrie_lookup*/
+    if (tb && tb->pt.root) {
+        struct net_device *dev;
+
+        poptrie_lookup(&tb->pt, flp->daddr, &dev);
+    }
+
     if (tb)
         err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);

diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile
index b379520..b1246d2 100644
--- a/net/ipv4/Makefile
+++ b/net/ipv4/Makefile
@@ -14,7 +14,7 @@ obj-y     := route.o inetpeer.o protocol.o \
          udp_offload.o arp.o icmp.o devinet.o af_inet.o igmp.o \
          fib_frontend.o fib_semantics.o fib_trie.o fib_notifier.o \
          inet_fragment.o ping.o ip_tunnel_core.o gre_offload.o \
-         metrics.o
+         metrics.o fib_poptrie.o

 obj-$(CONFIG_NET_IP_TUNNEL) += ip_tunnel.o
 obj-$(CONFIG_SYSCTL) += sysctl_net_ipv4.o
diff --git a/net/ipv4/fib_poptrie.c b/net/ipv4/fib_poptrie.c
new file mode 100644
index 0000000..b3a88ab
--- /dev/null
+++ b/net/ipv4/fib_poptrie.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation; either version
+ *   2 of the License, or (at your option) any later version.
+ *
+ * Author: MD Iftakharul Islam (Tamim) <mislam4@kent.edu>.
+ *
+ * Asai, Hirochika, and Yasuhiro Ohara. "Poptrie: A compressed trie
+ * with population count for fast and scalable software IP routing
+ * table lookup." ACM SIGCOMM Computer Communication Review. 2015.
+ *
+ */
+
+#include <net/ip_fib.h>
+
+/*Get next-hop index from next-hop*/
+static u8 get_fib_index(struct next_hops *nhs, struct net_device *dev)
+{
+    u8 i;
+
+    for (i = 0; i < nhs->count; i++) {
+        if (nhs->netdev_arr[i] == dev)
+            return i;
+    }
+    nhs->netdev_arr[nhs->count++] = dev;
+    return nhs->count - 1;
+}
+
+/*Converts next-hop index into actual next-hop*/
+static struct net_device *get_fib(struct next_hops *nhs, u8 fib_index)
+{
+    return nhs->netdev_arr[fib_index];
+}
+
+/*Extracts 6 bits from key starting from offset*/
+static inline u32 extract(u32 key, int offset)
+{
+    if (likely(offset < 26))
+        return (key >> (26 - offset)) & 63;
+    else
+        return (key << 4) & 63;
+}
+
+/*Set FIB index and prefix length to a leaf*/
+static void set_fib_index(struct poptrie_node *node,
+        unsigned long leaf_index, char fib_index, char prefix_len)
+{
+    node->leaves[leaf_index] = fib_index;
+    node->prefixes[leaf_index] = prefix_len;
+}
+
+/*Insert a leaf at index*/
+static bool insert_leaf(struct poptrie_node *node,
+        char index, char fib_index, char prefix_len)
+{
+    int i, j;
+    char *leaves;
+    char *prefixes;
+    int size = (int)hweight64(node->leafvec);
+
+    if (index > size) {
+        pr_err("Index needs to be smaller or equal to size");
+        return false;
+    }
+
+    leaves = kcalloc(size + 1, sizeof(*leaves), GFP_ATOMIC);
+    prefixes = kcalloc(size + 1, sizeof(*prefixes), GFP_ATOMIC);
+
+    for (i = 0, j = 0; i < (size + 1); i++) {
+        if (i == index) {
+            leaves[i] = fib_index;
+            prefixes[i] = prefix_len;
+        } else {
+            leaves[i] = node->leaves[j];
+            prefixes[i] = node->prefixes[j];
+            j++;
+        }
+    }
+
+    kfree(node->leaves);
+    kfree(node->prefixes);
+    node->leaves = leaves;
+    node->prefixes = prefixes;
+    return true;
+}
+
+/*Insert a new node at index*/
+static void insert_chield_node(struct poptrie_node *node,
+        char index)
+{
+    int i, j;
+    struct poptrie_node *arr;
+    int arr_size  = (int)hweight64(node->nodevec);
+
+    arr = kcalloc(arr_size + 1, sizeof(*arr), GFP_ATOMIC);
+    for (i = 0, j = 0; i < (arr_size + 1); i++) {
+        if (i != index && j < arr_size)
+            arr[i] = node->chield_nodes[j++];
+    }
+
+    kfree(node->chield_nodes);
+    node->chield_nodes = arr;
+}
+
+void poptrie_insert(struct poptrie *pt, u32 key,
+        u8 prefix_len, struct net_device *dev)
+{
+    int offset, i;
+    u32 index;
+    u8 consecutive_leafs;
+    u64 bitmap;
+    u64 bitmap_hp;
+    int arr_size;
+    unsigned long chield_index;
+    unsigned long leaf_index, prev_leaf_index;
+    unsigned long index_hp;
+    struct poptrie_node *node;
+    u8 prev_fib_index, prev_prefix_len;
+    u8 fib_index = get_fib_index(&pt->nhs, dev);
+
+    spin_lock(&pt->lock);
+
+    if (!pt->root)
+        pt->root = kzalloc(sizeof(*pt->root), GFP_ATOMIC);
+
+    /* Default route */
+    if (prefix_len == 0) {
+        pt->def_nh = fib_index;
+        goto finish;
+    }
+
+    /*Iterate through the nodes*/
+    offset = 0;
+    node = pt->root;
+    while (prefix_len > (offset + 6)) {
+        index = extract(key, offset);
+        bitmap = 1ULL << index;
+        chield_index = hweight64(node->nodevec & (bitmap - 1));
+
+        /*No node for this index, so need to insert a node*/
+        if (!(node->nodevec & bitmap)) {
+            insert_chield_node(node, chield_index);
+            node->nodevec |= bitmap;
+        }
+        node = &node->chield_nodes[chield_index];
+        offset += 6;
+    }
+
+    /*Now need to insert a leaf*/
+
+    index = extract(key, offset);
+    bitmap = 1ULL << index;
+    consecutive_leafs = 1 << (offset + 6 - prefix_len);
+
+    if (node->vector & bitmap && node->leafvec & bitmap) {
+        /*A leaf already exist for this index, so update the existing leaf*/
+        leaf_index = hweight64(node->leafvec & (bitmap - 1));
+        arr_size = (int)hweight64(node->leafvec);
+        if (leaf_index >= arr_size)
+            goto error;
+        /*Ignore the prefix*/
+        if (node->prefixes[leaf_index] > prefix_len) {
+            goto finish;
+        } else if (node->prefixes[leaf_index] == prefix_len) {
+            set_fib_index(node, leaf_index, fib_index, prefix_len);
+        } else {
+            /*hole punching*/
+            bitmap_hp = bitmap << consecutive_leafs;
+            if (!(node->leafvec & bitmap_hp)) {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                if (node->prefixes[index_hp] <= prefix_len) {
+                    insert_leaf(node, index_hp, fib_index, prefix_len);
+                    node->leafvec |= bitmap_hp;
+                }
+
+                for (i = leaf_index; i < index_hp ; i++) {
+                    if (node->prefixes[i] <= prefix_len)
+                        set_fib_index(node, i, fib_index, prefix_len);
+                }
+            } else {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                for (i = leaf_index; i <= index_hp ; i++) {
+                    if (node->prefixes[i] <= prefix_len)
+                        set_fib_index(node, i, fib_index, prefix_len);
+                }
+            }
+        }
+    } else if (!(node->vector & bitmap)) {
+        /*No leaf for this index, so need to insert a leaf*/
+        leaf_index = hweight64(node->leafvec & (bitmap - 1));
+        insert_leaf(node, leaf_index, fib_index, prefix_len);
+        node->leafvec |= bitmap;
+    } else if (node->vector & bitmap && !(node->leafvec & bitmap)) {
+        /*There is a leaf for this index created by another
+         *  prefix with smaller length
+         */
+        prev_leaf_index = hweight64(node->leafvec & (bitmap - 1)) - 1;
+        arr_size = (int)hweight64(node->leafvec);
+        if (prev_leaf_index >= arr_size)
+            goto error;
+        if (node->prefixes[prev_leaf_index] <= prefix_len) {
+            insert_leaf(node, prev_leaf_index + 1, fib_index, prefix_len);
+            node->leafvec |= bitmap;
+        }
+
+        /*hole punching*/
+        prev_fib_index = node->leaves[prev_leaf_index];
+        prev_prefix_len = node->prefixes[prev_leaf_index];
+
+        bitmap_hp = bitmap << consecutive_leafs;
+        if (!(node->leafvec & bitmap_hp)) {
+            index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+            if (node->prefixes[index_hp] <= prefix_len) {
+                if (prev_leaf_index < 0)
+                    goto error;
+                insert_leaf(node, index_hp + 1,
+                        prev_fib_index, prev_prefix_len);
+                node->leafvec |= bitmap_hp;
+            }
+        }
+
+        for (i = 2; i < consecutive_leafs; i++) {
+            bitmap_hp = bitmap << (i - 1);
+            if (node->leafvec & bitmap_hp) {
+                index_hp = hweight64(node->leafvec & (bitmap_hp - 1)) - 1;
+                insert_leaf(node, index_hp + 1,
+                        fib_index, prefix_len);
+                node->leafvec |= bitmap_hp;
+            }
+        }
+    }
+
+    if (consecutive_leafs > 1)
+        node->vector |= ((1ULL << consecutive_leafs) - 1) << index;
+    else
+        node->vector |= bitmap;
+
+    goto finish;
+
+error:
+    pr_err("Something is very wrong !!!!");
+finish:
+    spin_unlock(&pt->lock);
+}
+
+/*We assume that pt->root is not NULL*/
+void poptrie_lookup(struct poptrie *pt, __be32 dest, struct net_device **dev)
+{
+    register u32 index;
+    register u64 bitmap, bitmask;
+    register unsigned long leaf_index;
+    register unsigned long node_index;
+    register struct poptrie_node *node = pt->root;
+    register u8 fib_index = pt->def_nh;
+    register u8 carry = 0;
+    register u8 carry_bit = 2;
+
+    while (1) {
+        /*Extract 6 bits from dest */
+        if (likely(carry_bit != 8)) {
+            index = ((dest & 252) >> carry_bit) | carry;
+            carry = (dest & ((1 << carry_bit) - 1)) << (6 - carry_bit);
+            carry_bit = carry_bit + 2;
+            dest = dest >> 8;
+        } else {
+            index = carry;
+            carry = 0;
+            carry_bit = 2;
+        }
+
+        /*Create a bitmap based on the extracted value*/
+        bitmap = 1ULL << index;
+        bitmask = bitmap - 1;
+
+        /*Find corresponding leaf*/
+        if (likely(node->vector & bitmap)) {
+            leaf_index = hweight64(node->leafvec & bitmask);
+            if (!(node->leafvec & bitmap))
+                leaf_index--;
+            fib_index = node->leaves[leaf_index];
+        }
+
+        /*Find corresponding node*/
+        if (likely(node->nodevec & bitmap)) {
+            node_index = hweight64(node->nodevec & bitmask);
+            node = &node->chield_nodes[node_index];
+            continue;
+        }
+
+        *dev = get_fib(&pt->nhs, fib_index);
+        return;
+    }
+}
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 3dcffd3..0509a24 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1280,6 +1280,9 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
     if (err)
         goto out_fib_notif;

+    /*This should be done when Poptrie is enabled from CONFIG*/
+    poptrie_insert(&tb->pt, key, plen, fi->fib_dev);
+
     if (!plen)
         tb->tb_num_default++;

-- 
2.7.4


* Re: KASAN: invalid-free in p9stat_free
From: Dominique Martinet @ 2018-08-27  5:24 UTC (permalink / raw)
  To: syzbot
  Cc: davem, ericvh, linux-kernel, lucho, netdev, syzkaller-bugs,
	v9fs-developer
In-Reply-To: <000000000000af648b057456e234@google.com>

syzbot wrote on Sun, Aug 26, 2018:
> HEAD commit:    e27bc174c9c6 Add linux-next specific files for 20180824
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=15dc19a6400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=28446088176757ea
> dashboard link: https://syzkaller.appspot.com/bug?extid=d4252148d198410b864f
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15f8efba400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1178256a400000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+d4252148d198410b864f@syzkaller.appspotmail.com
> 
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> random: sshd: uninitialized urandom read (32 bytes read)
> ==================================================================
> BUG: KASAN: double-free or invalid-free in p9stat_free+0x35/0x100
> net/9p/protocol.c:48

That looks straightforward enough: p9pdu_vreadf calls p9stat_free on
error, then v9fs_dir_readdir does the same. Nothing else could return
an error without going through the first free, so we could just remove
the latter one...

There are a couple of other users of the 'S' pdu read (which reads the stat
struct and frees it on error), so it's probably best to keep the current
behaviour as far as this is concerned. What we could do, though, is make
the free function idempotent (write NULLs in the freed fields), but I do
not see this being done often. Do you know what the policy is about
this kind of pattern nowadays?

The struct is cleanly zeroed before being read, so there is no risk of
double-frees between iterations and zeroing pointers is not strictly
required, but it does make things safer in general.
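
As a rough user-space sketch of what an idempotent free could look like
(demo_stat and demo_stat_free are hypothetical names, not the 9p code,
and free() stands in for kfree()):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stat-like struct that owns its strings. */
struct demo_stat {
	char *name;
	char *uid;
};

/* Free the owned fields and reset them, so a second call is a no-op. */
static void demo_stat_free(struct demo_stat *st)
{
	assert(st != NULL);
	free(st->name);   /* free(NULL) is defined to do nothing */
	free(st->uid);
	st->name = NULL;
	st->uid = NULL;
}
```

With the pointers reset, an error path that frees the struct twice only
passes NULL to free() the second time, which is harmless.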


-- 
Dominique Martinet


* Re: [PATCH] net: sched: Fix memory exposure from short TCA_U32_SEL
From: Al Viro @ 2018-08-27  4:04 UTC (permalink / raw)
  To: Julia Lawall
  Cc: Joe Perches, Kees Cook, LKML, Jamal Hadi Salim, Cong Wang,
	Jiri Pirko, David S. Miller, Network Development
In-Reply-To: <alpine.DEB.2.21.1808262319000.2295@hadrien>

On Sun, Aug 26, 2018 at 11:35:17PM -0400, Julia Lawall wrote:

> * x = \(kmalloc\|kzalloc\|devm_kmalloc\|devm_kzalloc\)(...)

I can name several you've missed right off the top of my head -
vmalloc, kvmalloc, kmem_cache_alloc, kmem_cache_zalloc, variants
with _trace slapped on, and that is not to mention the things like
get_free_page or

void *my_k3wl_alloc(u64 n) // 'cause all artificial limits suck, that's why
{
	lots and lots of home-grown stats collection
	some tracepoints thrown in just for fun
	return kmalloc(n);
}

(and no, I'm not implying that net/sched folks had done anything of that
sort; I have seen that and worse in drivers, though)

> The * at the beginning of the line means to highlight what you are looking
> for, which is done by making a diff in which the highlighted line
> appears to be removed.

Umm...  Does that cover return, BTW?  Or something like
	T *barf;
	extern void foo(T *p);
	foo(kmalloc(sizeof(*barf)));


> The limitation is the ability to figure out the type of x.  If it is a
> local variable, Coccinelle should have no problem.  If it is a structure
> field, it may be necessary to provide command line arguments like
> 
> --all-includes --include-headers-for-types
> 
> --all-includes means to try to find all include files that are mentioned
> in the .c file.  The next stronger option is --recursive-includes, which
> means include what all of the mentioned files include as well,
> recursively.  This tends to cause a major performance hit, because a lot
> of code is being parsed.  --include-headers-for-types helps a bit with
> that, as it only considers the header files when computing type
> information, and not when applying the rules.
> 
> With respect to ifdefs around variable declarations and structure field
> declarations, in these cases Coccinelle considers that it cannot make the
> ifdef have an if-like control flow, and so it considers the #ifdef, #else
> and #endif to be comments.  Thus it takes into account only the last type
> provided for a given variable.

[snip]

What about several variants of structure definition?  Because ifdefs around
includes do occur in the wild...
