All of lore.kernel.org
 help / color / mirror / Atom feed
* Comments about IPT_ALIGN
@ 2003-01-26  3:57 Thomas Heinz
  2003-01-26 11:01 ` Laszlo Valko
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26  3:57 UTC (permalink / raw)
  To: netfilter-devel

Hi

We all know the IPT_ALIGN macro defined in
include/linux/netfilter_ipv4/ip_tables.h:

#define IPT_ALIGN(s) (((s) + (__alignof__(struct ipt_entry)-1)) & 
~(__alignof__(struct ipt_entry)-1))

shorter:- on 32 bit architectures: output the smallest k = 4*k' >= s
           IPT_ALIGN(s) = (s + 3) & 0xffffff00
         - on 64 bit architectures: output the smallest k = 8*k' >= s
           IPT_ALIGN(s) = (s + 7) & 0xfffffffffffff000

The macro is used to align the match and target data that is transferred
from user to kernel space. Obviously we would not need such a macro
if there were no differences between the user and the kernel space
architecture which actually is the case for sparc64 and parisc64
(both 32 bit user space (possible) but 64 bit kernel space).

Now, it is interesting to have a closer look at the situations when
alignment _really_ becomes an issue, i.e. the data is aligned
differently in 32 bit user space and 64 bit kernel space.

I did a quick test on 3 machines: x86, sparc64, alpha. Each test ran in
user space only (this is not a problem if the behaviour of alpha in
user space is the same as sparc64 in kernel space as far as alignment
is concerned ... I hope this is the case :-).

Here is the ugly code:
#include <linux/types.h>
#include <stddef.h>
#include <stdio.h>

#define S(s)    (unsigned) sizeof(struct s)
#define A(s)    (unsigned) __alignof__(struct s)
#define O(s, m) (unsigned) offsetof(struct s, m)

int
main(void)
{
	struct t8
	{
		__u8 a;
		__u8 b[14];
		__u8 c;
	};
	struct t16
	{
		__u8 a;
		__u8 b[12];
		__u16 c;
		__u8 d[3];
		__u16 e[3];
	};
	struct t32
	{
		__u8 a;
		__u16 b[2];
		__u8 c;
		__u16 d[3];
		__u32 e[3];
		__u8 f;
	};
	struct t64
	{
		__u8 a;
		__u8 b[8];
		__u16 c;
		__u64 d[2];
		__u32 e;
		__u64 f;
	};
	struct tptr
	{
		__u8 a;
		void *p;
	};
	struct temb1
	{
		struct t8 a;
		struct t16 b;
	};
	struct temb2
	{
		struct t8 a;
		struct t64 b;
	};

	printf("sizeof(void *) = %u\n", (unsigned) sizeof(void *));
	printf("t8:    size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u, %2u\n", S(t8), A(t8), O(t8, a), O(t8, b), O(t8, c));
	printf("t16:   size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u, %2u, %2u, %2u\n", S(t16), A(t16), O(t16, a),
	       O(t16, b), O(t16, c), O(t16, d), O(t16, e));
	printf("t32:   size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u, %2u, %2u, %2u, %2u\n", S(t32), A(t32), O(t32, a),
	       O(t32, b), O(t32, c), O(t32, d), O(t32, e), O(t32, f));
	printf("t64:   size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u, %2u, %2u, %2u, %2u\n", S(t64), A(t64),
	       O(t64, a), O(t64, b), O(t64, c), O(t64, d), O(t64, e),
	       O(t64, f));
	printf("tptr:  size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u\n", S(tptr), A(tptr), O(tptr, a), O(tptr, p));
	printf("temb1: size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u\n", S(temb1), A(temb1), O(temb1, a), O(temb1, b));
	printf("temb2: size: %2u   struct align: %2u   member offset: "
	       "%2u, %2u\n", S(temb2), A(temb2), O(temb2, a), O(temb2, b));
	
	return 0;
}


Here is the output:
x86:
----
sizeof(void *) = 4
t8:    size: 16   struct align:  1   member offset:  0,  1, 15
t16:   size: 26   struct align:  2   member offset:  0,  1, 14, 16, 20
t32:   size: 32   struct align:  4   member offset:  0,  2,  6,  8, 16, 28
t64:   size: 40   struct align:  4   member offset:  0,  1, 10, 12, 28, 32
tptr:  size:  8   struct align:  4   member offset:  0,  4
temb1: size: 42   struct align:  2   member offset:  0, 16
temb2: size: 56   struct align:  4   member offset:  0, 16

sparc64:
--------
sizeof(void *) = 4
t8:    size: 16   struct align:  1   member offset:  0,  1, 15
t16:   size: 26   struct align:  2   member offset:  0,  1, 14, 16, 20
t32:   size: 32   struct align:  4   member offset:  0,  2,  6,  8, 16, 28
t64:   size: 48   struct align:  8   member offset:  0,  1, 10, 16, 32, 40
tptr:  size:  8   struct align:  4   member offset:  0,  4
temb1: size: 42   struct align:  2   member offset:  0, 16
temb2: size: 64   struct align:  8   member offset:  0, 16

alpha:
------
sizeof(void *) = 8
t8:    size: 16   struct align:  1   member offset:  0,  1, 15
t16:   size: 26   struct align:  2   member offset:  0,  1, 14, 16, 20
t32:   size: 32   struct align:  4   member offset:  0,  2,  6,  8, 16, 28
t64:   size: 48   struct align:  8   member offset:  0,  1, 10, 16, 32, 40
tptr:  size: 16   struct align:  8   member offset:  0,  8
temb1: size: 42   struct align:  2   member offset:  0, 16
temb2: size: 64   struct align:  8   member offset:  0, 16


Conclusions:

Except for struct tptr both sparc64 and alpha have the same struct
alignment and member offsets for all structs.
struct tptr is different because sizeof(void *) is different on alpha
and sparc64 (in user space).

Now, imagine that only those structs were allowed for which the
following properties hold:
             - basic datatype must be of fixed width (e.g. __u8, __u16,
               __u32, __u64)
             - members of the struct can be of:
                  * basic type
                  * array of basic type
                  * array of struct (for which the same properties hold)
                  * struct (for which the same properties hold)

In this case the IPT_ALIGN macro is _not_ needed!

So I come to the conclusion that if no pointers within target or match
structs would be needed IPT_ALIGN can be dropped without any
problems (if the assumption about the basic datatype holds).

What do you think about this? Am I wrong/right? Do I miss anything?
Is the stuff stated above trivially obvious ;-)?


Regards,

Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26  3:57 Comments about IPT_ALIGN Thomas Heinz
@ 2003-01-26 11:01 ` Laszlo Valko
  2003-01-26 11:28   ` Anders Fugmann
  2003-01-26 17:06   ` Thomas Heinz
  0 siblings, 2 replies; 19+ messages in thread
From: Laszlo Valko @ 2003-01-26 11:01 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: netfilter-devel

On Sun, Jan 26, 2003 at 04:57:37AM +0100, Thomas Heinz wrote:
> Conclusions:
> 
> Except for struct tptr both sparc64 and alpha have the same struct
> alignment and member offsets for all structs.

This is not too surprising. Both of them are RISC processors with
strict alignment restrictions. I guess even with a 32bit sparc
gcc, a 64bit unsigned integer gets aligned to 64bit just to make life
easier (I'm not sure if there's a 64bit load instruction on 32bit sparc's).

> struct tptr is different because sizeof(void *) is different on alpha
> and sparc64 (in user space).
> 
> Now, imagine that only those structs were allowed for which the
> following properties hold:
>              - basic datatype must be of fixed width (e.g. __u8, __u16,
>                __u32, __u64)
>              - members of the struct can be of:
>                   * basic type
>                   * array of basic type
>                   * array of struct (for which the same properties hold)
>                   * struct (for which the same properties hold)
> 
> In this case the IPT_ALIGN macro is _not_ needed!
>
> So I come to the conclusion that if no pointers within target or match
> structs would be needed IPT_ALIGN can be dropped without any
> problems (if the assumption about the basic datatype holds).
> 
> What do you think about this? Am I wrong/right? Do I miss anything?
> Is the stuff stated above trivially obvious ;-)?

You missed one point (currently (ab)used by netfilter):

* structures which are "put" next to each other (eg: ipt_replace and
  ipt_entry) must be examined carefully for alignment problems.

Otherwise you are right.

But this "put" next to each other thing can be rather nasty.

You can be sure a structure is padded at the end by the compiler to
whatever alignment is necessary for that structure. So if you have
a 64bit integer in it, it will be padded at the end to 64bit. To
ensure that arrays of that structure work.

Now the problem comes if you have only 32bit integers in the first
structure and 64bit integers in the second one: you have to either
use IPT_ALIGN again (which we would like to get rid of), or we
have to make sure the first structure gets aligned to 64bit.
We could introduce an extra variablke just for this at the end
like:

	u_int64_t dummy[0];
}

Or we could do something like that in netfilter:

	struct ipt_entry entries[0];
}

This latter version is better if we know what kind structure will follow,
because it will not increase the size of the structure unnecessarily on
platforms without alignment restrictions.

Regards,
Laszlo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 11:01 ` Laszlo Valko
@ 2003-01-26 11:28   ` Anders Fugmann
  2003-01-26 13:58     ` Thomas Heinz
  2003-01-26 14:22     ` Laszlo Valko
  2003-01-26 17:06   ` Thomas Heinz
  1 sibling, 2 replies; 19+ messages in thread
From: Anders Fugmann @ 2003-01-26 11:28 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Thomas Heinz, netfilter-devel

Laszlo Valko wrote:
> On Sun, Jan 26, 2003 at 04:57:37AM +0100, Thomas Heinz wrote:
> 
>>Conclusions:
>>
>>Except for struct tptr both sparc64 and alpha have the same struct
>>alignment and member offsets for all structs.
> 
> 
> This is not too surprising. Both of them are RISC processors with
> strict alignment restrictions. I guess even with a 32bit sparc
> gcc, a 64bit unsigned integer gets aligned to 64bit just to make life
> easier (I'm not sure if there's a 64bit load instruction on 32bit sparc's).

Does this not pose an interresting problem?
What happens if userspace is 32bit, and kernel space is 64bit, when the 
elements of the structure changes size doe to conversion?
Eg:
struct x {
   int a;
   long int b;
   int c;
}

In 32bit userspace, x->a would have offset 0, x->b would have offset 4 
and x->c would have offset 8. But in 64bit kernel space, the long in 
would be 64 bit, and so the offset changes: offset x->a = 0, offset x->b 
  = 4, offset x->c = 12. The structure would then need a special 
conversion from 32 bit userspace to 64 bit kernel space in order for 
values in the struct to be accessed correctly. If this is true, then we 
must only use type with welldifined sizes, that is sizes that does not 
changes from userspace to kernel space.

Please correct me if I'm wrong.

Regards
Anders Fugmann

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 11:28   ` Anders Fugmann
@ 2003-01-26 13:58     ` Thomas Heinz
  2003-01-26 14:26       ` Laszlo Valko
  2003-01-31 11:51       ` Harald Welte
  2003-01-26 14:22     ` Laszlo Valko
  1 sibling, 2 replies; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 13:58 UTC (permalink / raw)
  To: Anders Fugmann; +Cc: Laszlo Valko, netfilter-devel

Hi Anders

You wrote:
> Does this not pose an interresting problem?
> What happens if userspace is 32bit, and kernel space is 64bit, when the 
> elements of the structure changes size doe to conversion?
> Eg:
> struct x {
>   int a;
>   long int b;
>   int c;
> }
> 
> In 32bit userspace, x->a would have offset 0, x->b would have offset 4 
> and x->c would have offset 8. But in 64bit kernel space, the long in 
> would be 64 bit, and so the offset changes: offset x->a = 0, offset x->b 
>  = 4, offset x->c = 12. 

As member b of struct x must be 8 byte aligned we have
offsetof(struct x, b) = 8 (which implies offsetof(struct x, c) = 16).

In general the alignment requirement for a type is inherited by the
struct members (recursively). Of course there may be architectures
where there are no alignment requirements at all, i.e.
__alignof__(x) == 1.

> The structure would then need a special 
> conversion from 32 bit userspace to 64 bit kernel space in order for 
> values in the struct to be accessed correctly. If this is true, then we 
> must only use type with welldifined sizes, that is sizes that does not 
> changes from userspace to kernel space.

Actually this is not true for some existing targets and matches, e.g.:

     - ULOG: struct ipt_ulog_info {
                     unsigned int nl_group;
                     size_t copy_range;
                     size_t qthreshold;
                     char prefix[ULOG_PREFIX_LEN];
             };
     - MARK: struct ipt_mark_info {
                     unsigned long mark, mask;
                     u_int8_t invert;
             };
     - limit: struct ipt_rateinfo {
                      u_int32_t avg;    /* Average secs between packets 
* scale */
                      u_int32_t burst;  /* Period multiplier for upper 
limit. */
                     /* Used internally by the kernel */
                     unsigned long prev;
                     u_int32_t credit;
                     u_int32_t credit_cap, cost;

                     /* Ugly, ugly fucker. */
                     struct ipt_rateinfo *master;
              };

       The unsigned long prev in the middle does not hurt because it
       is automatically 8 byte aligned by avg and burst. Because
       the pointer (master) is at the end of the struct it does not
       cause any problems except that the whole struct must be
       aligned correctly.

Since I assume that ULOG and MARK work perfectly on sparc64 I must
be missing something that makes the internal memory layout of the
structs (i.e. the member offsets) equal in user and kernel space
(or my assumption is wrong ;-).

It would be kind if someone could fix my brainbug :)

BTW, pointers and arrays of pointers within target or match
structs can be forced to have the 8 byte alignment requirement
(on sparc64 and parisc64) by simply putting them into a
union: union {void *p; __u64 pad;}


Regards,

Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 11:28   ` Anders Fugmann
  2003-01-26 13:58     ` Thomas Heinz
@ 2003-01-26 14:22     ` Laszlo Valko
  2003-01-26 18:05       ` Thomas Heinz
  2003-01-31 11:55       ` Harald Welte
  1 sibling, 2 replies; 19+ messages in thread
From: Laszlo Valko @ 2003-01-26 14:22 UTC (permalink / raw)
  To: Anders Fugmann; +Cc: Thomas Heinz, netfilter-devel

On Sun, Jan 26, 2003 at 12:28:37PM +0100, Anders Fugmann wrote:
> Does this not pose an interresting problem?
> What happens if userspace is 32bit, and kernel space is 64bit, when the 
> elements of the structure changes size doe to conversion?
> Eg:
> struct x {
>    int a;
>    long int b;
>    int c;
> }
> 
> In 32bit userspace, x->a would have offset 0, x->b would have offset 4 
> and x->c would have offset 8. But in 64bit kernel space, the long in 
> would be 64 bit, and so the offset changes: offset x->a = 0, offset x->b 
>   = 4, offset x->c = 12. The structure would then need a special 
> conversion from 32 bit userspace to 64 bit kernel space in order for 
> values in the struct to be accessed correctly. If this is true, then we 

Yes, that structure needs a special conversion for 32/64 platforms.
That's the reason netfilter from plain 2.4.20 kernel does not work
correctly on sparc64 (with 32-bit userspace). Fortunately, there are
not too many such misdesigns in there :)

Even a structure like that is not better:
struct {
    long c;
};
Even if you IPT_ALIGN the whole thing, and memset with zeros, on bigendian
machines "c" (as seen from kernelside) will have its MSB 32-bit filled with
"c" from the userspace. That's the current situation with ipt_MARK for example.

> must only use type with welldifined sizes, that is sizes that does not 
> changes from userspace to kernel space.
> 
> Please correct me if I'm wrong.
> 
> Regards
> Anders Fugmann

You are right. There's only one question left: what and how to change to
a) make everything work on all platforms,
b) introduce as little incompatibilities as possible,
c) avoid kernel code duplication just to maintain backward compatibility,
d) avoid doing #if defined(__sparcv9__) || defined(__hppa64__) (or whatever
   it's called) in kernel headers (I guess it would be difficult to get that
   into mainline as far as I know DaveM, and not without grounds),
e) avoid defining local copies of hack kernel headers and doing something like
   #if sizeof(unsigned long) == 4 && defined (KERNEL_LOOKS_LIKE_SPARC64)
   in userspace (a maintenance nightmare to have several copies of the
   same structure)
all at the same time.

Currently, I see two ways:
1) make a translation function for all currently existing netfilter structures
   that use longs or pointers (difficult, as modules can be added independently),
   like those in arch/sparc64/sys_sparc32.c, and its friends,
2) introduce a new way of passing structures, with new structures, probably
   with new setsockopt/getsockopt numbers, essentially creating an API
   incompatible with the existing one (there are a few other bleeding wounds
   waiting to be solved in the existing interface as I saw).

Maybe combining those two to provide a compatibility layer with 1) and
introduce a new layer with 2). And making the old interface a config
option (CONFIG_NETFILTER_OLD_INTERFACE).

Now I'm interested what other people think about the possibilities...

Regards,
Laszlo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 13:58     ` Thomas Heinz
@ 2003-01-26 14:26       ` Laszlo Valko
  2003-01-26 16:50         ` Thomas Heinz
  2003-01-31 11:51       ` Harald Welte
  1 sibling, 1 reply; 19+ messages in thread
From: Laszlo Valko @ 2003-01-26 14:26 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Anders Fugmann, netfilter-devel

On Sun, Jan 26, 2003 at 02:58:02PM +0100, Thomas Heinz wrote:
>      - MARK: struct ipt_mark_info {
>                      unsigned long mark, mask;
>                      u_int8_t invert;
>              };
>      - limit: struct ipt_rateinfo {
>                       u_int32_t avg;    /* Average secs between packets 
> * scale */
>                       u_int32_t burst;  /* Period multiplier for upper 
> limit. */
>                      /* Used internally by the kernel */
>                      unsigned long prev;
>                      u_int32_t credit;
>                      u_int32_t credit_cap, cost;
> 
>                      /* Ugly, ugly fucker. */
>                      struct ipt_rateinfo *master;
>               };
> 
>        The unsigned long prev in the middle does not hurt because it
>        is automatically 8 byte aligned by avg and burst. Because
>        the pointer (master) is at the end of the struct it does not
>        cause any problems except that the whole struct must be
>        aligned correctly.

The structure is aligned correctly, however the offsets of the fields,
and sizeof() is different, so it does not work.

> Since I assume that ULOG and MARK work perfectly on sparc64 I must
> be missing something that makes the internal memory layout of the
> structs (i.e. the member offsets) equal in user and kernel space
> (or my assumption is wrong ;-).

Well, I wrote a hack that makes the layout the same, but that's
rather ugly (I call it workaround :).

> It would be kind if someone could fix my brainbug :)
> 
> BTW, pointers and arrays of pointers within target or match
> structs can be forced to have the 8 byte alignment requirement
> (on sparc64 and parisc64) by simply putting them into a
> union: union {void *p; __u64 pad;}

Yes, that's true, except for the fact that you will have to deal
with endianness as well, if you are interested in actually exchanging
data... And don't forget the size change that brings in on every 32-bit
platform effectively killing compatibility...

Regards,
Laszlo


> 
> Regards,
> 
> Thomas
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 14:26       ` Laszlo Valko
@ 2003-01-26 16:50         ` Thomas Heinz
  2003-01-26 19:31           ` Laszlo Valko
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 16:50 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, netfilter-devel

Hi Laszlo

You wrote:
>>     - limit: struct ipt_rateinfo {
>>                      u_int32_t avg;    /* Average secs between packets 
>>* scale */
>>                      u_int32_t burst;  /* Period multiplier for upper 
>>limit. */
>>                     /* Used internally by the kernel */
>>                     unsigned long prev;
>>                     u_int32_t credit;
>>                     u_int32_t credit_cap, cost;
>>
>>                     /* Ugly, ugly fucker. */
>>                     struct ipt_rateinfo *master;
>>              };
>>
>>       The unsigned long prev in the middle does not hurt because it
>>       is automatically 8 byte aligned by avg and burst. Because
>>       the pointer (master) is at the end of the struct it does not
>>       cause any problems except that the whole struct must be
>>       aligned correctly.
> 
> The structure is aligned correctly, however the offsets of the fields,
> and sizeof() is different, so it does not work.

Ah yes of course, as the size of unsigned long differs in user and
kernel space every member after prev has a different offset.
So the limit match is a valid example :)

> Well, I wrote a hack that makes the layout the same, but that's
> rather ugly (I call it workaround :).

I just flew over it. Sorry, I must have overlooked it before.

>>BTW, pointers and arrays of pointers within target or match
>>structs can be forced to have the 8 byte alignment requirement
>>(on sparc64 and parisc64) by simply putting them into a
>>union: union {void *p; __u64 pad;}
> 
> Yes, that's true, except for the fact that you will have to deal
> with endianness as well, if you are interested in actually exchanging
> data... 

The only match/target I'm aware of that has a pointer in its info
struct is the limit match. Generally this is necessary for each
match/target that does not treat the info data as read only on a
smp system (at least with the current infrastructure as each cpu has
its own match/target info).

Anyway, I don't see a reason why one wants to exchange a valid pointer
between user and kernel space.

> And don't forget the size change that brings in on every 32-bit
> platform effectively killing compatibility...

One could use #ifdef's for compatibility purpose but as I read you
don't like these :-)


Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 11:01 ` Laszlo Valko
  2003-01-26 11:28   ` Anders Fugmann
@ 2003-01-26 17:06   ` Thomas Heinz
  1 sibling, 0 replies; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 17:06 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: netfilter-devel

Hi Laszlo

You wrote:
> This is not too surprising. Both of them are RISC processors with
> strict alignment restrictions. I guess even with a 32bit sparc
> gcc, a 64bit unsigned integer gets aligned to 64bit just to make life
> easier

Yes, in fact this is the case. Just take a look at the output of
struct t64 on sparc64 in my first posting.

> Now the problem comes if you have only 32bit integers in the first
> structure and 64bit integers in the second one: you have to either
> use IPT_ALIGN again (which we would like to get rid of), or we
> have to make sure the first structure gets aligned to 64bit.
> We could introduce an extra variablke just for this at the end
> like:
> 
> 	u_int64_t dummy[0];
> }
> 
> Or we could do something like that in netfilter:
> 
> 	struct ipt_entry entries[0];
> }
> 
> This latter version is better if we know what kind structure will follow,
> because it will not increase the size of the structure unnecessarily on
> platforms without alignment restrictions.

Yes, very good point. Another way would be to introduce a new type
for this purpose which is u_int32_t on 32 bit architectures and
u_int64_t on 64 bit architectures or just use long instead if
sizeof(long) always works the expected way.

Anyway this is just a question of personal taste.


Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 14:22     ` Laszlo Valko
@ 2003-01-26 18:05       ` Thomas Heinz
  2003-01-26 19:43         ` Laszlo Valko
  2003-01-26 19:48         ` Thomas Heinz
  2003-01-31 11:55       ` Harald Welte
  1 sibling, 2 replies; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 18:05 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, netfilter-devel

Laszlo Valko wrote:
> You are right. There's only one question left: what and how to change to
> a) make everything work on all platforms,
> b) introduce as little incompatibilities as possible,
> c) avoid kernel code duplication just to maintain backward compatibility,
> d) avoid doing #if defined(__sparcv9__) || defined(__hppa64__) (or whatever
>    it's called) in kernel headers (I guess it would be difficult to get that
>    into mainline as far as I know DaveM, and not without grounds),
> e) avoid defining local copies of hack kernel headers and doing something like
>    #if sizeof(unsigned long) == 4 && defined (KERNEL_LOOKS_LIKE_SPARC64)
>    in userspace (a maintenance nightmare to have several copies of the
>    same structure)
> all at the same time.

Apart from breaking some existing targets and matches what about the
following requirements for match/target info structs:
        - the struct properties stated in my first posting must hold
          (of course unions can also be used :)
        - the end of the struct must be marked by:
              struct ipt_entry next[0];
          or
              long next[0];   // I'm not sure about this!
          or
              padd_t next[0]; // typedef __u32/__u64 padd_t dep. on arch
        - pointers or arrays of pointers should be avoided but if they're
          really needed (as in limit) a special pointer type should be
          used: 32 bit arch: union nf_ptr_t {void *p;
                                             u_int32_t padd;};
                64 bit arch: union nf_ptr_t {void *p;
                                             u_int64_t padd;};
          Maybe there are better ways to achieve the same
          (__attribute__ ((aligned (__alignof__(long)))) might work).
          Btw if only one pointer is needed it could be put at the end
          before the next member and it should work fine.

Of course this would break some matches/targets but apart from that fact
it should be ok, isn't it? I mean with this solution IPT_ALIGN is no
longer necessary and no #ifdefs need to be used except for padd_t (which
is just an option as there is at least one other way to make it work)
and maybe for nf_ptr_t.

> 2) introduce a new way of passing structures, with new structures, probably
>    with new setsockopt/getsockopt numbers, essentially creating an API
>    incompatible with the existing one (there are a few other bleeding wounds
>    waiting to be solved in the existing interface as I saw).

What do you mean by "other bleeding wounds"? Could you explain a bit
further?


Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 16:50         ` Thomas Heinz
@ 2003-01-26 19:31           ` Laszlo Valko
  0 siblings, 0 replies; 19+ messages in thread
From: Laszlo Valko @ 2003-01-26 19:31 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Anders Fugmann, netfilter-devel

Hi Thomas!

On Sun, Jan 26, 2003 at 05:50:21PM +0100, Thomas Heinz wrote:
> > Yes, that's true, except for the fact that you will have to deal
> > with endianness as well, if you are interested in actually exchanging
> > data... 
> 
> The only match/target I'm aware of that has a pointer in its info
> struct is the limit match. Generally this is necessary for each

There's one in struct ipt_replace (indeed that's not a match/target),
for the counters.

> match/target that does not treat the info data as read only on a
> smp system (at least with the current infrastructure as each cpu has
> its own match/target info).
> 
> Anyway, I don't see a reason why one wants to exchange a valid pointer
> between user and kernel space.

I have seen at least two guys here who asked how to exchange arbitrary
amount of data. And a pointer is an easy way out...

> > And don't forget the size change that brings in on every 32-bit
> > platform effectively killing compatibility...
> 
> One could use #ifdef's for compatibility purpose but as I read you
> don't like these :-)

I could put up with that one :) But I doubt that we could get that "solution"
accepted into mainline releases...

Regards,
Laszlo

> 
> 
> Thomas
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 18:05       ` Thomas Heinz
@ 2003-01-26 19:43         ` Laszlo Valko
  2003-01-26 23:09           ` Thomas Heinz
  2003-01-26 19:48         ` Thomas Heinz
  1 sibling, 1 reply; 19+ messages in thread
From: Laszlo Valko @ 2003-01-26 19:43 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Anders Fugmann, netfilter-devel

Hi Thomas!

On Sun, Jan 26, 2003 at 07:05:09PM +0100, Thomas Heinz wrote:
> Apart from breaking some existing targets and matches what about the
> following requirements for match/target info structs:
>         - the struct properties stated in my first posting must hold
>           (of course unions can also be used :)
>         - the end of the struct must be marked by:
>               struct ipt_entry next[0];
>           or
>               long next[0];   // I'm not sure about this!
>           or
>               padd_t next[0]; // typedef __u32/__u64 padd_t dep. on arch

I don't have anything con/pro either of these, but I think for
an entry structure (which is followed by other entry structures) it
is wise to use struct ipt_entry next[0].

I'll check if there's already a generic "strictest alignment on this platform"
define in some Linux arch header.

>         - pointers or arrays of pointers should be avoided but if they're
>           really needed (as in limit) a special pointer type should be
>           used: 32 bit arch: union nf_ptr_t {void *p;
>                                              u_int32_t padd;};
>                 64 bit arch: union nf_ptr_t {void *p;
>                                              u_int64_t padd;};

Pointers to be passed between kernel/user space will not work easily
anyway, so this can only be used for kernel private pointers.
For that purpose, this is OK until the netfilter framework provides
facilities to eliminate pointers from kernel-user interface structures.

>           Maybe there are better ways to achieve the same
>           (__attribute__ ((aligned (__alignof__(long)))) might work).

The problem is that you _cannot_ determine from an x-bit userspace compiler
what alignment restrictions do exist on a y-bit kernelspace compiler. Since
they are two distinct compilers.
That's the fundamental flaw in __alignof__/IPT_ALIGN-type "hacking".

>           Btw if only one pointer is needed it could be put at the end
>           before the next member and it should work fine.

As long as the sizeof() structure does not change... Otherwise you'll
get EINVAL from check_entry.

> Of course this would break some matches/targets but apart from that fact
> it should be ok, isn't it? I mean with this solution IPT_ALIGN is no

Otherwise it's ok...

But please realize that struct ipt_entry next[0] at the end effectively
substitutes the missing IPT_ALIGN from the entry, maybe a more correct way.

> longer necessary and no #ifdefs need to be used except for padd_t (which
> is just an option as there is at least one other way to make it work)
> and maybe for nf_ptr_t.

> > 2) introduce a new way of passing structures, with new structures, probably
> >    with new setsockopt/getsockopt numbers, essentially creating an API
> >    incompatible with the existing one (there are a few other bleeding wounds
> >    waiting to be solved in the existing interface as I saw).
> 
> What do you mean by "other bleeding wounds"? Could you explain a bit
> further?
> 
> 
> Thomas

These are the ones that I see currently (mostly because of relevancy to
wordlength issues):

1) Versioning: we are unable to introduce changes into the interface without
   either breaking existing behaviour, or naming the child (module/socket
   option) differently. And we would need it right now :)

2) Kernel private data: there is currently no way to make a kernel-private
   structure that is not visible in the user-kernel interface (see the pointer
   in ipt_limit). I consider it an ugly hack to make a union to hide/cover
   a kernel-side pointer just because it's part of the interface. It should
   not be there at all.

3) Passing arbirary-sized data: I have seen at least two guys in the last
   month here asking how to do it - putting a pointer into the entry structure
   and having the module do copy_from_User is not the right way to do that.


Regards,
Laszlo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 18:05       ` Thomas Heinz
  2003-01-26 19:43         ` Laszlo Valko
@ 2003-01-26 19:48         ` Thomas Heinz
  1 sibling, 0 replies; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 19:48 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, netfilter-devel

I wrote:
>        - the end of the struct must be marked by:
>              struct ipt_entry next[0];
>          or
>              long next[0];   // I'm not sure about this!

Well, this is plain nonsense (I don't know why I wrote that).
But here is another idea that does not need #ifdefs:

            struct info
            {
                [...]
            } __attribute__((aligned(__alignof__(u_int64_t))));

This hopefully does the trick.

>        - pointers or arrays of pointers should be avoided but if they're
>          really needed (as in limit) a special pointer type should be
>          used: 32 bit arch: union nf_ptr_t {void *p;
>                                             u_int32_t padd;};
>                64 bit arch: union nf_ptr_t {void *p;
>                                             u_int64_t padd;};
>          Maybe there are better ways to achieve the same
>          (__attribute__ ((aligned (__alignof__(long)))) might work).
                                                           ^^^^^^^^^^
                                             Of course it does _not_ :-)

            I don't think there is a way to avoid the union which
            leads to an #ifdef if we don't restrict ourselves to at most
            one pointer at the end.
            The reason for that is that it does not suffice to require
            a certain alignment for void * but it also must have the
            expected size. Otherwise members that come after pointer
            members might have a different offset (just the same
            argument as for long).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 19:43         ` Laszlo Valko
@ 2003-01-26 23:09           ` Thomas Heinz
  2003-01-27  0:43             ` Laszlo Valko
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Heinz @ 2003-01-26 23:09 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, netfilter-devel

Hi Laszlo

You wrote:
>>        - pointers or arrays of pointers should be avoided but if they're
>>          really needed (as in limit) a special pointer type should be
>>          used: 32 bit arch: union nf_ptr_t {void *p;
>>                                             u_int32_t padd;};
>>                64 bit arch: union nf_ptr_t {void *p;
>>                                             u_int64_t padd;};
> 
> 
> Pointers to be passed between kernel/user space will not work easily
> anyway, so this can only be used for kernel private pointers.

True.

> For that purpose, this is OK until the netfilter framework provides
> facilities to eliminate pointers from kernel-user interface structures.

Yes, those pointers in the kernel-user interface are a real problem :)

>>          Maybe there are better ways to achieve the same
>>          (__attribute__ ((aligned (__alignof__(long)))) might work).
> 
> 
> The problem is that you _cannot_ determine from an x-bit userspace compiler
> what alignment restrictions do exist on a y-bit kernelspace compiler. Since
> they are two distinct compilers.
> That's the fundamental flaw in __alignof__/IPT_ALIGN-type "hacking".

Hm, but we agree that the alignment of types of fixed width is the
same even if x != y, e.g. __u64 is 8 byte aligned in sparc64 user and
kernel space. The problem of stacking structs with different alignment
requirements next to each other can be solved by aligning the structures
according to what you call "strictest alignment on this platform"
which IMO can be expressed by:
   struct foo {[...]}__attribute__((aligned(__alignof__(u_int64_t))));

Of course the problem of "valid user pointers" (in contrast to kernel
private pointers) remains but as you stated they simply should not be
used as they don't work as expected (at least on some architectures).

Furthermore, as far as I can see, the fact that one cannot use
dynamically sized structs is not caused by kernel-user structures
being too restrictive but by the userspace framework which forces
a target/match to tell its size at compile time.

> But please realize that struct ipt_entry next[0] at the end effectively
> substitutes the missing IPT_ALIGN from the entry, maybe a more correct way.

Yes, it's nothing more after all, but this would be the only problem
if we had no pointers and the members were of known bit width.

> 1) Versioning: we are unable to introduce changes into the interface without
>    either breaking existing behaviour, or naming the child (module/socket
>    option) differently. And we would need it right now :)
> 
> 2) Kernel private data: there is currently no way to make a kernel-private
>    structure that is not visible in the user-kernel interface (see the pointer
>    in ipt_limit). I consider it an ugly hack to make a union to hide/cover
>    a kernel-side pointer just because it's part of the interface. It should
>    not be there at all.
> 
> 3) Passing arbirary-sized data: I have seen at least two guys in the last
>    month here asking how to do it - putting a pointer into the entry structure
>    and having the module do copy_from_User is not the right way to do that.

Full ack.

Thanks for your explanations.


Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 23:09           ` Thomas Heinz
@ 2003-01-27  0:43             ` Laszlo Valko
  2003-01-27 10:33               ` Thomas Heinz
  0 siblings, 1 reply; 19+ messages in thread
From: Laszlo Valko @ 2003-01-27  0:43 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Anders Fugmann, netfilter-devel

Hi Thomas!

On Mon, Jan 27, 2003 at 12:09:14AM +0100, Thomas Heinz wrote:
> > The problem is that you _cannot_ determine from an x-bit userspace compiler
> > what alignment restrictions do exist on a y-bit kernelspace compiler. Since
> > they are two distinct compilers.
> > That's the fundamental flaw in __alignof__/IPT_ALIGN-type "hacking".
> 
> Hm, but we agree that the alignment of types of fixed width is the
> same even if x != y, e.g. __u64 is 8 byte aligned in sparc64 user and
> kernel space. The problem of stacking structs with different alignment

On sparc, it happens to be the same. I'm not sure how much pure luck
or intention this is. However, on x86, __u64 is definitely 4 byte aligned,
and I would expect x86-64 to do it differently (8 byte alignment).
This might even be a function of using different compiler options...

> requirements next to each other can be solved by aligning the structures
> according to what you call "strictest alignment on this platform"
> which IMO can be expressed by:
>    struct foo {[...]}__attribute__((aligned(__alignof__(u_int64_t))));

I think the "strictest alignment on this platform" should not come
from the compiler. It should be defined in some kernel header under
linux/include/asm-*.

Regards,
Laszlo

> Thomas
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-27  0:43             ` Laszlo Valko
@ 2003-01-27 10:33               ` Thomas Heinz
  2003-01-31 11:57                 ` Harald Welte
  0 siblings, 1 reply; 19+ messages in thread
From: Thomas Heinz @ 2003-01-27 10:33 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, netfilter-devel

Hi Laszlo

You wrote:
> On sparc, it happens to be the same. I'm not sure how much pure luck
> or intention this is.

Good point. Maybe someone else can enlighten us.

> This might even be a function of using different compiler options...

Ok. If we have to deal with user and kernel space having different
alignment requirements for types of the same bit width but endianness
can be assumed to stay (at least for solution 1)) the same we have
(at least) two options:

1) Define a set of types with fixed bit width and fixed alignment
    requirements for each architecture which must also be made
    available in userspace (which implies that different type
    primitives might be used). The kernel-user structs may only use
    these types as basic types.

2) Define the kernel-user communication in terms of a network protocol
    (anyway, I guess this will be the case if netfilter communication is
    converted to netlink).

> I think the "strictest alignment on this platform" should not come
> from the compiler. It should be defined in some kernel header under
> linux/include/asm-*.

If one cannot make the necessary assumptions this is of course the only
way to define it properly.


Regards,

Thomas

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 13:58     ` Thomas Heinz
  2003-01-26 14:26       ` Laszlo Valko
@ 2003-01-31 11:51       ` Harald Welte
  1 sibling, 0 replies; 19+ messages in thread
From: Harald Welte @ 2003-01-31 11:51 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Anders Fugmann, Laszlo Valko, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 890 bytes --]

On Sun, Jan 26, 2003 at 02:58:02PM +0100, Thomas Heinz wrote:
 
> Actually this is not true for some existing targets and matches, e.g.:
 [...]

> Since I assume that ULOG and MARK work perfectly on sparc64 I must
> be missing something that makes the internal memory layout of the
> structs (i.e. the member offsets) equal in user and kernel space
> (or my assumption is wrong ;-).

I'd rather think of your assumption as being wrong.  I have never tested
ULOG on sparc64 so far. But thanks for bringing up this issue.

> Regards,
> Thomas

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-26 14:22     ` Laszlo Valko
  2003-01-26 18:05       ` Thomas Heinz
@ 2003-01-31 11:55       ` Harald Welte
  2003-01-31 23:37         ` Laszlo Valko
  1 sibling, 1 reply; 19+ messages in thread
From: Harald Welte @ 2003-01-31 11:55 UTC (permalink / raw)
  To: Laszlo Valko; +Cc: Anders Fugmann, Thomas Heinz, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1401 bytes --]

On Sun, Jan 26, 2003 at 03:22:27PM +0100, Laszlo Valko wrote:
> Currently, I see two ways:
> 1) make a translation function for all currently existing netfilter structures
>    that use longs or pointers (difficult, as modules can be added
>    independently), like those in arch/sparc64/sys_sparc32.c, and its
>    friends,

it's a hack, but it would work.

> 2) introduce a new way of passing structures, with new structures, probably
>    with new setsockopt/getsockopt numbers, essentially creating an API
>    incompatible with the existing one (there are a few other bleeding wounds
>    waiting to be solved in the existing interface as I saw).

please don't go for solution #2.  There is already one new interface
between kernel an userspace in development... we don't need more
interfaces than we already have.  

And yes, the new interface still has the matchinfo of all matches
concatenated in one memory area.  Everything else has linked lists,
though (e.g. every rule is a seperately allocated memory chunk).

> Regards,
> Laszlo

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-27 10:33               ` Thomas Heinz
@ 2003-01-31 11:57                 ` Harald Welte
  0 siblings, 0 replies; 19+ messages in thread
From: Harald Welte @ 2003-01-31 11:57 UTC (permalink / raw)
  To: Thomas Heinz; +Cc: Laszlo Valko, Anders Fugmann, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 767 bytes --]

On Mon, Jan 27, 2003 at 11:33:55AM +0100, Thomas Heinz wrote:
> 2) Define the kernel-user communication in terms of a network protocol
>    (anyway, I guess this will be the case if netfilter communication is
>    converted to netlink).

No, not really. The matchinfo / targetinfo structures are just passed
through as opaque data objects, the iptnetlink/pkttnetlink layer doesn't
look inside at all.

> Regards,
> Thomas

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Comments about IPT_ALIGN
  2003-01-31 11:55       ` Harald Welte
@ 2003-01-31 23:37         ` Laszlo Valko
  0 siblings, 0 replies; 19+ messages in thread
From: Laszlo Valko @ 2003-01-31 23:37 UTC (permalink / raw)
  To: Harald Welte; +Cc: Anders Fugmann, Thomas Heinz, netfilter-devel

Hi Harald!

On Fri, Jan 31, 2003 at 12:55:12PM +0100, Harald Welte wrote:
> On Sun, Jan 26, 2003 at 03:22:27PM +0100, Laszlo Valko wrote:
> > Currently, I see two ways:
> > 1) make a translation function for all currently existing netfilter structures
> >    that use longs or pointers (difficult, as modules can be added
> >    independently), like those in arch/sparc64/sys_sparc32.c, and its
> >    friends,
> 
> it's a hack, but it would work.

Well, we will definitely need it if people want to use netfilter on 2.4,
as that is a stable series, and we don't want to introduce incompatibilities
I guess.
When I arrive to the state of having a little spare time, I might convince
myself to sketch a first version up... It does not look too difficult at least.

> > 2) introduce a new way of passing structures, with new structures, probably
> >    with new setsockopt/getsockopt numbers, essentially creating an API
> >    incompatible with the existing one (there are a few other bleeding wounds
> >    waiting to be solved in the existing interface as I saw).
> 
> please don't go for solution #2.  There is already one new interface
> between kernel an userspace in development... we don't need more
> interfaces than we already have.  

Wow! May I see it? :)

> And yes, the new interface still has the matchinfo of all matches
> concatenated in one memory area.  Everything else has linked lists,
> though (e.g. every rule is a seperately allocated memory chunk).

That's all right, as long as the interface structures != kernel internal
structures. In fact kernel internal structure should be >= interface structure.
Otherwise you cannot solve the problem of using unsigned longs in
interface structures (eg jiffies is always unsigned long unfortunately).

Do we expect to change the modules as well? I mean struct ipt_anything et al.
are expected to change, is that so?


Regards,
Laszlo

> 
> -- 
> Live long and prosper
> - Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
> ============================================================================
> GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M- 
> V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2003-01-31 23:37 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-01-26  3:57 Comments about IPT_ALIGN Thomas Heinz
2003-01-26 11:01 ` Laszlo Valko
2003-01-26 11:28   ` Anders Fugmann
2003-01-26 13:58     ` Thomas Heinz
2003-01-26 14:26       ` Laszlo Valko
2003-01-26 16:50         ` Thomas Heinz
2003-01-26 19:31           ` Laszlo Valko
2003-01-31 11:51       ` Harald Welte
2003-01-26 14:22     ` Laszlo Valko
2003-01-26 18:05       ` Thomas Heinz
2003-01-26 19:43         ` Laszlo Valko
2003-01-26 23:09           ` Thomas Heinz
2003-01-27  0:43             ` Laszlo Valko
2003-01-27 10:33               ` Thomas Heinz
2003-01-31 11:57                 ` Harald Welte
2003-01-26 19:48         ` Thomas Heinz
2003-01-31 11:55       ` Harald Welte
2003-01-31 23:37         ` Laszlo Valko
2003-01-26 17:06   ` Thomas Heinz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.