* [PATCH 0/5] [DCCP]: Queuing policies @ 2008-04-11 10:24 Tomasz Grobelny 2008-04-14 6:50 ` Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-11 10:24 UTC (permalink / raw) To: Arnaldo Carvalho de Melo; +Cc: dccp, netdev This is resubmission of queuing policy related patches for DCCP. Queuing policies allow userspace applications to more effectively use DCCP. Changes include: - CodingStyle, - do not allow applications to change qpolicy after accept or connect, - naming changes, - make dccps_qpolicy a non-pointer, - include dccpd_policy in dccp_skb_cb as a structure not char[16]. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 0/5] [DCCP]: Queuing policies 2008-04-11 10:24 [PATCH 0/5] [DCCP]: Queuing policies Tomasz Grobelny @ 2008-04-14 6:50 ` Gerrit Renker 2008-04-14 7:39 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-14 6:50 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | This is resubmission of queuing policy related patches for DCCP. Queuing | policies allow userspace applications to more effectively use DCCP. Thanks a lot -- all patches applied cleanly. As promised, I have created a tree "qpolicy" for these patches, which can be checked out via # using a recent netdev-2.6 kernel git checkout -b qpolicy master git pull git://eden-feed.erg.abdn.ac.uk/dccp_exp qpolicy I have left your patches unchanged (apart from adapting them to apply on the most recent base). There are four other patches in this repository: I am interested in merging your patches into the test tree and thus have worked on integrating it. Rather than sending those new 4 patches individually, I will send the result (combined with your patch set) as a single RFC patch to the list. Please have a look at the repository and the subsequent patch -- I would like to know if you are ok with the changes and/or discuss these. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-14 6:50 ` Gerrit Renker @ 2008-04-14 7:39 ` Gerrit Renker 2008-04-14 23:45 ` Tomasz Grobelny 0 siblings, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-14 7:39 UTC (permalink / raw) To: Tomasz Grobelny, Arnaldo Carvalho de Melo, dccp, netdev [DCCP] [RFC]: Reworked queueing policy patch set This is a re-worked version of Tomasz's priority-queueing patch set for DCCP. The essential internals of his patch set have been left untouched, in particular the prio/simple algorithms are the same as before. The changes for which I am asking for comments are: 1. Fewer files: --------------- The patch set originally introduced 4 new files: 50 net/dccp/qpolicy.c 48 net/dccp/qpolicy.h 70 net/dccp/qpolicy_prio.c 36 net/dccp/qpolicy_simple.c Given that each file has less than 100 lines, it seemed better to put them all into one file and export the declarations in dccp.h, the result is now: 143 net/dccp/qpolicy.c With less than 150 lines, there is still room for a few more policies. 2. Let pop() return result -------------------------- * at least for the forseeable future, the underlying queue is sk_write_queue; * an skb can not be on two lists at the same time; * thus skb_unlink will always be part of implementing pop(); * having pop() return the skb is more versatile (compare Assembler pop()). 3. Hide internals of implementation ----------------------------------- There had been discussion on hiding the internals, the present approach * uses symbolic identifiers (enum) to refer to policies; * makes the function table local to qpolicy.c, so that * dccp_sock only needs to take the policy identifier; * makes tx_qlen a socket member of dccp_sock (there was a bug in the implementation -- tx_qlen was shared by all sockets using the same policy); * default policy as well as inheriting policy to child socket now implicit: - default policy (ID=0) set due to zeroing socket on allocation, - inheriting policy due to performing sock_copy() on the policy ID; 4. Clerical changes ------------------- Some work done here, such as adding documentation, aligning identifiers etc. If people (Tomasz in particular) are ok with the changes, I would like to add this to the test tree, otherwise will keep it separate as subtree "qpolicy" in git://eden-feed.erg.abdn.ac.uk/dccp_exp during discussions. Gerrit --- Documentation/networking/dccp.txt | 11 ++ include/linux/dccp.h | 20 +++++ net/dccp/Makefile | 2 net/dccp/dccp.h | 45 +++++++---- net/dccp/output.c | 6 - net/dccp/proto.c | 26 ++++++ net/dccp/qpolicy.c | 143 ++++++++++++++++++++++++++++++++++++++ 7 files changed, 231 insertions(+), 22 deletions(-) --- /dev/null +++ b/net/dccp/qpolicy.c @@ -0,0 +1,143 @@ +/* + * net/dccp/qpolicy.c + * + * An implementation of the DCCP protocol + * + * Copyright (c) 2008 Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License v2 + * as published by the Free Software Foundation. + */ +#include "dccp.h" + +/* + * Simple Dequeueing Policy + */ +static void simple_push(struct sock *sk, struct sk_buff *skb, + const struct dccp_packet_info *dcp) +{ + skb_queue_tail(&sk->sk_write_queue, skb); +} + +static int simple_full(struct sock *sk) +{ + return dccp_sk(sk)->dccps_tx_qlen && + sk->sk_write_queue.qlen >= dccp_sk(sk)->dccps_tx_qlen; +} + +static struct sk_buff *simple_top(struct sock *sk) +{ + return skb_peek(&sk->sk_write_queue); +} + +/* + * Priority-based Dequeueing Policy + */ +static struct sk_buff *prio_get_worst_skb(struct sock *sk) +{ + struct sk_buff *skb, *worst = NULL; + int worstp = -128; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (DCCP_SKB_CB(skb)->dccpd_policy.priority >= worstp) { + worstp = DCCP_SKB_CB(skb)->dccpd_policy.priority; + worst = skb; + } + return worst; +} + +static void prio_push(struct sock *sk, struct sk_buff *skb, + const struct dccp_packet_info *dcp) +{ + struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb); + + dcb->dccpd_policy.priority = 127; + if (dcp) + memcpy(&dcb->dccpd_policy, dcp, sizeof(*dcp)); + + skb_queue_tail(&sk->sk_write_queue, skb); + + if (sk->sk_write_queue.qlen >= dccp_sk(sk)->dccps_tx_qlen) { + struct sk_buff *worst = prio_get_worst_skb(sk); + + skb_unlink(worst, &sk->sk_write_queue); + kfree_skb(worst); + } +} + +/** we can always push a packet into the queue => queue is never full */ +static int prio_full(struct sock *sk) +{ + return 0; +} + +static struct sk_buff *prio_get_best_skb(struct sock *sk) +{ + struct sk_buff *skb, *best = NULL; + int bestp = 127; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (DCCP_SKB_CB(skb)->dccpd_policy.priority <= bestp) { + bestp = DCCP_SKB_CB(skb)->dccpd_policy.priority; + best = skb; + } + return best; +} + +/** + * struct dccp_qpolicy_operations - TX Packet Dequeueing Interface + * @push: add a new @skb with possibly a struct dccp_packet_info + * @full: indicates that no more packets will be admitted + * @top: peeks at whatever the queueing policy defines as top + */ +static struct dccp_qpolicy_operations { + void (*push) (struct sock *sk, struct sk_buff *skb, + const struct dccp_packet_info *dcp); + int (*full) (struct sock *sk); + struct sk_buff* (*top) (struct sock *sk); + +} qpol_table[DCCPQ_POLICY_MAX] = { + [DCCPQ_POLICY_SIMPLE] = { + .push = simple_push, + .full = simple_full, + .top = simple_top, + }, + [DCCPQ_POLICY_PRIO] = { + .push = prio_push, + .full = prio_full, + .top = prio_get_best_skb, + }, +}; + +/* + * Wrappers for methods provided by policy currently in use + */ +void qpolicy_push(struct sock *sk, struct sk_buff *skb, struct msghdr *msg) +{ + struct dccp_packet_info *dcp = NULL; + + if (msg->msg_control != NULL && msg->msg_controllen == sizeof(*dcp)) + dcp = (struct dccp_packet_info *)msg->msg_control; + + qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb, dcp); +} + +int qpolicy_full(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].full(sk); +} + +struct sk_buff *qpolicy_top(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].top(sk); +} + +struct sk_buff *qpolicy_pop(struct sock *sk) +{ + struct sk_buff *skb = qpolicy_top(sk); + + if (skb) + skb_unlink(skb, &sk->sk_write_queue); + return skb; +} --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -195,6 +195,20 @@ enum dccp_feature_numbers { DCCPF_MAX_CCID_SPECIFIC = 255, }; +/* DCCP TX Dequeueing Policy */ +enum dccp_packet_dequeueing_policy { + DCCPQ_POLICY_SIMPLE, + DCCPQ_POLICY_PRIO, + DCCPQ_POLICY_MAX +}; +/** + * struct dccp_packet_info - Per-packet TX dequeueing information + * @priority: packet priority, lower value => packet more likely to be sent + */ +struct dccp_packet_info { + __s8 priority; +}; + /* DCCP socket options */ #define DCCP_SOCKOPT_PACKET_SIZE 1 /* XXX deprecated, without effect */ #define DCCP_SOCKOPT_SERVICE 2 @@ -208,6 +222,8 @@ enum dccp_feature_numbers { #define DCCP_SOCKOPT_CCID 13 #define DCCP_SOCKOPT_TX_CCID 14 #define DCCP_SOCKOPT_RX_CCID 15 +#define DCCP_SOCKOPT_QPOLICY_ID 16 +#define DCCP_SOCKOPT_QPOLICY_TXQLEN 17 #define DCCP_SOCKOPT_CCID_RX_INFO 128 #define DCCP_SOCKOPT_CCID_TX_INFO 192 @@ -459,6 +475,8 @@ struct dccp_ackvec; * @dccps_hc_rx_ccid - CCID used for the receiver (or receiving half-connection) * @dccps_hc_tx_ccid - CCID used for the sender (or sending half-connection) * @dccps_options_received - parsed set of retrieved options + * @dccps_qpolicy - TX dequeueing policy, one of %dccp_packet_dequeueing_policy + * @dccps_tx_qlen - maximum length of the TX queue * @dccps_role - role of this sock, one of %dccp_role * @dccps_hc_rx_insert_options - receiver wants to add options when acking * @dccps_hc_tx_insert_options - sender wants to add options when sending @@ -501,6 +519,8 @@ struct dccp_sock { struct ccid *dccps_hc_rx_ccid; struct ccid *dccps_hc_tx_ccid; struct dccp_options_received dccps_options_received; + __u8 dccps_qpolicy; + __u32 dccps_tx_qlen; enum dccp_role dccps_role:2; __u8 dccps_hc_rx_insert_options:1; __u8 dccps_hc_tx_insert_options:1; --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -187,6 +187,7 @@ int dccp_init_sock(struct sock *sk, cons dp->dccps_rate_last = jiffies; dp->dccps_role = DCCP_ROLE_UNDEFINED; dp->dccps_service = DCCP_SERVICE_CODE_IS_ABSENT; + dp->dccps_tx_qlen = sysctl_dccp_tx_qlen; INIT_LIST_HEAD(&dp->dccps_featneg); /* control socket doesn't need feat nego */ @@ -540,6 +541,20 @@ static int do_dccp_setsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: err = dccp_setsockopt_cscov(sk, val, true); break; + case DCCP_SOCKOPT_QPOLICY_ID: + if (sk->sk_state != DCCP_CLOSED) + err = -EISCONN; + else if (val < 0 || val >= DCCPQ_POLICY_MAX) + err = -EINVAL; + else + dp->dccps_qpolicy = val; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + if (val < 0) + err = -EINVAL; + else + dp->dccps_tx_qlen = val; + break; default: err = -ENOPROTOOPT; break; @@ -643,6 +658,12 @@ static int do_dccp_getsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: val = dp->dccps_pcrlen; break; + case DCCP_SOCKOPT_QPOLICY_ID: + val = dp->dccps_qpolicy; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + val = dp->dccps_tx_qlen; + break; case 128 ... 191: return ccid_hc_rx_getsockopt(dp->dccps_hc_rx_ccid, sk, optname, len, (u32 __user *)optval, optlen); @@ -700,8 +721,7 @@ int dccp_sendmsg(struct kiocb *iocb, str lock_sock(sk); - if (sysctl_dccp_tx_qlen && - (sk->sk_write_queue.qlen >= sysctl_dccp_tx_qlen)) { + if (qpolicy_full(sk)) { rc = -EAGAIN; goto out_release; } @@ -729,7 +749,7 @@ int dccp_sendmsg(struct kiocb *iocb, str if (rc != 0) goto out_discard; - skb_queue_tail(&sk->sk_write_queue, skb); + qpolicy_push(sk, skb, msg); dccp_write_xmit(sk); out_release: release_sock(sk); --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -215,6 +215,18 @@ extern void dccp_reqsk_send_ack(struct s extern void dccp_send_sync(struct sock *sk, const u64 seq, const enum dccp_pkt_type pkt_type); +/* + * TX Packet Dequeueing Interface + */ +extern void qpolicy_push(struct sock *sk, struct sk_buff *skb, + struct msghdr *msg); +extern int qpolicy_full(struct sock *sk); +extern struct sk_buff *qpolicy_top(struct sock *sk); +extern struct sk_buff *qpolicy_pop(struct sock *sk); + +/* + * TX Packet Output and TX Timers + */ extern void dccp_write_xmit(struct sock *sk); extern void dccp_write_space(struct sock *sk); extern void dccp_flush_write_queue(struct sock *sk, long *time_budget); @@ -313,14 +325,16 @@ static inline int dccp_bad_service_code( /** * dccp_skb_cb - DCCP per-packet control information - * @dccpd_type: one of %dccp_pkt_type (or unknown) - * @dccpd_ccval: CCVal field (5.1), see e.g. RFC 4342, 8.1 - * @dccpd_reset_code: one of %dccp_reset_codes - * @dccpd_reset_data: Data1..3 fields (depend on @dccpd_reset_code) - * @dccpd_opt_len: total length of all options (5.8) in the packet - * @dccpd_seq: sequence number - * @dccpd_ack_seq: acknowledgment number subheader field value - * This is used for transmission as well as for reception. + * @header: IPv4/6 parameters + * @dccpd_type: one of %dccp_pkt_type (or unknown) + * @dccpd_ccval: CCVal field (5.1), see e.g. RFC 4342, 8.1 + * @dccpd_reset_code: one of %dccp_reset_codes + * @dccpd_reset_data: Data1..3 fields (depend on @dccpd_reset_code) + * @dccpd_opt_len: total length of all options (5.8) in the packet + * @dccpd_seq: sequence number + * @dccpd_ack_seq: acknowledgment number subheader field value + * @dccpd_policy: packet dequeueing policy (TX packets only) + * This struct is used for transmission as well as for reception. */ struct dccp_skb_cb { union { @@ -329,13 +343,14 @@ struct dccp_skb_cb { struct inet6_skb_parm h6; #endif } header; - __u8 dccpd_type:4; - __u8 dccpd_ccval:4; - __u8 dccpd_reset_code, - dccpd_reset_data[3]; - __u16 dccpd_opt_len; - __u64 dccpd_seq; - __u64 dccpd_ack_seq; + __u8 dccpd_type:4; + __u8 dccpd_ccval:4; + __u8 dccpd_reset_code, + dccpd_reset_data[3]; + __u16 dccpd_opt_len; + __u64 dccpd_seq; + __u64 dccpd_ack_seq; + struct dccp_packet_info dccpd_policy; }; #define DCCP_SKB_CB(__skb) ((struct dccp_skb_cb *)&((__skb)->cb[0])) --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -45,6 +45,17 @@ http://linux-net.osdl.org/index.php/DCCP Socket options ============== +DCCP_SOCKOPT_QPOLICY_ID sets the dequeueing policy for outgoing packets. It +takes an integer number as argument and can only be set before establishing +a connection (i.e. changes during an established connection are not supported). +Currently defined IDs are "simple" (DCCPQ_POLICY_SIMPLE) and a priority-based +variant (DCCPQ_POLICY_PRIO). In the latter case, a struct dccp_packet_info can +be passed as ancillary data in the msg_control field of sendmsg(2). + +DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum TX queue length. This is relevant only for +those TX dequeueing policies that honour this upper bound (currently only the +simple/default policy does this). This socket option overrides the default value +of the sysctl /proc/sys/net/dccp/default/tx_qlen. DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of service codes (RFC 4340, sec. 8.1.2); if this socket option is not set, --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -238,7 +238,7 @@ static void dccp_xmit_packet(struct sock { int err, len; struct dccp_sock *dp = dccp_sk(sk); - struct sk_buff *skb = skb_dequeue(&sk->sk_write_queue); + struct sk_buff *skb = qpolicy_pop(sk); if (unlikely(skb == NULL)) return; @@ -328,7 +328,7 @@ void dccp_write_xmit(struct sock *sk) struct dccp_sock *dp = dccp_sk(sk); struct sk_buff *skb; - while ((skb = skb_peek(&sk->sk_write_queue))) { + while ((skb = qpolicy_top(sk))) { int rc = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb); switch (ccid_packet_dequeue_eval(rc)) { @@ -342,7 +342,7 @@ void dccp_write_xmit(struct sock *sk) dccp_xmit_packet(sk); break; case CCID_PACKET_ERR: - skb_dequeue(&sk->sk_write_queue); + qpolicy_pop(sk); kfree_skb(skb); dccp_pr_debug("packet discarded due to err=%d\n", rc); } --- a/net/dccp/Makefile +++ b/net/dccp/Makefile @@ -1,7 +1,7 @@ obj-$(CONFIG_IP_DCCP) += dccp.o dccp_ipv4.o dccp-y := ccid.o feat.o input.o minisocks.o options.o \ - output.o proto.o timer.o ackvec.o + output.o proto.o timer.o ackvec.o qpolicy.o dccp_ipv4-y := ipv4.o The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-14 7:39 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker @ 2008-04-14 23:45 ` Tomasz Grobelny 2008-04-15 15:14 ` Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-14 23:45 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Dnia Monday 14 of April 2008, Gerrit Renker napisał: > [DCCP] [RFC]: Reworked queueing policy patch set > > This is a re-worked version of Tomasz's priority-queueing patch set for > DCCP. The essential internals of his patch set have been left untouched, > in particular the prio/simple algorithms are the same as before. > > The changes for which I am asking for comments are: > > Given that each file has less than 100 lines, it seemed better to put > them all into one file and export the declarations in dccp.h, the result is > now: > > With less than 150 lines, there is still room for a few more policies. > For now the files are small but I would expect at least qpolicy_prio.c to grow a little bit in the future. What comes to mind are get/setsockopts (for example statistics about packets dropped by qpolicy), and per packet expiry times. I didn't look how big the files are but wanted to have clear separation between specific policies and common code. But if you think that this organization is better I have no problem with it. > 2. Let pop() return result > -------------------------- > * at least for the forseeable future, the underlying queue is > sk_write_queue; * an skb can not be on two lists at the same time; > * thus skb_unlink will always be part of implementing pop(); > * having pop() return the skb is more versatile (compare Assembler > pop()). > That should be fine... > * makes tx_qlen a socket member of dccp_sock (there was a bug in the > implementation -- tx_qlen was shared by all sockets using the same > policy); Haven't tested it thoroughly but it shouldn't be so. After all qpol_txqlen was part of struct dccp_qpolicy which was in turn part of dccp_sock. > +/* > + * Wrappers for methods provided by policy currently in use > + */ > +void qpolicy_push(struct sock *sk, struct sk_buff *skb, struct msghdr > *msg) +{ > + struct dccp_packet_info *dcp = NULL; > + > + if (msg->msg_control != NULL && msg->msg_controllen == sizeof(*dcp)) > + dcp = (struct dccp_packet_info *)msg->msg_control; > + > + qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb, dcp); > +} > + What happens if application is compiled with dccp_packet_info containing only one field (priority) and kernel containing two fields (for example priority and later added expiry time)? Wouldn't that implementation make extensions to dccp_packet_info impossible? > @@ -501,6 +519,8 @@ struct dccp_sock { > struct ccid *dccps_hc_rx_ccid; > struct ccid *dccps_hc_tx_ccid; > struct dccp_options_received dccps_options_received; > + __u8 dccps_qpolicy; > + __u32 dccps_tx_qlen; > enum dccp_role dccps_role:2; > __u8 dccps_hc_rx_insert_options:1; > __u8 dccps_hc_tx_insert_options:1; I know that currently none of the policies has any per-socket data. But if it had were should it go? > + case DCCP_SOCKOPT_QPOLICY_ID: > + if (sk->sk_state != DCCP_CLOSED) > + err = -EISCONN; What about DCCP_LISTENING state? > --- a/Documentation/networking/dccp.txt > +++ b/Documentation/networking/dccp.txt > @@ -45,6 +45,17 @@ http://linux-net.osdl.org/index.php/DCCP > > +DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum TX queue length. This is > relevant only for +those TX dequeueing policies that honour this upper > bound (currently only the +simple/default policy does this). This socket > option overrides the default value +of the sysctl > /proc/sys/net/dccp/default/tx_qlen. > In my implementation prio policy did honour DCCP_SOCKOPT_QPOLICY_TXQLEN setting. At least it was meant to do so. Having said "ok" here and there or nothing in other points I haven't tried applying this patch nor tested your rework. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-14 23:45 ` Tomasz Grobelny @ 2008-04-15 15:14 ` Gerrit Renker 2008-04-15 15:21 ` Gerrit Renker 2008-04-15 19:38 ` Tomasz Grobelny 0 siblings, 2 replies; 47+ messages in thread From: Gerrit Renker @ 2008-04-15 15:14 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | > * makes tx_qlen a socket member of dccp_sock (there was a bug in the | > implementation -- tx_qlen was shared by all sockets using the same | > policy); | Haven't tested it thoroughly but it shouldn't be so. After all qpol_txqlen was | part of struct dccp_qpolicy which was in turn part of dccp_sock. | I have to apologise here -- this was my oversight. The bug I assumed to be there does not exist, since the tx_qlen was indeed part of the socket. Sorry. | > +/* | > + * Wrappers for methods provided by policy currently in use | > + */ | > +void qpolicy_push(struct sock *sk, struct sk_buff *skb, struct msghdr | > *msg) +{ | > + struct dccp_packet_info *dcp = NULL; | > + | > + if (msg->msg_control != NULL && msg->msg_controllen == sizeof(*dcp)) | > + dcp = (struct dccp_packet_info *)msg->msg_control; | > + | > + qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb, dcp); | > +} | > + | What happens if application is compiled with dccp_packet_info containing only | one field (priority) and kernel containing two fields (for example priority | and later added expiry time)? Wouldn't that implementation make extensions to | dccp_packet_info impossible? Had not looked at it from this point of view. So you are suggesting if (msg->msg_control != NULL && msg->msg_controllen >= sizeof(*dcp)) dcp = (struct dccp_packet_info *)msg->msg_control; instead? Agreed, point #1 to change in the patch. | | > @@ -501,6 +519,8 @@ struct dccp_sock { | > struct ccid *dccps_hc_rx_ccid; | > struct ccid *dccps_hc_tx_ccid; | > struct dccp_options_received dccps_options_received; | > + __u8 dccps_qpolicy; | > + __u32 dccps_tx_qlen; | > enum dccp_role dccps_role:2; | > __u8 dccps_hc_rx_insert_options:1; | > __u8 dccps_hc_tx_insert_options:1; | I know that currently none of the policies has any per-socket data. But if it | had were should it go? | I can't see anything wrong with putting everything into dccp_sock. To do it well, we could consider inserting documentation such as "this section is only for queueing policies" (as is done very well for struct tcp_sock). I see several advantages of this: * by just storing the u8 of the policy ID, the socket as a whole gets smaller, * when not using nested structs, the elements can be optimised for minimal space, * the internals of the function pointer table can be hidden, * the socket carries the same information as is passed on via socket options, which simplifies querying. | > + case DCCP_SOCKOPT_QPOLICY_ID: | > + if (sk->sk_state != DCCP_CLOSED) | > + err = -EISCONN; | What about DCCP_LISTENING state? | There is a problem with allowing this, considering for example: * socket is created, and listen() is called, * socket enters with default policy, * now user decides to change policy from default ("simple") to "prio", * before he can do that, the first packet arrives on the incoming queue, policy is inherited to child socket, * so this connection ends up with "simple" policy instead of "prio". | > +DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum TX queue length. This is | > relevant only for +those TX dequeueing policies that honour this upper | > bound (currently only the +simple/default policy does this). This socket | > option overrides the default value +of the sysctl | > /proc/sys/net/dccp/default/tx_qlen. | > | In my implementation prio policy did honour DCCP_SOCKOPT_QPOLICY_TXQLEN | setting. At least it was meant to do so. | The documentation certainly needs more work, I just sketched something but it is not very good. So this is point #2 which needs to change. Ideally, it would have a little description of what each policy does and for which use case it is best suited. Please take your time and check through the other parts of the patch as well. I would like to accumulate the issues and then address each of the points. And also, do some testing.I haven't had time to write test code for prio policy -- is there something in your SVN repository? Gerrit The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 15:14 ` Gerrit Renker @ 2008-04-15 15:21 ` Gerrit Renker 2008-04-15 18:01 ` Tomasz Grobelny 2008-04-15 19:38 ` Tomasz Grobelny 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-15 15:21 UTC (permalink / raw) To: Tomasz Grobelny, Arnaldo Carvalho de Melo, dccp, netdev | | What happens if application is compiled with dccp_packet_info containing only | | one field (priority) and kernel containing two fields (for example priority | | and later added expiry time)? Wouldn't that implementation make extensions to | | dccp_packet_info impossible? | Had not looked at it from this point of view. So you are suggesting | | if (msg->msg_control != NULL && msg->msg_controllen >= sizeof(*dcp)) | dcp = (struct dccp_packet_info *)msg->msg_control; | | instead? Agreed, point #1 to change in the patch. | No, the above is nonsense. If the application uses an earlier version of the API, then it needs to be recompiled, there is no way it could pretend to have an adequate size. So my take is that still the '==' is correct. But maybe this is a known problem? The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 15:21 ` Gerrit Renker @ 2008-04-15 18:01 ` Tomasz Grobelny 2008-04-16 6:20 ` Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-15 18:01 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Dnia Tuesday 15 of April 2008, Gerrit Renker napisał: > | | What happens if application is compiled with dccp_packet_info > | | containing only one field (priority) and kernel containing two fields > | | (for example priority and later added expiry time)? Wouldn't that > | | implementation make extensions to dccp_packet_info impossible? > | > | Had not looked at it from this point of view. So you are suggesting > | > | if (msg->msg_control != NULL && msg->msg_controllen >= sizeof(*dcp)) > | dcp = (struct dccp_packet_info *)msg->msg_control; > | > | instead? Agreed, point #1 to change in the patch. > > No, the above is nonsense. If the application uses an earlier version of > the API, then it needs to be recompiled, there is no way it could pretend > to have an adequate size. > That would effectively stop development of any policy. And it's not about pretending the structure has different size, it's just about using only the data that is provided by the application (even though it may be incomplete). > So my take is that still the '==' is correct. But maybe this is a known > problem? > I can't see a problem with copying smaller struct to bigger one. It would work like that: 1. set default values for in-kernel structure, 2. copy min(msg_control, sizeof(struct dccp_packet_info)) bytes from userspace structure to the kernel one. In that sense the code above is in fact wrong but the idea alone should be ok. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 18:01 ` Tomasz Grobelny @ 2008-04-16 6:20 ` Gerrit Renker 2008-04-16 8:36 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker 2008-04-17 20:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny 0 siblings, 2 replies; 47+ messages in thread From: Gerrit Renker @ 2008-04-16 6:20 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | > | | What happens if application is compiled with dccp_packet_info | > | | containing only one field (priority) and kernel containing two fields | > | | (for example priority and later added expiry time)? Wouldn't that | > | | implementation make extensions to dccp_packet_info impossible? | > | | > | Had not looked at it from this point of view. So you are suggesting | > | | > | if (msg->msg_control != NULL && msg->msg_controllen >= sizeof(*dcp)) | > | dcp = (struct dccp_packet_info *)msg->msg_control; | > | | > | instead? Agreed, point #1 to change in the patch. | > | > No, the above is nonsense. If the application uses an earlier version of | > the API, then it needs to be recompiled, there is no way it could pretend | > to have an adequate size. | > | That would effectively stop development of any policy. And it's not about | pretending the structure has different size, it's just about using only the | data that is provided by the application (even though it may be incomplete). | | > So my take is that still the '==' is correct. But maybe this is a known | > problem? | > | I can't see a problem with copying smaller struct to bigger one. It would work | like that: | 1. set default values for in-kernel structure, | 2. copy min(msg_control, sizeof(struct dccp_packet_info)) bytes from userspace | structure to the kernel one. | | In that sense the code above is in fact wrong but the idea alone should be ok. | -- There is a more serious problem here: apparently no one tried to compile the `qpolicy' subtree since the changes on Monday. It fails with the BUILD_BUG_ON. I tried yesterday evening and found that there is not even a possibility of adding a single field to struct dccp_skb_cb: with the addition of the inet{,6}_skb_parm, the struct has reached its maximum size of 48 bytes (try printk("%u", sizeof(struct dccp_skb_cb));). There is a solution to this, will post a revised patch shortly. With regard to your point above: I think there is a way of keeping the framework open without bending over backwards to ensure the binary compatibility. My question for you in this regard: which policy can you think of whose priorities/preferences can not be mapped into an unsigned 32-bit integer? I think that such a number is sufficiently expressive enough to cater for a wide range of policies. It can be interpreted as * symbolic value (enum) * ascii string (similar to DCCP service code) * millisecond timeout (with room for up to 7 weeks) * microsecond timeout (with room for up to 1.19 hours) * priority values (the large range 0..2^32-1 can be partitioned) * ... ? The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* [DCCP] [RFC] [Patchv3 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-16 6:20 ` Gerrit Renker @ 2008-04-16 8:36 ` Gerrit Renker 2008-04-17 20:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny 1 sibling, 0 replies; 47+ messages in thread From: Gerrit Renker @ 2008-04-16 8:36 UTC (permalink / raw) To: Tomasz Grobelny, Arnaldo Carvalho de Melo, dccp, netdev This is a reworked version of the patch set, tackling in particular: With the recent addition of the inet{,6}_skb_parm to struct dccp_skb_cb there is no more room left for any extensions -- the size of the struct now has already reached its maximum of 48 bytes. To support the queueing policy patches developed by Tomasz, this patch therefore introduces a different concept - to reuse the @priority field of struct sk_buff. This is possible since all operations are output-only and since skb->priority is set from sk_priority in net/ipv4/ip_output.c and net/ipv6/ip6_output.c. sk_priority itself is set from the SO_PRIORITY value in net/core/sock.ci, so that until the skb is forwarded to the IP layer there is no conflict. This patch set uses the skb->priority field for DCCP transport-layer queueing priorities. It does not clear the priority field when passing the skb on to the IP layer; maybe this should be added. Other changes: -------------- 1) the "prio" policy previously used the ordering "lowest priority is best", whereas the SO_PRIORITY setting underlying sk_priority uses "highest priority is best". To make the socket API a bit more intuitive, this patch reuses the SO_PRIORITY semantics for the "prio" policy -- as remarked in the documentation. 2) Consolidated the error checking for push() by * passing a struct msghdr* instead of its two control member elements, * setting the skb->priority only if the msg_controllen matches exactly the sizeof(u32), * otherwise, the priority is set to 0 (implies lowest possible priority). 3) Made naming scheme consistent (dccp_qpolicy_xxx for externally visible functions and qpolicy_<policyname>_xxx for internal functions). 4) Added a drop function - I think the simple_push() has become quite good now. So far compile-tested only. Gerrit --- Documentation/networking/dccp.txt | 15 ++++ include/linux/dccp.h | 13 +++ net/dccp/Makefile | 2 net/dccp/dccp.h | 13 +++ net/dccp/output.c | 7 - net/dccp/proto.c | 26 ++++++- net/dccp/qpolicy.c | 135 ++++++++++++++++++++++++++++++++++++++ 7 files changed, 203 insertions(+), 8 deletions(-) --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -45,6 +45,21 @@ http://linux-net.osdl.org/index.php/DCCP Socket options ============== +DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes +a policy ID as argument and can only be set before the connection (i.e. changes +during an established connection are not supported). Currently, two policies are +defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special, +and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an +u32 priority value as ancillary data to sendmsg(), where higher numbers indicate +a higher packet priority (similar to SO_PRIORITY). + +DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero +value is always interpreted as unbounded queue length. If different from zero, +the interpretation of this parameter depends on the current dequeuing policy +(see above): the "simple" policy will enforce a fixed queue size by returning +EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the +lowest-priority packet first. The default value for this parameter is +initialised from /proc/sys/net/dccp/default/tx_qlen. DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of service codes (RFC 4340, sec. 8.1.2); if this socket option is not set, --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -195,6 +195,13 @@ enum dccp_feature_numbers { DCCPF_MAX_CCID_SPECIFIC = 255, }; +/* DCCP priorities for outgoing/queued packets */ +enum dccp_packet_dequeueing_policy { + DCCPQ_POLICY_SIMPLE, + DCCPQ_POLICY_PRIO, + DCCPQ_POLICY_MAX +}; + /* DCCP socket options */ #define DCCP_SOCKOPT_PACKET_SIZE 1 /* XXX deprecated, without effect */ #define DCCP_SOCKOPT_SERVICE 2 @@ -208,6 +215,8 @@ enum dccp_feature_numbers { #define DCCP_SOCKOPT_CCID 13 #define DCCP_SOCKOPT_TX_CCID 14 #define DCCP_SOCKOPT_RX_CCID 15 +#define DCCP_SOCKOPT_QPOLICY_ID 16 +#define DCCP_SOCKOPT_QPOLICY_TXQLEN 17 #define DCCP_SOCKOPT_CCID_RX_INFO 128 #define DCCP_SOCKOPT_CCID_TX_INFO 192 @@ -459,6 +468,8 @@ struct dccp_ackvec; * @dccps_hc_rx_ccid - CCID used for the receiver (or receiving half-connection) * @dccps_hc_tx_ccid - CCID used for the sender (or sending half-connection) * @dccps_options_received - parsed set of retrieved options + * @dccps_qpolicy - TX dequeueing policy, one of %dccp_packet_dequeueing_policy + * @dccps_tx_qlen - maximum length of the TX queue * @dccps_role - role of this sock, one of %dccp_role * @dccps_hc_rx_insert_options - receiver wants to add options when acking * @dccps_hc_tx_insert_options - sender wants to add options when sending @@ -501,6 +512,8 @@ struct dccp_sock { struct ccid *dccps_hc_rx_ccid; struct ccid *dccps_hc_tx_ccid; struct dccp_options_received dccps_options_received; + __u8 dccps_qpolicy; + __u32 dccps_tx_qlen; enum dccp_role dccps_role:2; __u8 dccps_hc_rx_insert_options:1; __u8 dccps_hc_tx_insert_options:1; --- /dev/null +++ b/net/dccp/qpolicy.c @@ -0,0 +1,135 @@ +/* + * net/dccp/qpolicy.c + * + * An implementation of the DCCP protocol + * + * Copyright (c) 2008 Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License v2 + * as published by the Free Software Foundation. + */ +#include <asm/unaligned.h> +#include "dccp.h" + +/* + * Simple Dequeueing Policy: + * If tx_qlen is different from 0, enqueue up to tx_qlen elements. + */ +static void qpolicy_simple_push(struct sock *sk, struct sk_buff *skb) +{ + skb_queue_tail(&sk->sk_write_queue, skb); +} + +static bool qpolicy_simple_full(struct sock *sk) +{ + return dccp_sk(sk)->dccps_tx_qlen && + sk->sk_write_queue.qlen >= dccp_sk(sk)->dccps_tx_qlen; +} + +static struct sk_buff *qpolicy_simple_top(struct sock *sk) +{ + return skb_peek(&sk->sk_write_queue); +} + +/* + * Priority-based Dequeueing Policy: + * If tx_qlen is different from 0 and the queue has reached its upper bound + * of tx_qlen elements, replace older packets lowest-priority-first. + */ +static struct sk_buff *qpolicy_prio_best_skb(struct sock *sk) +{ + struct sk_buff *skb, *best = NULL; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (best == NULL || skb->priority > best->priority) + best = skb; + return best; +} + +static struct sk_buff *qpolicy_prio_worst_skb(struct sock *sk) +{ + struct sk_buff *skb, *worst = NULL; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (worst == NULL || skb->priority < worst->priority) + worst = skb; + return worst; +} + +static void qpolicy_prio_push(struct sock *sk, struct sk_buff *skb) +{ + qpolicy_simple_push(sk, skb); + if (qpolicy_simple_full(sk)) + dccp_qpolicy_drop(sk, qpolicy_prio_worst_skb(sk)); +} + +/* we can always push a packet into the queue => queue is never full */ +static bool qpolicy_prio_full(struct sock *sk) +{ + return false; +} + +/** + * struct dccp_qpolicy_operations - TX Packet Dequeueing Interface + * @push: add a new @skb with possibly a struct dccp_packet_info + * @full: indicates that no more packets will be admitted + * @top: peeks at whatever the queueing policy defines as top + */ +static struct dccp_qpolicy_operations { + void (*push) (struct sock *sk, struct sk_buff *skb); + bool (*full) (struct sock *sk); + struct sk_buff* (*top) (struct sock *sk); + +} qpol_table[DCCPQ_POLICY_MAX] = { + [DCCPQ_POLICY_SIMPLE] = { + .push = qpolicy_simple_push, + .full = qpolicy_simple_full, + .top = qpolicy_simple_top, + }, + [DCCPQ_POLICY_PRIO] = { + .push = qpolicy_prio_push, + .full = qpolicy_prio_full, + .top = qpolicy_prio_best_skb, + }, +}; + +/* + * Externally visible interface + */ +void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb, struct msghdr *msg) +{ + if (msg->msg_control == NULL || msg->msg_controllen != sizeof(__u32)) + skb->priority = 0; /* implies lowest-possible priority */ + else + skb->priority = get_unaligned((__u32 *)msg->msg_control); + + qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb); +} + +bool dccp_qpolicy_full(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].full(sk); +} + +void dccp_qpolicy_drop(struct sock *sk, struct sk_buff *skb) +{ + if (skb != NULL) { + skb_unlink(skb, &sk->sk_write_queue); + kfree(skb); + } +} + +struct sk_buff *dccp_qpolicy_top(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].top(sk); +} + +struct sk_buff *dccp_qpolicy_pop(struct sock *sk) +{ + struct sk_buff *skb = dccp_qpolicy_top(sk); + + if (skb) + skb_unlink(skb, &sk->sk_write_queue); + return skb; +} --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -215,6 +215,19 @@ extern void dccp_reqsk_send_ack(struct s extern void dccp_send_sync(struct sock *sk, const u64 seq, const enum dccp_pkt_type pkt_type); +/* + * TX Packet Dequeueing Interface + */ +extern void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb, + struct msghdr *msg); +extern bool dccp_qpolicy_full(struct sock *sk); +extern void dccp_qpolicy_drop(struct sock *sk, struct sk_buff *skb); +extern struct sk_buff *dccp_qpolicy_top(struct sock *sk); +extern struct sk_buff *dccp_qpolicy_pop(struct sock *sk); + +/* + * TX Packet Output and TX Timers + */ extern void dccp_write_xmit(struct sock *sk); extern void dccp_write_space(struct sock *sk); extern void dccp_flush_write_queue(struct sock *sk, long *time_budget); --- a/net/dccp/Makefile +++ b/net/dccp/Makefile @@ -1,7 +1,7 @@ obj-$(CONFIG_IP_DCCP) += dccp.o dccp_ipv4.o dccp-y := ccid.o feat.o input.o minisocks.o options.o \ - output.o proto.o timer.o ackvec.o + qpolicy.o output.o proto.o timer.o ackvec.o dccp_ipv4-y := ipv4.o --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -238,7 +238,7 @@ static void dccp_xmit_packet(struct sock { int err, len; struct dccp_sock *dp = dccp_sk(sk); - struct sk_buff *skb = skb_dequeue(&sk->sk_write_queue); + struct sk_buff *skb = dccp_qpolicy_pop(sk); if (unlikely(skb == NULL)) return; @@ -328,7 +328,7 @@ void dccp_write_xmit(struct sock *sk) struct dccp_sock *dp = dccp_sk(sk); struct sk_buff *skb; - while ((skb = skb_peek(&sk->sk_write_queue))) { + while ((skb = dccp_qpolicy_top(sk))) { int rc = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb); switch (ccid_packet_dequeue_eval(rc)) { @@ -342,8 +342,7 @@ void dccp_write_xmit(struct sock *sk) dccp_xmit_packet(sk); break; case CCID_PACKET_ERR: - skb_dequeue(&sk->sk_write_queue); - kfree_skb(skb); + dccp_qpolicy_drop(sk, skb); dccp_pr_debug("packet discarded due to err=%d\n", rc); } } --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -187,6 +187,7 @@ int dccp_init_sock(struct sock *sk, cons dp->dccps_rate_last = jiffies; dp->dccps_role = DCCP_ROLE_UNDEFINED; dp->dccps_service = DCCP_SERVICE_CODE_IS_ABSENT; + dp->dccps_tx_qlen = sysctl_dccp_tx_qlen; INIT_LIST_HEAD(&dp->dccps_featneg); /* control socket doesn't need feat nego */ @@ -540,6 +541,20 @@ static int do_dccp_setsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: err = dccp_setsockopt_cscov(sk, val, true); break; + case DCCP_SOCKOPT_QPOLICY_ID: + if (sk->sk_state != DCCP_CLOSED) + err = -EISCONN; + else if (val < 0 || val >= DCCPQ_POLICY_MAX) + err = -EINVAL; + else + dp->dccps_qpolicy = val; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + if (val < 0) + err = -EINVAL; + else + dp->dccps_tx_qlen = val; + break; default: err = -ENOPROTOOPT; break; @@ -643,6 +658,12 @@ static int do_dccp_getsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: val = dp->dccps_pcrlen; break; + case DCCP_SOCKOPT_QPOLICY_ID: + val = dp->dccps_qpolicy; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + val = dp->dccps_tx_qlen; + break; case 128 ... 191: return ccid_hc_rx_getsockopt(dp->dccps_hc_rx_ccid, sk, optname, len, (u32 __user *)optval, optlen); @@ -700,8 +721,7 @@ int dccp_sendmsg(struct kiocb *iocb, str lock_sock(sk); - if (sysctl_dccp_tx_qlen && - (sk->sk_write_queue.qlen >= sysctl_dccp_tx_qlen)) { + if (dccp_qpolicy_full(sk)) { rc = -EAGAIN; goto out_release; } @@ -729,7 +749,7 @@ int dccp_sendmsg(struct kiocb *iocb, str if (rc != 0) goto out_discard; - skb_queue_tail(&sk->sk_write_queue, skb); + dccp_qpolicy_push(sk, skb, msg); dccp_write_xmit(sk); out_release: release_sock(sk); The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-16 6:20 ` Gerrit Renker 2008-04-16 8:36 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker @ 2008-04-17 20:03 ` Tomasz Grobelny 2008-04-18 10:13 ` Gerrit Renker 1 sibling, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-17 20:03 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Dnia Wednesday 16 of April 2008, Gerrit Renker napisał: > | > | | What happens if application is compiled with dccp_packet_info > | > | | containing only one field (priority) and kernel containing two > | > | | fields (for example priority and later added expiry time)? Wouldn't > | > | | that implementation make extensions to dccp_packet_info impossible? > | > | > | > | Had not looked at it from this point of view. So you are suggesting > | > | > | > | if (msg->msg_control != NULL && msg->msg_controllen >= sizeof(*dcp)) > | > | dcp = (struct dccp_packet_info *)msg->msg_control; > | > | > | > | instead? Agreed, point #1 to change in the patch. > | > > | > No, the above is nonsense. If the application uses an earlier version > | > of the API, then it needs to be recompiled, there is no way it could > | > pretend to have an adequate size. > | > | That would effectively stop development of any policy. And it's not about > | pretending the structure has different size, it's just about using only > | the data that is provided by the application (even though it may be > | incomplete). > | > | > So my take is that still the '==' is correct. But maybe this is a known > | > problem? > | > | I can't see a problem with copying smaller struct to bigger one. It would > | work like that: > | 1. set default values for in-kernel structure, > | 2. copy min(msg_control, sizeof(struct dccp_packet_info)) bytes from > | userspace structure to the kernel one. > | > | In that sense the code above is in fact wrong but the idea alone should > | be ok. -- > > There is a more serious problem here: apparently no one tried to compile > the `qpolicy' subtree since the changes on Monday. It fails with the > BUILD_BUG_ON. > > I tried yesterday evening and found that there is not even a possibility > of adding a single field to struct dccp_skb_cb: with the addition of the > inet{,6}_skb_parm, the struct has reached its maximum size of 48 bytes > (try printk("%u", sizeof(struct dccp_skb_cb));). > > There is a solution to this, will post a revised patch shortly. > What you proposed in the patch should work ok for qpolicies for now. But sooner or later the need to add a field to struct dccp_skb_cb will arise. So maybe we should think about it now... Other possibility apart from what you proposed in the patch is creating struct dccp_skb_cb_ext, move to it some fields from struct dccp_skb_cb and in struct add a pointer to this new struct dccp_skb_cb_ext. Ok, maybe it is not the prettiest idea but in case somebody needs yet another additional field in struct dccp_skb_cb_ext it would be nice to have an idea how to do it. > With regard to your point above: I think there is a way of keeping the > framework open without bending over backwards to ensure the binary > compatibility. > While we are developing in the test tree compatibility is not important at all. But real world needs compatibility, especially if it's not that expensive. Otherwise people will get weird behaviour without any messages indicating what is wrong. And it's certainly not how code should be written. It can't be that binary package of VLC (or any other streaming server) cannot use newer kernel version because we added a new field. > My question for you in this regard: which policy can you think of whose > priorities/preferences can not be mapped into an unsigned 32-bit > integer? I think that such a number is sufficiently expressive enough > to cater for a wide range of policies. It can be interpreted as > * symbolic value (enum) > * ascii string (similar to DCCP service code) > * millisecond timeout (with room for up to 7 weeks) > * microsecond timeout (with room for up to 1.19 hours) > * priority values (the large range 0..2^32-1 can be partitioned) > * ... ? > Putting any of those values in an integer is pretty straightforward. But when you want to put both timeout and priority things get messy. Would it be possible at least to use structure like that: struct dccp_packet_info { s8 priority; u16 timeout_mantissa:10; u16 timeout_exponent:6; } ? That would still fit in 32-bit integer (so could be stored in skb->priority), but would provide a much cleaner interface. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-17 20:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny @ 2008-04-18 10:13 ` Gerrit Renker 2008-04-19 20:42 ` Tomasz Grobelny 0 siblings, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-18 10:13 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | > There is a more serious problem here: apparently no one tried to compile | > the `qpolicy' subtree since the changes on Monday. It fails with the | > BUILD_BUG_ON. | > | > I tried yesterday evening and found that there is not even a possibility | > of adding a single field to struct dccp_skb_cb: with the addition of the | > inet{,6}_skb_parm, the struct has reached its maximum size of 48 bytes | > (try printk("%u", sizeof(struct dccp_skb_cb));). | > | > There is a solution to this, will post a revised patch shortly. | > | What you proposed in the patch should work ok for qpolicies for now. But | sooner or later the need to add a field to struct dccp_skb_cb will arise. So | maybe we should think about it now... | Other possibility apart from what you proposed in the patch is creating struct | dccp_skb_cb_ext, move to it some fields from struct dccp_skb_cb and in struct | add a pointer to this new struct dccp_skb_cb_ext. Ok, maybe it is not the | prettiest idea but in case somebody needs yet another additional field in | struct dccp_skb_cb_ext it would be nice to have an idea how to do it. | When the patch failed to compile I thought about those alternatives. Trying to extend the dccp_skb_cb over and above what is in there will be messy, since the IPv4/v6 parameters are required by other subsystems. And the parts that are already in there are needed by DCCP: sequence/Ack number, reset code and reset data, packet type and CCVal are required information to build the packet. | > With regard to your point above: I think there is a way of keeping the | > framework open without bending over backwards to ensure the binary | > compatibility. | > | While we are developing in the test tree compatibility is not important at | all. But real world needs compatibility, especially if it's not that | expensive. Otherwise people will get weird behaviour without any messages | indicating what is wrong. And it's certainly not how code should be written. | It can't be that binary package of VLC (or any other streaming server) cannot | use newer kernel version because we added a new field. | True, but we need to get something working first. There is no point exporting an API which only works half. In this regard, another problem with the patch has arisen: the type of passing priority information will not work on 64-bit architectures in compatibility mode. Thanks to an answer by David Miller I found this out today. You can verify this problem by running the patch e.g. on a sparc64 system: no packets will be sent, sendmsg() returns with EINVAL. I am working on this and hope to have something workable tomorrow or otherwise it will be later next week. | > My question for you in this regard: which policy can you think of whose | > priorities/preferences can not be mapped into an unsigned 32-bit | > integer? I think that such a number is sufficiently expressive enough | > to cater for a wide range of policies. It can be interpreted as | > * symbolic value (enum) | > * ascii string (similar to DCCP service code) | > * millisecond timeout (with room for up to 7 weeks) | > * microsecond timeout (with room for up to 1.19 hours) | > * priority values (the large range 0..2^32-1 can be partitioned) | > * ... ? | > | Putting any of those values in an integer is pretty straightforward. But when | you want to put both timeout and priority things get messy. Would it be | possible at least to use structure like that: | struct dccp_packet_info | { | s8 priority; | u16 timeout_mantissa:10; | u16 timeout_exponent:6; | } | ? That would still fit in 32-bit integer (so could be stored in | skb->priority), but would provide a much cleaner interface. | -- Yes that is possible. It need not be a struct, the same effect can be achieved with bit shifts and bitmasks (this is the way the Ack Vectors are encoded). Semantically there is no restriction of what the u32 number represents. And it is a large space. I don't think that a larger size is required. And it makes the implementation much simpler, by reusing a field which is already there and which matches the purpose. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-18 10:13 ` Gerrit Renker @ 2008-04-19 20:42 ` Tomasz Grobelny 2008-04-20 16:57 ` Arnaldo Carvalho de Melo 2008-04-22 17:30 ` Gerrit Renker 0 siblings, 2 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-19 20:42 UTC (permalink / raw) To: Gerrit Renker; +Cc: Arnaldo Carvalho de Melo, dccp, netdev Dnia Friday 18 of April 2008, Gerrit Renker napisał: > | > There is a more serious problem here: apparently no one tried to > | > compile the `qpolicy' subtree since the changes on Monday. It fails > | > with the BUILD_BUG_ON. > | > > | > I tried yesterday evening and found that there is not even a > | > possibility of adding a single field to struct dccp_skb_cb: with the > | > addition of the inet{,6}_skb_parm, the struct has reached its maximum > | > size of 48 bytes (try printk("%u", sizeof(struct dccp_skb_cb));). > | > > | > There is a solution to this, will post a revised patch shortly. > | > | What you proposed in the patch should work ok for qpolicies for now. But > | sooner or later the need to add a field to struct dccp_skb_cb will arise. > | So maybe we should think about it now... > | Other possibility apart from what you proposed in the patch is creating > | struct dccp_skb_cb_ext, move to it some fields from struct dccp_skb_cb > | and in struct add a pointer to this new struct dccp_skb_cb_ext. Ok, maybe > | it is not the prettiest idea but in case somebody needs yet another > | additional field in struct dccp_skb_cb_ext it would be nice to have an > | idea how to do it. > > When the patch failed to compile I thought about those alternatives. > Trying to extend the dccp_skb_cb over and above what is in there will be > messy, since the IPv4/v6 parameters are required by other subsystems. > If inet{,6}_skb_parm is used only outside DCCP code then why at all should it be placed in struct dccp_skb_cb taking up quite a lot of valuable space? Why not put it directly in struct sk_buff? Especially that it is present in struct udp_skb_cb, struct tcp_skb_cb as well. > | > With regard to your point above: I think there is a way of keeping the > | > framework open without bending over backwards to ensure the binary > | > compatibility. > | > | While we are developing in the test tree compatibility is not important > | at all. But real world needs compatibility, especially if it's not that > | expensive. Otherwise people will get weird behaviour without any messages > | indicating what is wrong. And it's certainly not how code should be > | written. It can't be that binary package of VLC (or any other streaming > | server) cannot use newer kernel version because we added a new field. > > True, but we need to get something working first. There is no point > exporting an API which only works half. > You can never assume that the API is feature complete. Even if it "works half" at the beginning (which is I guess quite normal) it should be easy to make it "work full" without breaking compatibility when it's not necessary. > In this regard, another problem with the patch has arisen: the type of > passing priority information will not work on 64-bit architectures in > compatibility mode. > > Thanks to an answer by David Miller I found this out today. You can > verify this problem by running the patch e.g. on a sparc64 system: no > packets will be sent, sendmsg() returns with EINVAL. > I don't have any 64 bit system at hand... Does David's suggestion mean that changes are only required in userspace application or on the kernel side as well? > | > My question for you in this regard: which policy can you think of whose > | > priorities/preferences can not be mapped into an unsigned 32-bit > | > integer? I think that such a number is sufficiently expressive enough > | > to cater for a wide range of policies. It can be interpreted as > | > * symbolic value (enum) > | > * ascii string (similar to DCCP service code) > | > * millisecond timeout (with room for up to 7 weeks) > | > * microsecond timeout (with room for up to 1.19 hours) > | > * priority values (the large range 0..2^32-1 can be partitioned) > | > * ... ? > | > | Putting any of those values in an integer is pretty straightforward. But > | when you want to put both timeout and priority things get messy. Would it > | be possible at least to use structure like that: > | struct dccp_packet_info > | { > | s8 priority; > | u16 timeout_mantissa:10; > | u16 timeout_exponent:6; > | } > | ? That would still fit in 32-bit integer (so could be stored in > | skb->priority), but would provide a much cleaner interface. > | -- > > Yes that is possible. It need not be a struct, the same effect can be > achieved with bit shifts and bitmasks (this is the way the Ack Vectors > are encoded). > Yes, it can be achieved by bit operations. But to me a structure is a much cleaner way. And that is especially important when designing interfaces. Final effects should be the same but when using structure it is the compiler that takes care of low level bit operations. > Semantically there is no restriction of what the u32 number represents. And > it is a large space. I don't think that a larger size is required. > Ok, for now 32 bits are enough and I can't think of the need to use more bits. I wouldn't be so sure about future though. But ok, if it turns up that we need more space in the future then we will see, there is no need to worry about it now. We can indeed assume 32 bits for now. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-19 20:42 ` Tomasz Grobelny @ 2008-04-20 16:57 ` Arnaldo Carvalho de Melo 2008-04-20 20:12 ` Tomasz Grobelny 2008-04-22 17:30 ` Gerrit Renker 1 sibling, 1 reply; 47+ messages in thread From: Arnaldo Carvalho de Melo @ 2008-04-20 16:57 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Em Sat, Apr 19, 2008 at 10:42:32PM +0200, Tomasz Grobelny escreveu: > Dnia Friday 18 of April 2008, Gerrit Renker napisał: > > | > There is a more serious problem here: apparently no one tried to > > | > compile the `qpolicy' subtree since the changes on Monday. It fails > > | > with the BUILD_BUG_ON. > > | > > > | > I tried yesterday evening and found that there is not even a > > | > possibility of adding a single field to struct dccp_skb_cb: with the > > | > addition of the inet{,6}_skb_parm, the struct has reached its maximum > > | > size of 48 bytes (try printk("%u", sizeof(struct dccp_skb_cb));). > > | > > > | > There is a solution to this, will post a revised patch shortly. > > | > > | What you proposed in the patch should work ok for qpolicies for now. But > > | sooner or later the need to add a field to struct dccp_skb_cb will arise. > > | So maybe we should think about it now... > > | Other possibility apart from what you proposed in the patch is creating > > | struct dccp_skb_cb_ext, move to it some fields from struct dccp_skb_cb > > | and in struct add a pointer to this new struct dccp_skb_cb_ext. Ok, maybe > > | it is not the prettiest idea but in case somebody needs yet another > > | additional field in struct dccp_skb_cb_ext it would be nice to have an > > | idea how to do it. > > > > When the patch failed to compile I thought about those alternatives. > > Trying to extend the dccp_skb_cb over and above what is in there will be > > messy, since the IPv4/v6 parameters are required by other subsystems. > > > If inet{,6}_skb_parm is used only outside DCCP code then why at all should it > be placed in struct dccp_skb_cb taking up quite a lot of valuable space? Why > not put it directly in struct sk_buff? Especially that it is present in > struct udp_skb_cb, struct tcp_skb_cb as well. Because all this is used in skb->cb[], a scratchpad for protocols to use, we can go back to what we had before, that is to not reserve use for inet6?_skb_parm but be sure to zero it before passing it to IP, as we don't want IP to be confused with things being non zero there. Then we can use all its space. - Arnaldo ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-20 16:57 ` Arnaldo Carvalho de Melo @ 2008-04-20 20:12 ` Tomasz Grobelny 2008-04-21 11:45 ` Patrick McHardy 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-20 20:12 UTC (permalink / raw) To: Arnaldo Carvalho de Melo; +Cc: Gerrit Renker, dccp, netdev, Patrick McHardy Dnia Sunday 20 of April 2008, Arnaldo Carvalho de Melo napisał: > Em Sat, Apr 19, 2008 at 10:42:32PM +0200, Tomasz Grobelny escreveu: > > > When the patch failed to compile I thought about those alternatives. > > > Trying to extend the dccp_skb_cb over and above what is in there will > > > be messy, since the IPv4/v6 parameters are required by other > > > subsystems. > > > > If inet{,6}_skb_parm is used only outside DCCP code then why at all > > should it be placed in struct dccp_skb_cb taking up quite a lot of > > valuable space? Why not put it directly in struct sk_buff? Especially > > that it is present in struct udp_skb_cb, struct tcp_skb_cb as well. > > Because all this is used in skb->cb[], a scratchpad for protocols to > use, we can go back to what we had before, that is to not reserve use > for inet6?_skb_parm but be sure to zero it before passing it to IP, as > we don't want IP to be confused with things being non zero there. Then > we can use all its space. > Several questions regarding this case: 1. What about SCTP? It doesn't have inet6?_skb_parm in it's structure that is stored in skb->cb. So does it contain a potential bug (that is to be fixed) or is it not needed there or what? 2. If the sole purpose of this change was to keep skb->cb zeroed then it doesn't seem to me like the right solution. Wasting about 20 bytes instead of zeroing them when needed I would consider at least weird. I understand that TCP and UDP may have enough space left but it just turned out that DCCP doesn't. 3. If it's IP layer that needs zeroes then why not clear skb->cb in IP layer? -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-20 20:12 ` Tomasz Grobelny @ 2008-04-21 11:45 ` Patrick McHardy 2008-04-21 13:12 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 47+ messages in thread From: Patrick McHardy @ 2008-04-21 11:45 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, Gerrit Renker, dccp, netdev Tomasz Grobelny wrote: > Dnia Sunday 20 of April 2008, Arnaldo Carvalho de Melo napisał: >> Em Sat, Apr 19, 2008 at 10:42:32PM +0200, Tomasz Grobelny escreveu: >>>> When the patch failed to compile I thought about those alternatives. >>>> Trying to extend the dccp_skb_cb over and above what is in there will >>>> be messy, since the IPv4/v6 parameters are required by other >>>> subsystems. >>> If inet{,6}_skb_parm is used only outside DCCP code then why at all >>> should it be placed in struct dccp_skb_cb taking up quite a lot of >>> valuable space? Why not put it directly in struct sk_buff? Especially >>> that it is present in struct udp_skb_cb, struct tcp_skb_cb as well. >> Because all this is used in skb->cb[], a scratchpad for protocols to >> use, we can go back to what we had before, that is to not reserve use >> for inet6?_skb_parm but be sure to zero it before passing it to IP, as >> we don't want IP to be confused with things being non zero there. Then >> we can use all its space. >> > Several questions regarding this case: > 1. What about SCTP? It doesn't have inet6?_skb_parm in it's structure that is > stored in skb->cb. So does it contain a potential bug (that is to be fixed) > or is it not needed there or what? Judging by a quick grep, SCTP only uses the CB on input and appears to be fine. > 2. If the sole purpose of this change was to keep skb->cb zeroed then it > doesn't seem to me like the right solution. Wasting about 20 bytes instead of > zeroing them when needed I would consider at least weird. I understand that > TCP and UDP may have enough space left but it just turned out that DCCP > doesn't. It was the safest solution that late in a release. It also avoids to memset the cb unnecessarily. If the room is not enough anymore, its easy to go back to using memset. > 3. If it's IP layer that needs zeroes then why not clear skb->cb in IP layer? That would certainly work, but it adds unnecessary costs for the other protocols that don't need this. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-21 11:45 ` Patrick McHardy @ 2008-04-21 13:12 ` Arnaldo Carvalho de Melo 2008-04-21 16:17 ` Tomasz Grobelny 0 siblings, 1 reply; 47+ messages in thread From: Arnaldo Carvalho de Melo @ 2008-04-21 13:12 UTC (permalink / raw) To: Patrick McHardy; +Cc: Tomasz Grobelny, Gerrit Renker, dccp, netdev Em Mon, Apr 21, 2008 at 01:45:20PM +0200, Patrick McHardy escreveu: > Tomasz Grobelny wrote: >> Dnia Sunday 20 of April 2008, Arnaldo Carvalho de Melo napisał: >>> Em Sat, Apr 19, 2008 at 10:42:32PM +0200, Tomasz Grobelny escreveu: >>>>> When the patch failed to compile I thought about those alternatives. >>>>> Trying to extend the dccp_skb_cb over and above what is in there will >>>>> be messy, since the IPv4/v6 parameters are required by other >>>>> subsystems. >>>> If inet{,6}_skb_parm is used only outside DCCP code then why at all >>>> should it be placed in struct dccp_skb_cb taking up quite a lot of >>>> valuable space? Why not put it directly in struct sk_buff? Especially >>>> that it is present in struct udp_skb_cb, struct tcp_skb_cb as well. >>> Because all this is used in skb->cb[], a scratchpad for protocols to >>> use, we can go back to what we had before, that is to not reserve use >>> for inet6?_skb_parm but be sure to zero it before passing it to IP, as >>> we don't want IP to be confused with things being non zero there. Then >>> we can use all its space. >>> >> Several questions regarding this case: >> 1. What about SCTP? It doesn't have inet6?_skb_parm in it's structure that >> is stored in skb->cb. So does it contain a potential bug (that is to be >> fixed) or is it not needed there or what? > > Judging by a quick grep, SCTP only uses the CB on input and > appears to be fine. Thanks for checking! >> 2. If the sole purpose of this change was to keep skb->cb zeroed then it >> doesn't seem to me like the right solution. Wasting about 20 bytes instead >> of zeroing them when needed I would consider at least weird. I understand >> that TCP and UDP may have enough space left but it just turned out that >> DCCP doesn't. > > It was the safest solution that late in a release. It also > avoids to memset the cb unnecessarily. If the room is not > enough anymore, its easy to go back to using memset. Nod, if we don't need the space reserved for the lower layer protocols in DCCP it is actually the best solution, as we don't need to zero the cb again before passing it to IP, it gets zeroed at alloc_skb time and that is it. If we need the space, we have to pay the price of memset before passing to IP. The rule here is: all layers expect that the CB be zeroed, not all of it, just the part that it uses. DCCP gets it zeroed, its only fair that it passes it to IP zeroed, and hey, not all of it, just what IP uses :-) >> 3. If it's IP layer that needs zeroes then why not clear skb->cb in IP layer? > > That would certainly work, but it adds unnecessary costs for > the other protocols that don't need this. Nod again. - Arnaldo ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-21 13:12 ` Arnaldo Carvalho de Melo @ 2008-04-21 16:17 ` Tomasz Grobelny 2008-04-22 4:56 ` Patrick McHardy 2008-04-22 17:41 ` Gerrit Renker 0 siblings, 2 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-21 16:17 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Patrick McHardy, Gerrit Renker; +Cc: dccp, netdev Dnia Monday 21 of April 2008, Arnaldo Carvalho de Melo napisał: > Em Mon, Apr 21, 2008 at 01:45:20PM +0200, Patrick McHardy escreveu: > > Tomasz Grobelny wrote: > >> 2. If the sole purpose of this change was to keep skb->cb zeroed then it > >> doesn't seem to me like the right solution. Wasting about 20 bytes > >> instead of zeroing them when needed I would consider at least weird. I > >> understand that TCP and UDP may have enough space left but it just > >> turned out that DCCP doesn't. > > > > It was the safest solution that late in a release. It also > > avoids to memset the cb unnecessarily. If the room is not > > enough anymore, its easy to go back to using memset. > > Nod, if we don't need the space reserved for the lower layer protocols > in DCCP it is actually the best solution, as we don't need to zero the > cb again before passing it to IP, it gets zeroed at alloc_skb time and > that is it. If we need the space, we have to pay the price of > memset before passing to IP. > Ok, so in this case the patch for DCCP could be reverted in test tree, is that right? Were these two deleted memsets zeroing all that was necessary or were there any other bugs fixed by the patch? If we iron this out we could finally return to the main subject of this thread. That is Patch v2 by me and Gerrit... -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-21 16:17 ` Tomasz Grobelny @ 2008-04-22 4:56 ` Patrick McHardy 2008-04-22 20:45 ` Tomasz Grobelny 2008-04-22 17:41 ` Gerrit Renker 1 sibling, 1 reply; 47+ messages in thread From: Patrick McHardy @ 2008-04-22 4:56 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, Gerrit Renker, dccp, netdev Tomasz Grobelny wrote: > Dnia Monday 21 of April 2008, Arnaldo Carvalho de Melo napisał: >> Nod, if we don't need the space reserved for the lower layer protocols >> in DCCP it is actually the best solution, as we don't need to zero the >> cb again before passing it to IP, it gets zeroed at alloc_skb time and >> that is it. If we need the space, we have to pay the price of >> memset before passing to IP. >> > Ok, so in this case the patch for DCCP could be reverted in test tree, is that > right? Were these two deleted memsets zeroing all that was necessary or were > there any other bugs fixed by the patch? No, those two memsets became unnecessary by the addition of the new cb members. If you want to remove them again, you need to add those memsets back and additionally add memsets that zero the first sizeof(inet_skb_parm)/ sizeof(inet6_skb_parm) bytes everywhere else where packets are passed to IP(v6). ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 4:56 ` Patrick McHardy @ 2008-04-22 20:45 ` Tomasz Grobelny 2008-04-22 22:06 ` David Miller 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-22 20:45 UTC (permalink / raw) To: Patrick McHardy; +Cc: Arnaldo Carvalho de Melo, Gerrit Renker, dccp, netdev Dnia Tuesday 22 of April 2008, napisałeś: > Tomasz Grobelny wrote: > > Dnia Monday 21 of April 2008, Arnaldo Carvalho de Melo napisał: > >> Nod, if we don't need the space reserved for the lower layer protocols > >> in DCCP it is actually the best solution, as we don't need to zero the > >> cb again before passing it to IP, it gets zeroed at alloc_skb time and > >> that is it. If we need the space, we have to pay the price of > >> memset before passing to IP. > > > > Ok, so in this case the patch for DCCP could be reverted in test tree, is > > that right? Were these two deleted memsets zeroing all that was necessary > > or were there any other bugs fixed by the patch? > > No, those two memsets became unnecessary by the addition > of the new cb members. If you want to remove them again, > you need to add those memsets back and additionally add > memsets that zero the first sizeof(inet_skb_parm)/ > sizeof(inet6_skb_parm) bytes everywhere else where packets > are passed to IP(v6). Maybe I wasn't clear enough with my question. I understand that these two memsets would have to be readded. But my question is: did you identify any bugs that were caused by junk in skb->cb before applying your patch? If so, do you have any test cases? It could help to check code for correctness if one day more space in skb->cb will be needed (which of course doesn't have to be in the nearest future). -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 20:45 ` Tomasz Grobelny @ 2008-04-22 22:06 ` David Miller 2008-04-23 0:03 ` Patrick McHardy 0 siblings, 1 reply; 47+ messages in thread From: David Miller @ 2008-04-22 22:06 UTC (permalink / raw) To: tomasz; +Cc: kaber, acme, gerrit, dccp, netdev From: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> Date: Tue, 22 Apr 2008 22:45:53 +0200 > Maybe I wasn't clear enough with my question. I understand that > these two memsets would have to be readded. But my question is: did > you identify any bugs that were caused by junk in skb->cb before > applying your patch? If so, do you have any test cases? It could > help to check code for correctness if one day more space in skb->cb > will be needed (which of course doesn't have to be in the nearest > future). I think Patrick noticed it because it broke netfilter. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 22:06 ` David Miller @ 2008-04-23 0:03 ` Patrick McHardy 0 siblings, 0 replies; 47+ messages in thread From: Patrick McHardy @ 2008-04-23 0:03 UTC (permalink / raw) To: David Miller; +Cc: tomasz, acme, gerrit, dccp, netdev David Miller wrote: > From: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> > Date: Tue, 22 Apr 2008 22:45:53 +0200 > >> Maybe I wasn't clear enough with my question. I understand that >> these two memsets would have to be readded. But my question is: did >> you identify any bugs that were caused by junk in skb->cb before >> applying your patch? If so, do you have any test cases? It could >> help to check code for correctness if one day more space in skb->cb >> will be needed (which of course doesn't have to be in the nearest >> future). > > I think Patrick noticed it because it broke netfilter. > Exactly, at least one path to the IP stack didn't clear the CB (I don't recall the exact function, sorry). The same is true for IPv6. The testcase was transfering data over NATed DCCP connections, some packets had garbage in IPCB(skb)->flags, which caused IP to skip the POST_ROUTING hook, breaking NAT and conntrack. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-21 16:17 ` Tomasz Grobelny 2008-04-22 4:56 ` Patrick McHardy @ 2008-04-22 17:41 ` Gerrit Renker 2008-04-22 22:42 ` David Miller 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-22 17:41 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: dccp, netdev | If we iron this out we could finally return to the main subject of this | thread. That is Patch v2 by me and Gerrit... | -- Fully agree - we just need to decide whether or not to use skb->priority. Below is as far as I got in integrating your patch last week, it shows only the major changes. The following bits have been updated: * skb->priority now cleared before passing the skb onto layer 3; * order of statements in prio_push() reversed (first dropping worst skb and then pushing the new skb - this is better when e.g. tx_qlen=1); * added general parsing routine for cmsg(3) socket control messages and defined one for the SOL_DCCP socket level; thanks to advice by Dave Miller A new and updated version has been uploaded to git://eden-feed.erg.abdn.ac.uk/dccp_exp subtree `qpolicy' The implementation needs some more testing. I have uploaded matching userspace code to http://www.erg.abdn.ac.uk/users/gerrit/dccp/packet_prio_tests.tar.gz Will send an update next week and then we can discuss my edits, it remains your patch. Simply too busy this week. Gerrit --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -195,6 +195,12 @@ enum dccp_feature_numbers { DCCPF_MAX_CCID_SPECIFIC = 255, }; +/* DCCP socket control message types for cmsg */ +enum dccp_cmsg_type { + DCCP_SCM_PRIORITY = 1, + DCCP_SCM_MAX +}; + /* DCCP priorities for outgoing/queued packets */ enum dccp_packet_dequeueing_policy { DCCPQ_POLICY_SIMPLE, --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -706,6 +706,32 @@ int compat_dccp_getsockopt(struct sock * EXPORT_SYMBOL_GPL(compat_dccp_getsockopt); #endif +static int dccp_msghdr_parse(struct msghdr *msg, __u32 **priority) +{ + struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg); + + for (; cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) { + + if (!CMSG_OK(msg, cmsg)) + return -EINVAL; + + /* Only look at DCCP-related socket control messages */ + if (cmsg->cmsg_level != SOL_DCCP) + continue; + + switch (cmsg->cmsg_type) { + case DCCP_SCM_PRIORITY: + if (cmsg->cmsg_len != CMSG_LEN(sizeof(__u32))) + return -EINVAL; + *priority = (__u32 *)CMSG_DATA(cmsg); + break; + default: + return -EINVAL; + } + } + return 0; +} + int dccp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { @@ -714,11 +740,15 @@ int dccp_sendmsg(struct kiocb *iocb, str const int noblock = flags & MSG_DONTWAIT; struct sk_buff *skb; int rc, size; + __u32 *pprio = NULL; long timeo; if (len > dp->dccps_mss_cache) return -EMSGSIZE; + if (dccp_msghdr_parse(msg, &pprio)) + return -EINVAL; + lock_sock(sk); if (dccp_qpolicy_full(sk)) { @@ -749,7 +779,9 @@ int dccp_sendmsg(struct kiocb *iocb, str if (rc != 0) goto out_discard; - dccp_qpolicy_push(sk, skb, msg); + skb->priority = pprio != NULL ? *pprio : 0; + + dccp_qpolicy_push(sk, skb); dccp_write_xmit(sk); out_release: release_sock(sk); --- a/net/dccp/qpolicy.c +++ b/net/dccp/qpolicy.c @@ -97,13 +102,8 @@ static struct dccp_qpolicy_operations { /* * Externally visible interface */ -void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb, struct msghdr *msg) +void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb) { - if (msg->msg_control == NULL || msg->msg_controllen != sizeof(__u32)) - skb->priority = 0; /* implies lowest-possible priority */ - else - skb->priority = get_unaligned((__u32 *)msg->msg_control); - qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb); } --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -51,7 +51,11 @@ during an established connection are not defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special, and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an u32 priority value as ancillary data to sendmsg(), where higher numbers indicate -a higher packet priority (similar to SO_PRIORITY). +a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to +be formatted using a cmsg(3) message header filled in as follows: + cmsg->cmsg_level = SOL_DCCP; + cmsg->cmsg_type = DCCP_SCM_PRIORITY; + cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero value is always interpreted as unbounded queue length. If different from zero, --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -257,6 +257,9 @@ static void dccp_xmit_packet(struct sock DCCP_SKB_CB(skb)->dccpd_type = DCCP_PKT_DATA; } + /* Clear priority value used by the qpolicy subsystem */ + skb->priority = 0; + err = dccp_transmit_skb(sk, skb); if (err) dccp_pr_debug("transmit_skb() returned err=%d\n", err); -- The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 17:41 ` Gerrit Renker @ 2008-04-22 22:42 ` David Miller 2008-04-25 19:33 ` Tomasz Grobelny 2008-04-28 7:21 ` Gerrit Renker 0 siblings, 2 replies; 47+ messages in thread From: David Miller @ 2008-04-22 22:42 UTC (permalink / raw) To: gerrit; +Cc: tomasz, dccp, netdev From: Gerrit Renker <gerrit@erg.abdn.ac.uk> Date: Tue, 22 Apr 2008 18:41:52 +0100 > | If we iron this out we could finally return to the main subject of this > | thread. That is Patch v2 by me and Gerrit... > | -- > Fully agree - we just need to decide whether or not to use skb->priority. > > Below is as far as I got in integrating your patch last week, it shows > only the major changes. The following bits have been updated: > > * skb->priority now cleared before passing the skb onto layer 3; > * order of statements in prio_push() reversed (first dropping worst > skb and then pushing the new skb - this is better when e.g. > tx_qlen=1); > * added general parsing routine for cmsg(3) socket control messages > and defined one for the SOL_DCCP socket level; thanks to advice > by Dave Miller If this usage of skb->priority is going to override the IP_TOS socket option setting, I don't think it's a good idea. Right now every packet output goes through ip_output.c which sets skb->priority to sk->sk_priority, which is set by the user via the IP_TOS socket option in ip_sockglue.c ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 22:42 ` David Miller @ 2008-04-25 19:33 ` Tomasz Grobelny 2008-04-25 20:40 ` Arnaldo Carvalho de Melo 2008-04-28 7:21 ` Gerrit Renker 1 sibling, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-25 19:33 UTC (permalink / raw) To: David Miller; +Cc: gerrit, dccp, netdev Dnia Wednesday 23 of April 2008, David Miller napisał: > From: Gerrit Renker <gerrit@erg.abdn.ac.uk> > Date: Tue, 22 Apr 2008 18:41:52 +0100 > > > | If we iron this out we could finally return to the main subject of this > > | thread. That is Patch v2 by me and Gerrit... > > | -- > > > > Fully agree - we just need to decide whether or not to use skb->priority. > > > > Below is as far as I got in integrating your patch last week, it shows > > only the major changes. The following bits have been updated: > > > > * skb->priority now cleared before passing the skb onto layer 3; > > * order of statements in prio_push() reversed (first dropping worst > > skb and then pushing the new skb - this is better when e.g. > > tx_qlen=1); > > * added general parsing routine for cmsg(3) socket control messages > > and defined one for the SOL_DCCP socket level; thanks to advice > > by Dave Miller > > If this usage of skb->priority is going to override the > IP_TOS socket option setting, I don't think it's a good > idea. > > Right now every packet output goes through ip_output.c > which sets skb->priority to sk->sk_priority, which is set > by the user via the IP_TOS socket option in ip_sockglue.c > But I guess this assignment happens a bit later (that is after outgoing packet leaves DCCP code). Consequently using skb->priority should not harm as it will be overwritten. Or did I miss something? -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-25 19:33 ` Tomasz Grobelny @ 2008-04-25 20:40 ` Arnaldo Carvalho de Melo 2008-04-25 20:58 ` David Miller 0 siblings, 1 reply; 47+ messages in thread From: Arnaldo Carvalho de Melo @ 2008-04-25 20:40 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: David Miller, gerrit, dccp, netdev Em Fri, Apr 25, 2008 at 09:33:11PM +0200, Tomasz Grobelny escreveu: > Dnia Wednesday 23 of April 2008, David Miller napisał: > > From: Gerrit Renker <gerrit@erg.abdn.ac.uk> > > Date: Tue, 22 Apr 2008 18:41:52 +0100 > > > > > | If we iron this out we could finally return to the main subject of this > > > | thread. That is Patch v2 by me and Gerrit... > > > | -- > > > > > > Fully agree - we just need to decide whether or not to use skb->priority. > > > > > > Below is as far as I got in integrating your patch last week, it shows > > > only the major changes. The following bits have been updated: > > > > > > * skb->priority now cleared before passing the skb onto layer 3; > > > * order of statements in prio_push() reversed (first dropping worst > > > skb and then pushing the new skb - this is better when e.g. > > > tx_qlen=1); > > > * added general parsing routine for cmsg(3) socket control messages > > > and defined one for the SOL_DCCP socket level; thanks to advice > > > by Dave Miller > > > > If this usage of skb->priority is going to override the > > IP_TOS socket option setting, I don't think it's a good > > idea. > > > > Right now every packet output goes through ip_output.c > > which sets skb->priority to sk->sk_priority, which is set > > by the user via the IP_TOS socket option in ip_sockglue.c > > > But I guess this assignment happens a bit later (that is after outgoing packet > leaves DCCP code). Consequently using skb->priority should not harm as it > will be overwritten. Or did I miss something? I haven't read all the patches, but I guess Tomasz is on the safe side as the intended skb->priority usage is limited to DCCP, when IP is handed the skb it can do as it pleases with skb->priority. - Arnaldo ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-25 20:40 ` Arnaldo Carvalho de Melo @ 2008-04-25 20:58 ` David Miller 0 siblings, 0 replies; 47+ messages in thread From: David Miller @ 2008-04-25 20:58 UTC (permalink / raw) To: acme; +Cc: tomasz, gerrit, dccp, netdev From: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Fri, 25 Apr 2008 17:40:24 -0300 > I haven't read all the patches, but I guess Tomasz is on the safe side > as the intended skb->priority usage is limited to DCCP, when IP is > handed the skb it can do as it pleases with skb->priority. Great. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 22:42 ` David Miller 2008-04-25 19:33 ` Tomasz Grobelny @ 2008-04-28 7:21 ` Gerrit Renker 2008-04-28 7:39 ` David Miller 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-28 7:21 UTC (permalink / raw) To: David Miller; +Cc: tomasz, dccp, netdev | > * skb->priority now cleared before passing the skb onto layer 3; <snip> | | If this usage of skb->priority is going to override the | IP_TOS socket option setting, I don't think it's a good | idea. | | Right now every packet output goes through ip_output.c | which sets skb->priority to sk->sk_priority, which is set | by the user via the IP_TOS socket option in ip_sockglue.c That is actually the key point I wanted to ask about. The patch exploits that the assignment happens later, on layer3, with these conditions: * Output path only (sent by the DCCP output module on this host). * It is not clear which value skb->priority has until reaching the output functions in ipv4/ip_output.c and ipv6/ip6_output.c. There is no memset on this field in net/core/skbuff.c and the skb comes from a cache, so presumably skb->priority is undefined until it is assigned from sk->sk_priority. * The assignment for appletalk (net/appletalk/aarp.c), econet, IPv4/6 raw sockets, packet sockets (af_packet.c) is similar, so the assumption was that the 4 bytes of skb->priority are not used while the skb is on layer 4. * The only place where I can see mischief happen is when there is a fork where the skb is not passed on to layer 3, using either __copy_skb_header (copies the priority field) or skb_segment. Since I can currently not see the last point happen within DCCP, I think that the use of skb->priority for layer 4 (DCCP) is safe. So the remaining question was whether maintainers would be okay with this "overloading" of the field (Arnaldo seems ok with it). Part of the patch tried to make this use obvious, by aligning the "prio" policy with the semantics of SO_PRIORITY. Gerrit The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-28 7:21 ` Gerrit Renker @ 2008-04-28 7:39 ` David Miller 0 siblings, 0 replies; 47+ messages in thread From: David Miller @ 2008-04-28 7:39 UTC (permalink / raw) To: gerrit; +Cc: tomasz, dccp, netdev From: Gerrit Renker <gerrit@erg.abdn.ac.uk> Date: Mon, 28 Apr 2008 08:21:28 +0100 > Since I can currently not see the last point happen within DCCP, I think > that the use of skb->priority for layer 4 (DCCP) is safe. Ok. > So the remaining question was whether maintainers would be okay with > this "overloading" of the field (Arnaldo seems ok with it). I'm okay with it, thanks for the explanation and analysis. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-19 20:42 ` Tomasz Grobelny 2008-04-20 16:57 ` Arnaldo Carvalho de Melo @ 2008-04-22 17:30 ` Gerrit Renker 2008-04-22 20:30 ` Tomasz Grobelny 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-22 17:30 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | If inet{,6}_skb_parm is used only outside DCCP code then why at all should it | be placed in struct dccp_skb_cb taking up quite a lot of valuable space? Why | not put it directly in struct sk_buff? Especially that it is present in | struct udp_skb_cb, struct tcp_skb_cb as well. | With or without inet{,6}_sbk_parm there are a further unused 32 bits in the cb -- by restricting the 48-bit sequence/ack numbers from their current u64 type to 48 bit only. This requires to change the parts which rely on DCCP_PKT_WITHOUT_ACK_SEQ, but that seems possible to do. So we can reduce this to the other question "is 32 bit enough for priority information". | > | While we are developing in the test tree compatibility is not important | > | at all. But real world needs compatibility, especially if it's not that | > | expensive. Otherwise people will get weird behaviour without any messages | > | indicating what is wrong. And it's certainly not how code should be | > | written. It can't be that binary package of VLC (or any other streaming | > | server) cannot use newer kernel version because we added a new field. | > | > True, but we need to get something working first. There is no point | > exporting an API which only works half. | > | You can never assume that the API is feature complete. Even if it "works half" | at the beginning (which is I guess quite normal) it should be easy to make | it "work full" without breaking compatibility when it's not necessary. | No I don't agree. Things that work half are worse than constant failures, since it is never clear when they will fail next. But maybe you didn't mean that. | > In this regard, another problem with the patch has arisen: the type of | > passing priority information will not work on 64-bit architectures in | > compatibility mode. | > | > Thanks to an answer by David Miller I found this out today. You can | > verify this problem by running the patch e.g. on a sparc64 system: no | > packets will be sent, sendmsg() returns with EINVAL. | > | I don't have any 64 bit system at hand... | Does David's suggestion mean that changes are only required in userspace | application or on the kernel side as well? | In both -- I will send an update in reply to this message and a link to some user code. Very busy right now. | > | struct dccp_packet_info | > | { | > | s8 priority; | > | u16 timeout_mantissa:10; | > | u16 timeout_exponent:6; | > | } | > | ? That would still fit in 32-bit integer (so could be stored in | > | skb->priority), but would provide a much cleaner interface. | > | -- | > | > Yes that is possible. It need not be a struct, the same effect can be | > achieved with bit shifts and bitmasks (this is the way the Ack Vectors | > are encoded). | > | Yes, it can be achieved by bit operations. But to me a structure is a much | cleaner way. And that is especially important when designing interfaces. | Final effects should be the same but when using structure it is the compiler | that takes care of low level bit operations. | You are free to use what you think is best. But I think both ways are equivalent, using e.g. typecasts to get from one to the other. I.e. I think we have no disagreement here. | > Semantically there is no restriction of what the u32 number represents. And | > it is a large space. I don't think that a larger size is required. | > | Ok, for now 32 bits are enough and I can't think of the need to use more bits. | I wouldn't be so sure about future though. But ok, if it turns up that we | need more space in the future then we will see, there is no need to worry | about it now. We can indeed assume 32 bits for now. | -- So it only remains to see whether to use a field in dccp_skb_cb, or to use skb->priority. I think the latter has advantages, since this field is only set/used for the SO_PRIORITY field (in socket(7)). The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 17:30 ` Gerrit Renker @ 2008-04-22 20:30 ` Tomasz Grobelny [not found] ` <20080424220704.0483DBC12@poczta.oswiecenia.net> 2008-04-28 13:10 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker 0 siblings, 2 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-22 20:30 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo; +Cc: dccp, netdev Dnia Tuesday 22 of April 2008, Gerrit Renker napisał: > | > | While we are developing in the test tree compatibility is not > | > | important at all. But real world needs compatibility, especially if > | > | it's not that expensive. Otherwise people will get weird behaviour > | > | without any messages indicating what is wrong. And it's certainly not > | > | how code should be written. It can't be that binary package of VLC > | > | (or any other streaming server) cannot use newer kernel version > | > | because we added a new field. > | > > | > True, but we need to get something working first. There is no point > | > exporting an API which only works half. > | > | You can never assume that the API is feature complete. Even if it "works > | half" at the beginning (which is I guess quite normal) it should be easy > | to make it "work full" without breaking compatibility when it's not > | necessary. > > No I don't agree. Things that work half are worse than constant failures, > since it is never clear when they will fail next. But maybe you didn't mean > that. > I mean that what we are currently discussing (only priorities) is in itself useful ("works half"). Adding more features (expiry times) would be a nice but non essential feature ("work full"). (a bit of reordering) > So we can reduce this to the other question "is 32 bit enough for priority > information". > That is an important question. But I'm afraid the answer is no (even though I thought otherwise when writing previous mail). When we want to add packet expiry times we will need a field for timestamp. Which is quite big (64 bits?). But note that it is only needed for in-kernel implemetation. Userspace can write this information on as low as 16 bits. This leads me to conclusions stated below. > | > Semantically there is no restriction of what the u32 number represents. > | > And it is a large space. I don't think that a larger size is required. > | > | Ok, for now 32 bits are enough and I can't think of the need to use more > | bits. I wouldn't be so sure about future though. But ok, if it turns up > | that we need more space in the future then we will see, there is no need > | to worry about it now. We can indeed assume 32 bits for now. > > So it only remains to see whether to use a field in dccp_skb_cb, or to > use skb->priority. I think the latter has advantages, since this field > is only set/used for the SO_PRIORITY field (in socket(7)). > I think we should distingish two things: 1. How do we pass data along with each packet from userspace to kernelspace. 2. How do we store data in in-kernel skb for internal use. I'm afraid there was no such separation in my patches and mixing those two things causes a lots misunderstandings. Ad 1. In this case we need not care about sizes etc. Clean and extensible interface is very important. We can also have very different priority data representantion from what is later stored in skb. Ad 2. In this case sizes do matter. We can use whichever fields we want. In particular we need not keep all the data passed from userspace application in one chunk but we can distribute it among arbitrary skb fields. So this could look like that: Userspace application fills in a structure: struct dccp_packet_info { s8 priority; //as it is now u16 expire_after; //up to 65s with 1ms accuracy } and passes it to sendmsg(..., struct dccp_packet_info pktinfo), which passes it to qpolicy_push which passes it to qpolicy_prio_push which does the tricky stuff with CMSG and does: skb->priority=pktinfo->priority; skb->tstamp=now()+pktinfo->expire_after; (possibly other things that could potentially store data in dccp_skb_cb) This would allow us to: 1. Use existing skb fields in a clean way. I mean according to their names (priority, tstamp). 2. Have a clean userspace interface not affected by internal kernel implementation details. What do you think of such decoupling? -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
[parent not found: <20080424220704.0483DBC12@poczta.oswiecenia.net>]
* Re: [PATCH 1/1] [DCCP][QPOLICY]: External interface changes [not found] ` <20080424220704.0483DBC12@poczta.oswiecenia.net> @ 2008-04-24 22:16 ` Tomasz Grobelny 2008-04-28 15:08 ` Gerrit Renker 1 sibling, 0 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-24 22:16 UTC (permalink / raw) To: Gerrit Renker; +Cc: acme, dccp, netdev Dnia Thursday 24 of April 2008, napisałeś: > + skb->priority = info.priority; Here, as David pointed out, we are not sure wheter skb->priority can be used. But that's not the point of this patch. The point is to request for comment when it comes to separation of externally visible interface and internal per skb data storage. > + printk("pushing prio: %d\n", skb->priority); And that is of course a simple mistake. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/1] [DCCP][QPOLICY]: External interface changes [not found] ` <20080424220704.0483DBC12@poczta.oswiecenia.net> 2008-04-24 22:16 ` [PATCH 1/1] [DCCP][QPOLICY]: External interface changes Tomasz Grobelny @ 2008-04-28 15:08 ` Gerrit Renker 2008-04-28 21:29 ` Tomasz Grobelny 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-28 15:08 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: acme, dccp, netdev | Separate internal (in-kernel) storage of per packet data from external | interface (which is now an easily extensible structure). | I can see your point, there is an alternative to realise what you are aiming to do, suggestions/comments are below. | --- a/include/linux/dccp.h | +++ b/include/linux/dccp.h | @@ -98,6 +98,15 @@ struct dccp_hdr_reset { | dccph_reset_data[3]; | }; | | +/** | + * struct dccp_qpolicy_prio_info - information used by prio queuing policy | + * | + * @priority - packet priority (bigger value sent first) | + */ | +struct dccp_qpolicy_prio_info { | + __s8 priority; | +}; | + | enum dccp_pkt_type { | DCCP_PKT_REQUEST = 0, | DCCP_PKT_RESPONSE, | --- a/net/dccp/proto.c | +++ b/net/dccp/proto.c | @@ -706,7 +706,7 @@ int compat_dccp_getsockopt(struct sock *sk, int level, int optname, | EXPORT_SYMBOL_GPL(compat_dccp_getsockopt); | #endif | | -static int dccp_msghdr_parse(struct msghdr *msg, __u32 **priority) | +static int dccp_msghdr_parse(struct msghdr *msg, struct cmsghdr **cmsg_qpolicy) | { | struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg); | | @@ -721,9 +721,7 @@ static int dccp_msghdr_parse(struct msghdr *msg, __u32 **priority) | | switch (cmsg->cmsg_type) { | case DCCP_SCM_PRIORITY: | - if (cmsg->cmsg_len != CMSG_LEN(sizeof(__u32))) | - return -EINVAL; | - *priority = (__u32 *)CMSG_DATA(cmsg); | + *cmsg_qpolicy = cmsg; | break; | default: | return -EINVAL; Here the cmsg is exported to any current qpolicy. The disadvantages are that * there is no longer any length check, the functions using this interface have in the worst case no idea how much the user wanted to pass, so this is dangerous (compare also RFC 3542, 20.2); * the separation-of-concerns becomes unclear: now part of the parsing is done in dccp_msg_hdr_parse() and another part in some of the qpolicy_xxx_push() routines, so one routine does part of the work of another; * each qpolicy_xxx_push() routine needs to have a cmsghdr* argument even if it is not used. It is however possible to reach the same goal with just a small modification: * so far only one SCM message (DCCP_SCM_PRIORITY) has been defined; * maintainers generally accepted the use of skb->priority while on layer 4; * hence for all sub-variants of a "prio" policy (strict/partial orderings), one can use DCCP_SCM_PRIORITY, in combination with skb->priority; * for different policies, define a different DCCP_SCM_xxx message and store the cmsg data in a different field (to be decided); * this has the advantage that type/length checking is all in one place, so that the qpolicy routines need to do qpolicy only. | --- a/net/dccp/qpolicy.c | +++ b/net/dccp/qpolicy.c | @@ -57,8 +58,13 @@ static struct sk_buff *qpolicy_prio_worst_skb(struct sock *sk) | return worst; | } | | -static void qpolicy_prio_push(struct sock *sk, struct sk_buff *skb) | +static void qpolicy_prio_push(struct sock *sk, struct sk_buff *skb, | + struct cmsghdr *cmsg) | { | + struct dccp_qpolicy_prio_info info; | + memcpy(&info, CMSG_DATA(cmsg), min(cmsg->cmsg_len, sizeof(info))); That is the problem I see - `info' could end up with garbled data, if the user had accidentally passed `u32' instead of CMSG_LEN(sizeof priority). The CMSG_LEN() ensures an additional check - that the user is in fact using proper encapsulation via CMSG_xxx macros and not via msg_control/msg_controllen directly. The latter leads to hard-to-debug problems: in my case, I found this works - by accident - on 32-bit Intel, but will fail on 64-bit computers in compatibility mode. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [PATCH 1/1] [DCCP][QPOLICY]: External interface changes 2008-04-28 15:08 ` Gerrit Renker @ 2008-04-28 21:29 ` Tomasz Grobelny 0 siblings, 0 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-28 21:29 UTC (permalink / raw) To: Gerrit Renker; +Cc: acme, dccp, netdev Dnia Monday 28 of April 2008, Gerrit Renker napisał: > | @@ -721,9 +721,7 @@ static int dccp_msghdr_parse(struct msghdr *msg, > | __u32 **priority) > | > | switch (cmsg->cmsg_type) { > | case DCCP_SCM_PRIORITY: > | - if (cmsg->cmsg_len != CMSG_LEN(sizeof(__u32))) > | - return -EINVAL; > | - *priority = (__u32 *)CMSG_DATA(cmsg); > | + *cmsg_qpolicy = cmsg; > | break; > | default: > | return -EINVAL; > > Here the cmsg is exported to any current qpolicy. The disadvantages > are that * there is no longer any length check, the functions using this > interface have in the worst case no idea how much the user wanted to pass, > so this is dangerous (compare also RFC 3542, 20.2); The information about size can be retrieved in specific qpolicy. It is not CMSG_DATA(cmsg) that is passed to qpolicy but cmsg itself along with cmsg_len field. > * the separation-of-concerns becomes unclear: now part of the parsing is > done in dccp_msg_hdr_parse() and another part in some of the > qpolicy_xxx_push() routines, so one routine does part of the work of > another; That is a problem. So I think all the code should be moved to qpolicy_prio_push(). > * each qpolicy_xxx_push() routine needs to have a cmsghdr* argument even > if it is not used. > Before the change I could say: why call dccp_msghdr_parse() if choosen policy cannot make use of it's result. No matter what we will always have either "unnecessary code", "unnecessary parameter" or something else that is "unnecessary" simply because simple policy is, well... simple. But it has to be so to keep interfaces clean. > It is however possible to reach the same goal with just a small > modification: * so far only one SCM message (DCCP_SCM_PRIORITY) has been > defined; * maintainers generally accepted the use of skb->priority while on > layer 4; * hence for all sub-variants of a "prio" policy (strict/partial > orderings), one can use DCCP_SCM_PRIORITY, in combination with > skb->priority; * for different policies, define a different DCCP_SCM_xxx > message and store the cmsg data in a different field (to be decided); > * this has the advantage that type/length checking is all in one place, > so that the qpolicy routines need to do qpolicy only. > I would use different DCCP_SCM_xxx for different parameters. So that one qpolicy could use more than one DCCP_SCM_xxx. And parsing of specific DCCP_SCM_xxx should IMHO happen in qpolicy_*_push(). > | --- a/net/dccp/qpolicy.c > | +++ b/net/dccp/qpolicy.c > | @@ -57,8 +58,13 @@ static struct sk_buff *qpolicy_prio_worst_skb(struct > | sock *sk) return worst; > | } > | > | -static void qpolicy_prio_push(struct sock *sk, struct sk_buff *skb) > | +static void qpolicy_prio_push(struct sock *sk, struct sk_buff *skb, > | + struct cmsghdr *cmsg) > | { > | + struct dccp_qpolicy_prio_info info; > | + memcpy(&info, CMSG_DATA(cmsg), min(cmsg->cmsg_len, sizeof(info))); > > That is the problem I see - `info' could end up with garbled data, if the > user had accidentally passed `u32' instead of CMSG_LEN(sizeof priority). > Right, but this could easily be fixed by initializing info prior to memcpy. Then only data provided by user would override defaults. But see below. > The CMSG_LEN() ensures an additional check - that the user is in fact using > proper encapsulation via CMSG_xxx macros and not via > msg_control/msg_controllen directly. The latter leads to hard-to-debug > problems: in my case, I found this works - by accident - on 32-bit Intel, > but will fail on 64-bit computers in compatibility mode. > Then min(cmsg->cmsg_len, sizeof(info)) could be changed to min(CMSG_LEN(cmsg), sizeof(info)), I see no problem with that. That said I think that multiple DCCP_SCM_xxx per policy might be a better approach. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-22 20:30 ` Tomasz Grobelny [not found] ` <20080424220704.0483DBC12@poczta.oswiecenia.net> @ 2008-04-28 13:10 ` Gerrit Renker 2008-04-28 15:19 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker 2008-04-28 21:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny 1 sibling, 2 replies; 47+ messages in thread From: Gerrit Renker @ 2008-04-28 13:10 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | > | > True, but we need to get something working first. There is no point | > | > exporting an API which only works half. | > | | > | You can never assume that the API is feature complete. Even if it "works | > | half" at the beginning (which is I guess quite normal) it should be easy | > | to make it "work full" without breaking compatibility when it's not | > | necessary. | > | > No I don't agree. Things that work half are worse than constant failures, | > since it is never clear when they will fail next. But maybe you didn't mean | > that. | > | I mean that what we are currently discussing (only priorities) is in itself | useful ("works half"). Adding more features (expiry times) would be a nice | but non essential feature ("work full"). | It seems I misunderstood you, and agree that with regard to API it is better to have a little too much discussion than too little, there are many aspects to consider. | (a bit of reordering) | > So we can reduce this to the other question "is 32 bit enough for priority | > information". | > | That is an important question. But I'm afraid the answer is no (even though I | thought otherwise when writing previous mail). When we want to add packet | expiry times we will need a field for timestamp. Which is quite big (64 | bits?). But note that it is only needed for in-kernel implemetation. | Userspace can write this information on as low as 16 bits. This leads me to | conclusions stated below. | I had the same thoughts - it is easy to end up with a wide range of possible structs, numbers, etc. that are not related and require 20 different APIs. But the basic line of thought is * the queue size is always limited since afaik the amount of skbs that can be queued depends on the socket's write memory (wmem); * hence a strict priority ordering can be done even with u8 (up to 255) in not-too-extreme case * for timeouts, it is not necessary to use struct time{val,spec} (as also above); if really required, a conversion can be done by a library (but not a kernel) routine. | > So it only remains to see whether to use a field in dccp_skb_cb, or to | > use skb->priority. I think the latter has advantages, since this field | > is only set/used for the SO_PRIORITY field (in socket(7)). | > | I think we should distingish two things: | 1. How do we pass data along with each packet from userspace to kernelspace. | 2. How do we store data in in-kernel skb for internal use. <snip> | | Ad 1. In this case we need not care about sizes etc. Clean and extensible | interface is very important. We can also have very different priority data | representantion from what is later stored in skb. | Ad 2. In this case sizes do matter. We can use whichever fields we want. In | particular we need not keep all the data passed from userspace application in | one chunk but we can distribute it among arbitrary skb fields. | | So this could look like that: | Userspace application fills in a structure: | struct dccp_packet_info { | s8 priority; //as it is now | u16 expire_after; //up to 65s with 1ms accuracy | } | and passes it to sendmsg(..., struct dccp_packet_info pktinfo), which passes | it to qpolicy_push which passes it to qpolicy_prio_push which does the tricky | stuff with CMSG and does: | skb->priority=pktinfo->priority; | skb->tstamp=now()+pktinfo->expire_after; | (possibly other things that could potentially store data in dccp_skb_cb) I notice you have discovered another skb field that could be reused :) The idea looks good, if you decide to go ahead with it, please just make sure to document it (and that is a reminder for myself too). | | This would allow us to: | 1. Use existing skb fields in a clean way. I mean according to their names | (priority, tstamp). | 2. Have a clean userspace interface not affected by internal kernel | implementation details. | | What do you think of such decoupling? | -- Yes, great. The concept is good, it is some of the details that need some more work. In (2) you mention distributing the information among several skb fields. With regard to above comment, I think we need to do this with some care, to avoid interactions with other subsystems. At the moment I have no concrete idea of how to implement a timeout-based policy ... probably this is something along the line of Ian's patches and what you sketched above. Also in (2) is the size question. The bottom line is if 2^32-1 different numbers are sufficient to represent a range of policies, then we can work with skb->priority for the moment. My understanding of the above is * the per-policy data is an opaque bitstring whose only requirement is that it fits within 32 bits; * how the bitstring is interpreted depends on the chosen policy; * not necessary to always use the full 32 bits (also in your example above). I have a tidied-up version of the changes so far which will be posted later, also with comments on your patch. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* [DCCP] [RFC] [Patchv3 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-28 13:10 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker @ 2008-04-28 15:19 ` Gerrit Renker 2008-04-28 20:12 ` Tomasz Grobelny 2008-04-28 21:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny 1 sibling, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-28 15:19 UTC (permalink / raw) To: Tomasz Grobelny, Arnaldo Carvalho de Melo, dccp, netdev Here is a tidied-up version, the patch sent last week and stored in git://eden-feed.erg.abdn.ac.uk/dccp_exp (tree `qpolicy') was not very good, the updates are: * the skb->priority is now cleared when leaving the DCCP layer; * there is now documentation/hints about the use of skb->priority in DCCP; * there is one question I have, and that can be reverted: - qpolicy_prio_push() has been replaced by a combination of - qpolicy_prio_full() [which now does the drop-on-queue-full] and qpolicy_simple_push() [which does the FIFO-queueing] This means fewer routines, but if you don't like it, I am ok to revert it. Please do take time with the comments -- another busy week means I won't have time to respond properly before the end of the week. Gerrit --- Documentation/networking/dccp.txt | 19 +++++ include/linux/dccp.h | 19 +++++ net/dccp/Makefile | 2 net/dccp/dccp.h | 12 +++ net/dccp/output.c | 10 +-- net/dccp/proto.c | 67 +++++++++++++++++++- net/dccp/qpolicy.c | 124 ++++++++++++++++++++++++++++++++++++++ 7 files changed, 245 insertions(+), 8 deletions(-) --- a/Documentation/networking/dccp.txt +++ b/Documentation/networking/dccp.txt @@ -45,6 +45,25 @@ http://linux-net.osdl.org/index.php/DCCP Socket options ============== +DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes +a policy ID as argument and can only be set before the connection (i.e. changes +during an established connection are not supported). Currently, two policies are +defined: the "simple" policy (DCCPQ_POLICY_SIMPLE), which does nothing special, +and a priority-based variant (DCCPQ_POLICY_PRIO). The latter allows to pass an +u32 priority value as ancillary data to sendmsg(), where higher numbers indicate +a higher packet priority (similar to SO_PRIORITY). This ancillary data needs to +be formatted using a cmsg(3) message header filled in as follows: + cmsg->cmsg_level = SOL_DCCP; + cmsg->cmsg_type = DCCP_SCM_PRIORITY; + cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t)); /* or CMSG_LEN(4) */ + +DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero +value is always interpreted as unbounded queue length. If different from zero, +the interpretation of this parameter depends on the current dequeuing policy +(see above): the "simple" policy will enforce a fixed queue size by returning +EAGAIN, whereas the "prio" policy enforces a fixed queue length by dropping the +lowest-priority packet first. The default value for this parameter is +initialised from /proc/sys/net/dccp/default/tx_qlen. DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of service codes (RFC 4340, sec. 8.1.2); if this socket option is not set, --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -195,6 +195,19 @@ enum dccp_feature_numbers { DCCPF_MAX_CCID_SPECIFIC = 255, }; +/* DCCP socket control message types for cmsg */ +enum dccp_cmsg_type { + DCCP_SCM_PRIORITY = 1, + DCCP_SCM_MAX +}; + +/* DCCP priorities for outgoing/queued packets */ +enum dccp_packet_dequeueing_policy { + DCCPQ_POLICY_SIMPLE, + DCCPQ_POLICY_PRIO, + DCCPQ_POLICY_MAX +}; + /* DCCP socket options */ #define DCCP_SOCKOPT_PACKET_SIZE 1 /* XXX deprecated, without effect */ #define DCCP_SOCKOPT_SERVICE 2 @@ -208,6 +221,8 @@ enum dccp_feature_numbers { #define DCCP_SOCKOPT_CCID 13 #define DCCP_SOCKOPT_TX_CCID 14 #define DCCP_SOCKOPT_RX_CCID 15 +#define DCCP_SOCKOPT_QPOLICY_ID 16 +#define DCCP_SOCKOPT_QPOLICY_TXQLEN 17 #define DCCP_SOCKOPT_CCID_RX_INFO 128 #define DCCP_SOCKOPT_CCID_TX_INFO 192 @@ -459,6 +474,8 @@ struct dccp_ackvec; * @dccps_hc_rx_ccid - CCID used for the receiver (or receiving half-connection) * @dccps_hc_tx_ccid - CCID used for the sender (or sending half-connection) * @dccps_options_received - parsed set of retrieved options + * @dccps_qpolicy - TX dequeueing policy, one of %dccp_packet_dequeueing_policy + * @dccps_tx_qlen - maximum length of the TX queue * @dccps_role - role of this sock, one of %dccp_role * @dccps_hc_rx_insert_options - receiver wants to add options when acking * @dccps_hc_tx_insert_options - sender wants to add options when sending @@ -501,6 +518,8 @@ struct dccp_sock { struct ccid *dccps_hc_rx_ccid; struct ccid *dccps_hc_tx_ccid; struct dccp_options_received dccps_options_received; + __u8 dccps_qpolicy; + __u32 dccps_tx_qlen; enum dccp_role dccps_role:2; __u8 dccps_hc_rx_insert_options:1; __u8 dccps_hc_tx_insert_options:1; --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -187,6 +187,7 @@ int dccp_init_sock(struct sock *sk, cons dp->dccps_rate_last = jiffies; dp->dccps_role = DCCP_ROLE_UNDEFINED; dp->dccps_service = DCCP_SERVICE_CODE_IS_ABSENT; + dp->dccps_tx_qlen = sysctl_dccp_tx_qlen; INIT_LIST_HEAD(&dp->dccps_featneg); /* control socket doesn't need feat nego */ @@ -540,6 +541,20 @@ static int do_dccp_setsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: err = dccp_setsockopt_cscov(sk, val, true); break; + case DCCP_SOCKOPT_QPOLICY_ID: + if (sk->sk_state != DCCP_CLOSED) + err = -EISCONN; + else if (val < 0 || val >= DCCPQ_POLICY_MAX) + err = -EINVAL; + else + dp->dccps_qpolicy = val; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + if (val < 0) + err = -EINVAL; + else + dp->dccps_tx_qlen = val; + break; default: err = -ENOPROTOOPT; break; @@ -643,6 +658,12 @@ static int do_dccp_getsockopt(struct soc case DCCP_SOCKOPT_RECV_CSCOV: val = dp->dccps_pcrlen; break; + case DCCP_SOCKOPT_QPOLICY_ID: + val = dp->dccps_qpolicy; + break; + case DCCP_SOCKOPT_QPOLICY_TXQLEN: + val = dp->dccps_tx_qlen; + break; case 128 ... 191: return ccid_hc_rx_getsockopt(dp->dccps_hc_rx_ccid, sk, optname, len, (u32 __user *)optval, optlen); @@ -685,6 +706,32 @@ int compat_dccp_getsockopt(struct sock * EXPORT_SYMBOL_GPL(compat_dccp_getsockopt); #endif +static int dccp_msghdr_parse(struct msghdr *msg, __u32 **priority) +{ + struct cmsghdr *cmsg = CMSG_FIRSTHDR(msg); + + for (; cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) { + + if (!CMSG_OK(msg, cmsg)) + return -EINVAL; + + /* Only look at DCCP-related socket control messages */ + if (cmsg->cmsg_level != SOL_DCCP) + continue; + + switch (cmsg->cmsg_type) { + case DCCP_SCM_PRIORITY: + if (cmsg->cmsg_len != CMSG_LEN(sizeof(__u32))) + return -EINVAL; + *priority = (__u32 *)CMSG_DATA(cmsg); + break; + default: + return -EINVAL; + } + } + return 0; +} + int dccp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { @@ -693,15 +740,18 @@ int dccp_sendmsg(struct kiocb *iocb, str const int noblock = flags & MSG_DONTWAIT; struct sk_buff *skb; int rc, size; + __u32 *pprio = NULL; long timeo; if (len > dp->dccps_mss_cache) return -EMSGSIZE; + if (dccp_msghdr_parse(msg, &pprio)) + return -EINVAL; + lock_sock(sk); - if (sysctl_dccp_tx_qlen && - (sk->sk_write_queue.qlen >= sysctl_dccp_tx_qlen)) { + if (dccp_qpolicy_full(sk)) { rc = -EAGAIN; goto out_release; } @@ -729,7 +779,18 @@ int dccp_sendmsg(struct kiocb *iocb, str if (rc != 0) goto out_discard; - skb_queue_tail(&sk->sk_write_queue, skb); + /* + * Assign the (opaque) qpolicy priority value to skb->priority. + * + * We are overloading this skb field for use with the qpolicy subystem. + * The skb->priority is normally used for the SO_PRIORITY option, which + * is initialised from sk_priority. Since the assignment of sk_priority + * to skb->priority happens later (on layer 3), we overload this field + * for use with queueing priorities as long as the skb is on layer 4. + */ + skb->priority = pprio == NULL ? 0 : *pprio; + + dccp_qpolicy_push(sk, skb); dccp_write_xmit(sk); out_release: release_sock(sk); --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -238,7 +238,7 @@ static void dccp_xmit_packet(struct sock { int err, len; struct dccp_sock *dp = dccp_sk(sk); - struct sk_buff *skb = skb_dequeue(&sk->sk_write_queue); + struct sk_buff *skb = dccp_qpolicy_pop(sk); if (unlikely(skb == NULL)) return; @@ -257,6 +257,9 @@ static void dccp_xmit_packet(struct sock DCCP_SKB_CB(skb)->dccpd_type = DCCP_PKT_DATA; } + /* Clear priority value used by the qpolicy subsystem */ + skb->priority = 0; + err = dccp_transmit_skb(sk, skb); if (err) dccp_pr_debug("transmit_skb() returned err=%d\n", err); @@ -328,7 +331,7 @@ void dccp_write_xmit(struct sock *sk) struct dccp_sock *dp = dccp_sk(sk); struct sk_buff *skb; - while ((skb = skb_peek(&sk->sk_write_queue))) { + while ((skb = dccp_qpolicy_top(sk))) { int rc = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb); switch (ccid_packet_dequeue_eval(rc)) { @@ -342,8 +345,7 @@ void dccp_write_xmit(struct sock *sk) dccp_xmit_packet(sk); break; case CCID_PACKET_ERR: - skb_dequeue(&sk->sk_write_queue); - kfree_skb(skb); + dccp_qpolicy_drop(sk, skb); dccp_pr_debug("packet discarded due to err=%d\n", rc); } } --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -215,6 +215,18 @@ extern void dccp_reqsk_send_ack(struct s extern void dccp_send_sync(struct sock *sk, const u64 seq, const enum dccp_pkt_type pkt_type); +/* + * TX Packet Dequeueing Interface + */ +extern void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb); +extern bool dccp_qpolicy_full(struct sock *sk); +extern void dccp_qpolicy_drop(struct sock *sk, struct sk_buff *skb); +extern struct sk_buff *dccp_qpolicy_top(struct sock *sk); +extern struct sk_buff *dccp_qpolicy_pop(struct sock *sk); + +/* + * TX Packet Output and TX Timers + */ extern void dccp_write_xmit(struct sock *sk); extern void dccp_write_space(struct sock *sk); extern void dccp_flush_write_queue(struct sock *sk, long *time_budget); --- /dev/null +++ b/net/dccp/qpolicy.c @@ -0,0 +1,124 @@ +/* + * net/dccp/qpolicy.c + * + * An implementation of the DCCP protocol + * + * Copyright (c) 2008 Tomasz Grobelny <tomasz@grobelny.oswiecenia.net> + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License v2 + * as published by the Free Software Foundation. + */ +#include <asm/unaligned.h> +#include "dccp.h" + +/* + * Simple Dequeueing Policy: + * If tx_qlen is different from 0, enqueue up to tx_qlen elements. + */ +static void qpolicy_simple_push(struct sock *sk, struct sk_buff *skb) +{ + skb_queue_tail(&sk->sk_write_queue, skb); +} + +static bool qpolicy_simple_full(struct sock *sk) +{ + return dccp_sk(sk)->dccps_tx_qlen && + sk->sk_write_queue.qlen >= dccp_sk(sk)->dccps_tx_qlen; +} + +static struct sk_buff *qpolicy_simple_top(struct sock *sk) +{ + return skb_peek(&sk->sk_write_queue); +} + +/* + * Priority-based Dequeueing Policy: + * If tx_qlen is different from 0 and the queue has reached its upper bound + * of tx_qlen elements, replace older packets lowest-priority-first. + */ +static struct sk_buff *qpolicy_prio_best_skb(struct sock *sk) +{ + struct sk_buff *skb, *best = NULL; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (best == NULL || skb->priority > best->priority) + best = skb; + return best; +} + +static struct sk_buff *qpolicy_prio_worst_skb(struct sock *sk) +{ + struct sk_buff *skb, *worst = NULL; + + skb_queue_walk(&sk->sk_write_queue, skb) + if (worst == NULL || skb->priority < worst->priority) + worst = skb; + return worst; +} + +static bool qpolicy_prio_full(struct sock *sk) +{ + if (qpolicy_simple_full(sk)) + dccp_qpolicy_drop(sk, qpolicy_prio_worst_skb(sk)); + return false; +} + +/** + * struct dccp_qpolicy_operations - TX Packet Dequeueing Interface + * @push: add a new @skb with possibly a struct dccp_packet_info + * @full: indicates that no more packets will be admitted + * @top: peeks at whatever the queueing policy defines as its top + */ +static struct dccp_qpolicy_operations { + void (*push) (struct sock *sk, struct sk_buff *skb); + bool (*full) (struct sock *sk); + struct sk_buff* (*top) (struct sock *sk); + +} qpol_table[DCCPQ_POLICY_MAX] = { + [DCCPQ_POLICY_SIMPLE] = { + .push = qpolicy_simple_push, + .full = qpolicy_simple_full, + .top = qpolicy_simple_top, + }, + [DCCPQ_POLICY_PRIO] = { + .push = qpolicy_simple_push, + .full = qpolicy_prio_full, + .top = qpolicy_prio_best_skb, + }, +}; + +/* + * Externally visible interface + */ +void dccp_qpolicy_push(struct sock *sk, struct sk_buff *skb) +{ + qpol_table[dccp_sk(sk)->dccps_qpolicy].push(sk, skb); +} + +bool dccp_qpolicy_full(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].full(sk); +} + +void dccp_qpolicy_drop(struct sock *sk, struct sk_buff *skb) +{ + if (skb != NULL) { + skb_unlink(skb, &sk->sk_write_queue); + kfree_skb(skb); + } +} + +struct sk_buff *dccp_qpolicy_top(struct sock *sk) +{ + return qpol_table[dccp_sk(sk)->dccps_qpolicy].top(sk); +} + +struct sk_buff *dccp_qpolicy_pop(struct sock *sk) +{ + struct sk_buff *skb = dccp_qpolicy_top(sk); + + if (skb) + skb_unlink(skb, &sk->sk_write_queue); + return skb; +} --- a/net/dccp/Makefile +++ b/net/dccp/Makefile @@ -1,7 +1,7 @@ obj-$(CONFIG_IP_DCCP) += dccp.o dccp_ipv4.o dccp-y := ccid.o feat.o input.o minisocks.o options.o \ - output.o proto.o timer.o ackvec.o + qpolicy.o output.o proto.o timer.o ackvec.o dccp_ipv4-y := ipv4.o The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv3 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-28 15:19 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker @ 2008-04-28 20:12 ` Tomasz Grobelny 0 siblings, 0 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-28 20:12 UTC (permalink / raw) To: Gerrit Renker; +Cc: Arnaldo Carvalho de Melo, dccp, netdev Dnia Monday 28 of April 2008, Gerrit Renker napisał: > Here is a tidied-up version, the patch sent last week and stored in > > git://eden-feed.erg.abdn.ac.uk/dccp_exp (tree `qpolicy') > > was not very good, the updates are: > > * the skb->priority is now cleared when leaving the DCCP layer; Ok. I only wonder if this kind of cleanup shouldn't happen in qpolicy specific function. Such as already non-existant qpolicy_prio_pop(). Note that qpolicy_simple_pop() would not need to clear skb->priority. This kind of function could act as skb deinitializer. And I think qpolicy_*_pop() should be added again... The same question applies to skb initialization. In my opinion it should happen in qpolicy_*_push(). So that dccp_msghdr_parse code should be moved there. This approach would clearly abstract qpolicy subsystem from the rest of dccp code. We would have packet initialization in qpolicy_*_push() and deinitialization in qpolicy_*_pop(). I'll try to write a patch that demonstrates it. > * there is now documentation/hints about the use of skb->priority in DCCP; There is something I don't understand in your approach regarding skb->priority. I think I could understand it better it you could sketch how in your opinion should an interface look like if packet timeout was to be added. Would you: a) pack this data into this u32 identified by DCCP_SCM_PRIORITY. This is what I thought you meant (but I may be wrong). And something I don't really like. b) add another parameter type eg. DCCP_SCM_TIMEOUT_MS and use another u32 to pass this parameter. This is something I like much better, c) something else. Then what is that? > * there is one question I have, and that can be reverted: > - qpolicy_prio_push() has been replaced by a combination of > - qpolicy_prio_full() [which now does the drop-on-queue-full] and > qpolicy_simple_push() [which does the FIFO-queueing] > This means fewer routines, but if you don't like it, I am ok to revert > it. > I don't care that much about internal implementation. But I don't really like mixing policies. Especially since when prio policy will have more features these functions will have to be separated again. When making any choices I always try to think how it will look like when more features are added (especially packet timeouts and getting statistics). -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-28 13:10 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker 2008-04-28 15:19 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker @ 2008-04-28 21:03 ` Tomasz Grobelny 2008-04-30 7:53 ` Gerrit Renker 1 sibling, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-28 21:03 UTC (permalink / raw) To: Gerrit Renker; +Cc: Arnaldo Carvalho de Melo, dccp, netdev Dnia Monday 28 of April 2008, Gerrit Renker napisał: > | (a bit of reordering) > | > | > So we can reduce this to the other question "is 32 bit enough for > | > priority information". > | > | That is an important question. But I'm afraid the answer is no (even > | though I thought otherwise when writing previous mail). When we want to > | add packet expiry times we will need a field for timestamp. Which is > | quite big (64 bits?). But note that it is only needed for in-kernel > | implemetation. Userspace can write this information on as low as 16 bits. > | This leads me to conclusions stated below. > > I had the same thoughts - it is easy to end up with a wide range of > possible structs, numbers, etc. that are not related and require 20 > different APIs. For now I can think of three possible solutions: a) one structure per policy (of course only if additional data is needed), b) one structure that could contain parameters from all possible policies (that would mean that not all fields are used), c) allow qpolicy to use each cmsg header as different parameter. So that apart from DCCP_SCM_PRIORITY we could also have DCCP_SCM_TIMEOUT_MS. And this seems to be a very nice approach. It has several advantages: - extensible, - application does not need to pass parameter if it is ok with default value. It could for example pass only DCCP_SCM_TIMEOUT_MS without DCCP_SCM_PRIORITY, - it is easy to maintain compatibility. This series of parameters should be parsed by specific qpolicy that can make use of them (read: prio). Other policies (read: simple) would not even attempt to parse the list of parameters. > But the basic line of thought is > * the queue size is always limited since afaik the amount of skbs that > can be queued depends on the socket's write memory (wmem); Yes. > * hence a strict priority ordering can be done even with u8 (up to 255) > in not-too-extreme case But note that it is not only about ordering but potentially also about when (and if) packets should be dropped. And that may depend on many factors, not only ordering, but also current time, current speed, current rtt, etc. > | So this could look like that: > | Userspace application fills in a structure: > | struct dccp_packet_info { > | s8 priority; //as it is now > | u16 expire_after; //up to 65s with 1ms accuracy > | } > | and passes it to sendmsg(..., struct dccp_packet_info pktinfo), which > | passes it to qpolicy_push which passes it to qpolicy_prio_push which does > | the tricky stuff with CMSG and does: > | skb->priority=pktinfo->priority; > | skb->tstamp=now()+pktinfo->expire_after; > | (possibly other things that could potentially store data in dccp_skb_cb) > > I notice you have discovered another skb field that could be reused :) Yes :-) > | This would allow us to: > | 1. Use existing skb fields in a clean way. I mean according to their > | names (priority, tstamp). > | 2. Have a clean userspace interface not affected by internal kernel > | implementation details. > | > | What do you think of such decoupling? > | -- > > Yes, great. The concept is good, it is some of the details that need > some more work. In (2) you mention distributing the information among > several skb fields. With regard to above comment, I think we need to do > this with some care, to avoid interactions with other subsystems. > Sure. > At the moment I have no concrete idea of how to implement a timeout-based > policy ... probably this is something along the line of Ian's patches > and what you sketched above. > I think that implementing timeouts should not be in separate policy but in prio policy. Simply one more parameter would have to be passed to this policy and timed out packets would be dropped in qpolicy_prio_full() or qpolicy_prio_push() methods. The master question for now is how to pass this additional parameter. I would be in favour of second struct cmsg* with type set to DCCP_SCM_TIMEOUT_MS. > Also in (2) is the size question. The bottom line is if 2^32-1 different > numbers are sufficient to represent a range of policies, then we can > work with skb->priority for the moment. > > My understanding of the above is > * the per-policy data is an opaque bitstring whose only requirement > is that it fits within 32 bits; > * how the bitstring is interpreted depends on the chosen policy; You mean data passed between userspace and kernelspace or in-kernel storage? 1. For user to kernel data for me seems to be not expressive enough. Who knows what these 32-bits are supposed to mean? Ok, we may document that but to me it is almost like passing void*: you never know what's inside. 2. For in-kernel storage 32-bits is simply not enough for priority+timeout. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-28 21:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny @ 2008-04-30 7:53 ` Gerrit Renker 2008-05-02 20:39 ` Tomasz Grobelny 0 siblings, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-30 7:53 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: dccp, netdev | > | > So we can reduce this to the other question "is 32 bit enough for | > | > priority information". | > | | > | That is an important question. But I'm afraid the answer is no (even | > | though I thought otherwise when writing previous mail). When we want to | > | add packet expiry times we will need a field for timestamp. Which is | > | quite big (64 bits?). But note that it is only needed for in-kernel | > | implemetation. Userspace can write this information on as low as 16 bits. | > | This leads me to conclusions stated below. | > | > I had the same thoughts - it is easy to end up with a wide range of | > possible structs, numbers, etc. that are not related and require 20 | > different APIs. | For now I can think of three possible solutions: | a) one structure per policy (of course only if additional data is needed), | b) one structure that could contain parameters from all possible policies | (that would mean that not all fields are used), | c) allow qpolicy to use each cmsg header as different parameter. So that apart | from DCCP_SCM_PRIORITY we could also have DCCP_SCM_TIMEOUT_MS. And this seems | to be a very nice approach. It has several advantages: | - extensible, | - application does not need to pass parameter if it is ok with default value. | It could for example pass only DCCP_SCM_TIMEOUT_MS without DCCP_SCM_PRIORITY, | - it is easy to maintain compatibility. | This series of parameters should be parsed by specific qpolicy that can make | use of them (read: prio). Other policies (read: simple) would not even | attempt to parse the list of parameters. | I think (c) is best, here is what I'd support. 1. Using different DCCP_SCM_xxx is definitively okay: the cmsg_type is `int', so there is more than enough room. 2. I want to make the relationship between DCCP_SCM_xxx and the type passed as parameter explicit, i.e. - each DCCP_SCM_xxx has exactly one parameter type associated with it (e.g. DCCP_SCM_PRIORITY => uint32_t); - it is okay to associate the same parameter type with several different DCCP_SCM_xxx,but not ok to allow one DCCP_SCM_xxx to be associated with more than one parameter type or parameter length; - so that there is no confusion about the type of the parameters. 3. Parsing and interpreting parameters to be kept separate, i.e. the qpolicy routines read skb fields which have either been set from cmsg ancillary data to sendmsg() or are otherwise initialised to default values. | > Also in (2) is the size question. The bottom line is if 2^32-1 different | > numbers are sufficient to represent a range of policies, then we can | > work with skb->priority for the moment. | > | > My understanding of the above is | > * the per-policy data is an opaque bitstring whose only requirement | > is that it fits within 32 bits; | > * how the bitstring is interpreted depends on the chosen policy; | You mean data passed between userspace and kernelspace or in-kernel storage? | 1. For user to kernel data for me seems to be not expressive enough. Who knows | what these 32-bits are supposed to mean? Ok, we may document that but to me | it is almost like passing void*: you never know what's inside. To me this means that you agree with (2) above. With (2) in place, the question does not arise. | 2. For in-kernel storage 32-bits is simply not enough for priority+timeout. Ok, but this is a separate question and it is about a yet non-existing policy. So far there is a "priority" policy, for which I think we agree that 32 bits are enough. Then there is a yet non-existing "timeout" policy, which has no associated field yet. If we can assume that e.g. the skb->tstamp field can be used to store timeouts, then there are separate fields associated with separate policies. And both associated parameters can be parsed differently, in particular there is no requirement to restrict DCCP_SCM_TIMEOUT to use 32 bits - it could even pass a struct timeval or struct timespec. Thirdly, there is the aggregate policy "priority+timeout", which can then use both skb->priority and skb->tstamp. I.e. to answer the question, I think it is best to implement "timeout" first, solve the problems it brings up; when that is done, "priority+timeout" will be easy to do - it could be constructed just out of the existing functions defined for "priority" and "timeout". In that manner, other policies can be modularly constructed - for instance by combining "timeout" with a different form of the "priority" policy. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-30 7:53 ` Gerrit Renker @ 2008-05-02 20:39 ` Tomasz Grobelny 2008-05-02 20:56 ` Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-05-02 20:39 UTC (permalink / raw) To: Gerrit Renker, dccp, netdev Dnia Wednesday 30 of April 2008, Gerrit Renker napisał: > | c) allow qpolicy to use each cmsg header as different parameter. So that > (...) > I think (c) is best, here is what I'd support. > (...) Seems we are getting closer in our views, see the just sent patch. > Ok, but this is a separate question and it is about a yet non-existing > policy. > Hopefully someone, someday will implement it. So it is good to have a basic idea how it might work. > So far there is a "priority" policy, for which I think we agree that 32 > bits are enough. > For storing just priority - yes. I'd even say that it is too much... > And both associated parameters can be parsed differently, in particular > there is no requirement to restrict DCCP_SCM_TIMEOUT to use 32 bits - it > could even pass a struct timeval or struct timespec. > Yes, that's nice. > Then there is a yet non-existing "timeout" policy, which has no associated > field yet. If we can assume that e.g. the skb->tstamp field can be used > to store timeouts, then there are separate fields associated with separate > policies. > > Thirdly, there is the aggregate policy "priority+timeout", which can > then use both skb->priority and skb->tstamp. > > I.e. to answer the question, I think it is best to implement "timeout" > first, solve the problems it brings up; when that is done, > "priority+timeout" will be easy to do - it could be constructed just out of > the existing functions defined for "priority" and "timeout". > > In that manner, other policies can be modularly constructed - for instance > by combining "timeout" with a different form of the "priority" policy. > I'm not entirely sure if such modular constructions would be possible. I prefer to think of "timeout policy" and "prio policy" as a special cases of "timeout+prio policy" with respectively DCCP_SCM_PRIORITY and DCCP_SCM_TIMEOUT not supplied (and thus set to their default values: 0 and INFINITY). -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-05-02 20:39 ` Tomasz Grobelny @ 2008-05-02 20:56 ` Gerrit Renker 0 siblings, 0 replies; 47+ messages in thread From: Gerrit Renker @ 2008-05-02 20:56 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: dccp, netdev | > | c) allow qpolicy to use each cmsg header as different parameter. So that | > (...) | > I think (c) is best, here is what I'd support. | > (...) | Seems we are getting closer in our views, see the just sent patch. | That would be good - what I would like to do is to replace the 12 or so patches in git://eden-feed.erg.abdn.ac.uk/dccp_exp [qpolicy subtree] with a single patch and then put it into the test tree. Will answer regarding the other patch separately and then resubmit the combined patch to the list - if you are ok with it, you can add your signed-off or point out where you disagree. It remains your patch. What I'd still like to do is some testing on different architectures. | > I.e. to answer the question, I think it is best to implement "timeout" | > first, solve the problems it brings up; when that is done, | > "priority+timeout" will be easy to do - it could be constructed just out of | > the existing functions defined for "priority" and "timeout". | > | > In that manner, other policies can be modularly constructed - for instance | > by combining "timeout" with a different form of the "priority" policy. | > | I'm not entirely sure if such modular constructions would be possible. I | prefer to think of "timeout policy" and "prio policy" as a special cases | of "timeout+prio policy" with respectively DCCP_SCM_PRIORITY and | DCCP_SCM_TIMEOUT not supplied (and thus set to their default values: 0 and | INFINITY). | -- It depends on the way one looks at it. Your view is top-down, mine is bottom-up, both can work. Agree with the parameters. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 15:14 ` Gerrit Renker 2008-04-15 15:21 ` Gerrit Renker @ 2008-04-15 19:38 ` Tomasz Grobelny 2008-04-15 20:04 ` Arnaldo Carvalho de Melo ` (2 more replies) 1 sibling, 3 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-15 19:38 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Dnia Tuesday 15 of April 2008, Gerrit Renker napisał: > | > @@ -501,6 +519,8 @@ struct dccp_sock { > | > struct ccid *dccps_hc_rx_ccid; > | > struct ccid *dccps_hc_tx_ccid; > | > struct dccp_options_received dccps_options_received; > | > + __u8 dccps_qpolicy; > | > + __u32 dccps_tx_qlen; > | > enum dccp_role dccps_role:2; > | > __u8 dccps_hc_rx_insert_options:1; > | > __u8 dccps_hc_tx_insert_options:1; > | > | I know that currently none of the policies has any per-socket data. But > | if it had were should it go? > > I can't see anything wrong with putting everything into dccp_sock. To do it > well, we could consider inserting documentation such as "this section is > only for queueing policies" (as is done very well for struct tcp_sock). > Let me remind you your comment made on 18/03/2008 on dccp ml to my first patch: --- START --- @@ -545,6 +549,8 @@ struct dccp_sock { __u8 dccps_hc_tx_insert_options:1; __u8 dccps_server_timewait:1; struct timer_list dccps_xmit_timer; + struct queuing_policy *dccps_policy; + void *dccps_policy_data; }; I think this should be just one field for the policy, and the policy_data can be an internal field of `struct queueing_policy' (compare with struct ackvec or struct ccid). --- END --- > I see several advantages of this: > * by just storing the u8 of the policy ID, the socket as a whole gets > smaller, > * the internals of the function pointer table can be hidden, > That's ok with me. > * when not using nested structs, the elements can be optimised for > minimal space, I have nothing against it. In fact that's what I did in the first try. > | > + case DCCP_SOCKOPT_QPOLICY_ID: > | > + if (sk->sk_state != DCCP_CLOSED) > | > + err = -EISCONN; > | > | What about DCCP_LISTENING state? > > There is a problem with allowing this, considering for example: > (...) Ok, you are right. If before DCCP_LISTENING there is always DCCP_CLOSED then there is no need to allow changing qpolicy in DCCP_LISTENING state. > Please take your time and check through the other parts of the patch as > well. I would like to accumulate the issues and then address each of the If I didn't mention some parts of the patch you can assume that I'm ok with them. > points. And also, do some testing.I haven't had time to write test code for > prio policy -- is there something in your SVN repository? > Have a look at http://dccp.one.pl/svn/userspace/test/ And I'll also do some testing of your patch. One more question (probably more for Arnaldo): is there any timeframe to push changes from test tree upstream? -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 19:38 ` Tomasz Grobelny @ 2008-04-15 20:04 ` Arnaldo Carvalho de Melo 2008-04-17 20:20 ` Tomasz Grobelny 2008-04-15 20:14 ` inconsistent lock state with kernel 2.6.24.4 Bernard Pidoux 2008-04-16 7:43 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker 2 siblings, 1 reply; 47+ messages in thread From: Arnaldo Carvalho de Melo @ 2008-04-15 20:04 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Em Tue, Apr 15, 2008 at 09:38:40PM +0200, Tomasz Grobelny escreveu: > Dnia Tuesday 15 of April 2008, Gerrit Renker napisał: > > | > @@ -501,6 +519,8 @@ struct dccp_sock { > > | > struct ccid *dccps_hc_rx_ccid; > > | > struct ccid *dccps_hc_tx_ccid; > > | > struct dccp_options_received dccps_options_received; > > | > + __u8 dccps_qpolicy; > > | > + __u32 dccps_tx_qlen; > > | > enum dccp_role dccps_role:2; > > | > __u8 dccps_hc_rx_insert_options:1; > > | > __u8 dccps_hc_tx_insert_options:1; > > | > > | I know that currently none of the policies has any per-socket data. But > > | if it had were should it go? > > > > I can't see anything wrong with putting everything into dccp_sock. To do it > > well, we could consider inserting documentation such as "this section is > > only for queueing policies" (as is done very well for struct tcp_sock). > > > Let me remind you your comment made on 18/03/2008 on dccp ml to my first > patch: > --- START --- > @@ -545,6 +549,8 @@ struct dccp_sock { > __u8 dccps_hc_tx_insert_options:1; > __u8 dccps_server_timewait:1; > struct timer_list dccps_xmit_timer; > + struct queuing_policy *dccps_policy; > + void *dccps_policy_data; > }; > > I think this should be just one field for the policy, and the > policy_data can be an internal field of `struct queueing_policy' > (compare with struct ackvec or struct ccid). > --- END --- > > > I see several advantages of this: > > * by just storing the u8 of the policy ID, the socket as a whole gets > > smaller, > > * the internals of the function pointer table can be hidden, > > > That's ok with me. > > > * when not using nested structs, the elements can be optimised for > > minimal space, > I have nothing against it. In fact that's what I did in the first try. > > > | > + case DCCP_SOCKOPT_QPOLICY_ID: > > | > + if (sk->sk_state != DCCP_CLOSED) > > | > + err = -EISCONN; > > | > > | What about DCCP_LISTENING state? > > > > There is a problem with allowing this, considering for example: > > (...) > Ok, you are right. If before DCCP_LISTENING there is always DCCP_CLOSED then > there is no need to allow changing qpolicy in DCCP_LISTENING state. > > > Please take your time and check through the other parts of the patch as > > well. I would like to accumulate the issues and then address each of the > If I didn't mention some parts of the patch you can assume that I'm ok with > them. > > > points. And also, do some testing.I haven't had time to write test code for > > prio policy -- is there something in your SVN repository? > > > Have a look at http://dccp.one.pl/svn/userspace/test/ And I'll also do some > testing of your patch. > > One more question (probably more for Arnaldo): is there any timeframe to push > changes from test tree upstream? Not at this moment, I'm not managing to dedicate time for another round of reviewing of what is in Gerrit's tree :-( - Arnaldo ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 20:04 ` Arnaldo Carvalho de Melo @ 2008-04-17 20:20 ` Tomasz Grobelny 0 siblings, 0 replies; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-17 20:20 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Gerrit Renker, dccp, netdev Dnia Tuesday 15 of April 2008, Arnaldo Carvalho de Melo napisał: > > One more question (probably more for Arnaldo): is there any timeframe to > > push changes from test tree upstream? > > Not at this moment, I'm not managing to dedicate time for another round > of reviewing of what is in Gerrit's tree :-( > That's quite unfortunate because pushing patches that make use of new kernel features to other projects would be hard if not impossible... -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* inconsistent lock state with kernel 2.6.24.4 2008-04-15 19:38 ` Tomasz Grobelny 2008-04-15 20:04 ` Arnaldo Carvalho de Melo @ 2008-04-15 20:14 ` Bernard Pidoux 2008-04-16 7:43 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker 2 siblings, 0 replies; 47+ messages in thread From: Bernard Pidoux @ 2008-04-15 20:14 UTC (permalink / raw) To: Linux Netdev List Hi, Here is an inconsistent lock state detected with 2.6.24.4 kernel. I am not sure if it should be reported here or to kernel list ? ================================= [ INFO: inconsistent lock state ] 2.6.24.4 #7 --------------------------------- inconsistent {in-softirq-W} -> {softirq-on-W} usage. listen/3831 [HC0[0]:SC0[0]:HE1:SE1] takes: (_xmit_ETHER){-+..}, at: [<c0262cf6>] netpoll_send_skb+0x116/0x170 {in-softirq-W} state was registered at: [<c0139992>] __lock_acquire+0x402/0x11d0 [<c013a7e3>] lock_acquire+0x83/0xa0 [<c02badc3>] _spin_lock+0x33/0x60 [<c0265988>] __qdisc_run+0x128/0x1a0 [<c0257c6c>] dev_queue_xmit+0x25c/0x330 [<c8c24a5f>] mld_sendpack+0x2cf/0x2f0 [ipv6] [<c8c25beb>] mld_ifc_timer_expire+0x17b/0x260 [ipv6] [<c0120f69>] run_timer_softirq+0x149/0x1b0 [<c011d5c5>] __do_softirq+0x55/0xc0 [<c011d677>] do_softirq+0x47/0x50 [<c011dadc>] irq_exit+0x6c/0x80 [<c010663f>] do_IRQ+0x4f/0xa0 [<c0104bc2>] common_interrupt+0x2e/0x34 [<ffffffff>] 0xffffffff irq event stamp: 6133329 hardirqs last enabled at (6133329): [<c02bb3e7>] _spin_unlock_irqrestore+0x47/0 x60 hardirqs last disabled at (6133328): [<c02bb269>] _spin_lock_irqsave+0x19/0x70 softirqs last enabled at (6133322): [<c011d613>] __do_softirq+0xa3/0xc0 softirqs last disabled at (6133291): [<c011d677>] do_softirq+0x47/0x50 other info that might help us debug this: 2 locks held by listen/3831: #0: (&tty->atomic_write_lock){--..}, at: [<c0208b5c>] tty_write_lock+0x1c/0x50 #1: (target_list_lock){--..}, at: [<c8abe10d>] write_msg+0x2d/0xf0 [netconsole ] stack backtrace: Pid: 3831, comm: listen Not tainted 2.6.24.4 #7 [<c010513a>] show_trace_log_lvl+0x1a/0x30 [<c0105b52>] show_trace+0x12/0x20 [<c01064ac>] dump_stack+0x6c/0x80 [<c01379c6>] print_usage_bug+0x156/0x160 [<c0138ceb>] mark_lock+0x49b/0x630 [<c01399d8>] __lock_acquire+0x448/0x11d0 [<c013a7e3>] lock_acquire+0x83/0xa0 [<c02bab1a>] __lock_text_start+0x3a/0x50 [<c0262cf6>] netpoll_send_skb+0x116/0x170 [<c0263a7c>] netpoll_send_udp+0x20c/0x280 [<c8abe18c>] write_msg+0xac/0xf0 [netconsole] [<c01187f7>] __call_console_drivers+0x47/0x60 [<c0118889>] _call_console_drivers+0x79/0x90 [<c0118c27>] release_console_sem+0xd7/0x1f0 [<c02173c1>] do_con_write+0x161/0x1a20 [<c0218cc4>] con_write+0x14/0x30 [<c020b5c3>] write_chan+0x183/0x330 [<c0208d26>] tty_write+0x126/0x1b0 [<c017298f>] vfs_write+0xaf/0x120 [<c0172edd>] sys_write+0x3d/0x70 [<c010412a>] sysenter_past_esp+0x5f/0xa5 ======================= Clocksource tsc unstable (delta = 88012255 ns) Time: pit clocksource has been installed. Bernard Pidoux ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-15 19:38 ` Tomasz Grobelny 2008-04-15 20:04 ` Arnaldo Carvalho de Melo 2008-04-15 20:14 ` inconsistent lock state with kernel 2.6.24.4 Bernard Pidoux @ 2008-04-16 7:43 ` Gerrit Renker 2008-04-17 18:03 ` Tomasz Grobelny 2 siblings, 1 reply; 47+ messages in thread From: Gerrit Renker @ 2008-04-16 7:43 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev Quoting Tomasz Grobelny: | Dnia Tuesday 15 of April 2008, Gerrit Renker napisa?: | > | > @@ -501,6 +519,8 @@ struct dccp_sock { | > | > struct ccid *dccps_hc_rx_ccid; | > | > struct ccid *dccps_hc_tx_ccid; | > | > struct dccp_options_received dccps_options_received; | > | > + __u8 dccps_qpolicy; | > | > + __u32 dccps_tx_qlen; | > | > enum dccp_role dccps_role:2; | > | > __u8 dccps_hc_rx_insert_options:1; | > | > __u8 dccps_hc_tx_insert_options:1; | > | | > | I know that currently none of the policies has any per-socket data. But | > | if it had were should it go? | > | > I can't see anything wrong with putting everything into dccp_sock. To do it | > well, we could consider inserting documentation such as "this section is | > only for queueing policies" (as is done very well for struct tcp_sock). | > | Let me remind you your comment made on 18/03/2008 on dccp ml to my first | patch: | --- START --- | @@ -545,6 +549,8 @@ struct dccp_sock { | __u8 dccps_hc_tx_insert_options:1; | __u8 dccps_server_timewait:1; | struct timer_list dccps_xmit_timer; | + struct queuing_policy *dccps_policy; | + void *dccps_policy_data; | }; | | I think this should be just one field for the policy, and the | policy_data can be an internal field of `struct queueing_policy' | (compare with struct ackvec or struct ccid). | --- END --- | Hm, even after reading it again I still find that I don't like void* fields. It may be a personal thing, but I think using void* as part of a field is bad (and this was in an even earlier comment). I think I understand your approach better now, after going through the patches again. | Ok, you are right. If before DCCP_LISTENING there is always DCCP_CLOSED then | there is no need to allow changing qpolicy in DCCP_LISTENING state. | Yes, that's right -- the initial state is always DCCP_CLOSED, set in dccp_init_sock(). The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-16 7:43 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker @ 2008-04-17 18:03 ` Tomasz Grobelny 2008-04-17 18:29 ` Gerrit Renker 0 siblings, 1 reply; 47+ messages in thread From: Tomasz Grobelny @ 2008-04-17 18:03 UTC (permalink / raw) To: Gerrit Renker, Arnaldo Carvalho de Melo, dccp, netdev Dnia Wednesday 16 of April 2008, Gerrit Renker napisał: > Quoting Tomasz Grobelny: > | Dnia Tuesday 15 of April 2008, Gerrit Renker napisa?: > | > | > @@ -501,6 +519,8 @@ struct dccp_sock { > | > | > struct ccid *dccps_hc_rx_ccid; > | > | > struct ccid *dccps_hc_tx_ccid; > | > | > struct dccp_options_received dccps_options_received; > | > | > + __u8 dccps_qpolicy; > | > | > + __u32 dccps_tx_qlen; > | > | > enum dccp_role dccps_role:2; > | > | > __u8 dccps_hc_rx_insert_options:1; > | > | > __u8 dccps_hc_tx_insert_options:1; > | > | > | > | I know that currently none of the policies has any per-socket data. > | > | But if it had were should it go? > | > > | > I can't see anything wrong with putting everything into dccp_sock. To > | > do it well, we could consider inserting documentation such as "this > | > section is only for queueing policies" (as is done very well for struct > | > tcp_sock). > | > | Let me remind you your comment made on 18/03/2008 on dccp ml to my first > | patch: > | --- START --- > | @@ -545,6 +549,8 @@ struct dccp_sock { > | __u8 dccps_hc_tx_insert_options:1; > | __u8 dccps_server_timewait:1; > | struct timer_list dccps_xmit_timer; > | + struct queuing_policy *dccps_policy; > | + void *dccps_policy_data; > | }; > | > | I think this should be just one field for the policy, and the > | policy_data can be an internal field of `struct queueing_policy' > | (compare with struct ackvec or struct ccid). > | --- END --- > > Hm, even after reading it again I still find that I don't like void* > fields. It may be a personal thing, but I think using void* as part of a > field is bad (and this was in an even earlier comment). > The next paragraph was in fact about void* pointers. But the paragraph I quoted above talks only about whether those three values (policy numer/pointer, tx_qlen and possibly other data) should be put directly in struct dccp_sock or grouped in stuct queueing_policy which in turn should be one field in struct dccp_sock. In the mail from 18/03/2008 you seemed to be in favour of grouping, in the one from 15/04/2008 you seemed to contradict your earlier statement. At least that's how I understood it. But never mind, both ways are ok for me. -- Regards, Tomasz Grobelny ^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set 2008-04-17 18:03 ` Tomasz Grobelny @ 2008-04-17 18:29 ` Gerrit Renker 0 siblings, 0 replies; 47+ messages in thread From: Gerrit Renker @ 2008-04-17 18:29 UTC (permalink / raw) To: Tomasz Grobelny; +Cc: Arnaldo Carvalho de Melo, dccp, netdev | > | @@ -545,6 +549,8 @@ struct dccp_sock { | > | __u8 dccps_hc_tx_insert_options:1; | > | __u8 dccps_server_timewait:1; | > | struct timer_list dccps_xmit_timer; | > | + struct queuing_policy *dccps_policy; | > | + void *dccps_policy_data; | > | }; | > | | > | I think this should be just one field for the policy, and the | > | policy_data can be an internal field of `struct queueing_policy' | > | (compare with struct ackvec or struct ccid). | > | --- END --- | > | > Hm, even after reading it again I still find that I don't like void* | > fields. It may be a personal thing, but I think using void* as part of a | > field is bad (and this was in an even earlier comment). | > | The next paragraph was in fact about void* pointers. But the paragraph I | quoted above talks only about whether those three values (policy | numer/pointer, tx_qlen and possibly other data) should be put directly in | struct dccp_sock or grouped in stuct queueing_policy which in turn should be | one field in struct dccp_sock. In the mail from 18/03/2008 you seemed to be | in favour of grouping, in the one from 15/04/2008 you seemed to contradict | your earlier statement. At least that's how I understood it. | But never mind, both ways are ok for me. | -- Those things only became clearer when looking at the code for a while. I have reworked some of your userland code and done some tests with it. There was a bug with the 32-bit compatibility layer, I have posted a message to netdev. On x86 it was found to work correctly. The University of Aberdeen is a charity registered in Scotland, No SC013683. ^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread, other threads:[~2008-05-02 20:56 UTC | newest]
Thread overview: 47+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-11 10:24 [PATCH 0/5] [DCCP]: Queuing policies Tomasz Grobelny
2008-04-14 6:50 ` Gerrit Renker
2008-04-14 7:39 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker
2008-04-14 23:45 ` Tomasz Grobelny
2008-04-15 15:14 ` Gerrit Renker
2008-04-15 15:21 ` Gerrit Renker
2008-04-15 18:01 ` Tomasz Grobelny
2008-04-16 6:20 ` Gerrit Renker
2008-04-16 8:36 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker
2008-04-17 20:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny
2008-04-18 10:13 ` Gerrit Renker
2008-04-19 20:42 ` Tomasz Grobelny
2008-04-20 16:57 ` Arnaldo Carvalho de Melo
2008-04-20 20:12 ` Tomasz Grobelny
2008-04-21 11:45 ` Patrick McHardy
2008-04-21 13:12 ` Arnaldo Carvalho de Melo
2008-04-21 16:17 ` Tomasz Grobelny
2008-04-22 4:56 ` Patrick McHardy
2008-04-22 20:45 ` Tomasz Grobelny
2008-04-22 22:06 ` David Miller
2008-04-23 0:03 ` Patrick McHardy
2008-04-22 17:41 ` Gerrit Renker
2008-04-22 22:42 ` David Miller
2008-04-25 19:33 ` Tomasz Grobelny
2008-04-25 20:40 ` Arnaldo Carvalho de Melo
2008-04-25 20:58 ` David Miller
2008-04-28 7:21 ` Gerrit Renker
2008-04-28 7:39 ` David Miller
2008-04-22 17:30 ` Gerrit Renker
2008-04-22 20:30 ` Tomasz Grobelny
[not found] ` <20080424220704.0483DBC12@poczta.oswiecenia.net>
2008-04-24 22:16 ` [PATCH 1/1] [DCCP][QPOLICY]: External interface changes Tomasz Grobelny
2008-04-28 15:08 ` Gerrit Renker
2008-04-28 21:29 ` Tomasz Grobelny
2008-04-28 13:10 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker
2008-04-28 15:19 ` [DCCP] [RFC] [Patchv3 " Gerrit Renker
2008-04-28 20:12 ` Tomasz Grobelny
2008-04-28 21:03 ` [DCCP] [RFC] [Patchv2 " Tomasz Grobelny
2008-04-30 7:53 ` Gerrit Renker
2008-05-02 20:39 ` Tomasz Grobelny
2008-05-02 20:56 ` Gerrit Renker
2008-04-15 19:38 ` Tomasz Grobelny
2008-04-15 20:04 ` Arnaldo Carvalho de Melo
2008-04-17 20:20 ` Tomasz Grobelny
2008-04-15 20:14 ` inconsistent lock state with kernel 2.6.24.4 Bernard Pidoux
2008-04-16 7:43 ` [DCCP] [RFC] [Patchv2 1/1]: Queuing policies -- reworked version of Tomasz's patch set Gerrit Renker
2008-04-17 18:03 ` Tomasz Grobelny
2008-04-17 18:29 ` Gerrit Renker
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).