* libnetfilter_queue question
@ 2011-05-04 6:14 nowhere
2011-05-04 18:13 ` Alessandro Vesely
0 siblings, 1 reply; 10+ messages in thread
From: nowhere @ 2011-05-04 6:14 UTC (permalink / raw)
To: netfilter
[-- Attachment #1: Type: text/plain, Size: 2687 bytes --]
Hello, dear list,
I'm now experimenting with the subject, and I have a problem which I cannot
solve.
There is a simple application which does the following:
1. Registers a queue (or several queues)
2. Reads metadata from nfqueue in a cycle
3. Spreads it into multiple software queues, one queue per nfqueue
4. Worker threads apply some delay according to a distribution law
5. Worker threads accept packets, which (if I understand correctly)
still reside in the kernel netfilter queue
I have done all the steps to allow for larger kernel queues:
- sysctl net.core.{r,w}mem{_max,}=16M
- tc qdisc add dev eth0 root pfifo limit 20000
With all of these settings I see no drops on the interfaces, the interface
queue or the netfilter queue (in /proc/net/netfilter/nfnetlink_queue).
Then I do the following to test the setup:
iptables -t mangle -A POSTROUTING -p icmp -d 10.77.130.72 -j NFQUEUE
--queue-num 1
and then start ping. If I do a normal ping, everything works as expected
$ping 10.77.130.72
PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=97.0 ms
64 bytes from 10.77.130.72: icmp_req=2 ttl=64 time=97.1 ms
64 bytes from 10.77.130.72: icmp_req=3 ttl=64 time=97.6 ms
64 bytes from 10.77.130.72: icmp_req=4 ttl=64 time=93.6 ms
64 bytes from 10.77.130.72: icmp_req=5 ttl=64 time=101 ms
64 bytes from 10.77.130.72: icmp_req=6 ttl=64 time=94.8 ms
Packets are passed to the target host and the delay is applied. Stats from
the application and from the iptables counters show consistent figures.
But when I issue flood ping I see this:
$ sudo ping 10.77.130.72 -i0
PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=111 ms
64 bytes from 10.77.130.72: icmp_req=8 ttl=64 time=118 ms
64 bytes from 10.77.130.72: icmp_req=9 ttl=64 time=114 ms
64 bytes from 10.77.130.72: icmp_req=10 ttl=64 time=104 ms
64 bytes from 10.77.130.72: icmp_req=11 ttl=64 time=93.5 ms
64 bytes from 10.77.130.72: icmp_req=12 ttl=64 time=93.9 ms
64 bytes from 10.77.130.72: icmp_req=13 ttl=64 time=94.3 ms
64 bytes from 10.77.130.72: icmp_req=14 ttl=64 time=101 ms
64 bytes from 10.77.130.72: icmp_req=15 ttl=64 time=96.8 ms
There are 7 packets dropped at the beginning. I see the same results when
testing with iperf: several packets at the beginning get lost.
The iptables counters show that the NFQUEUE rule has processed all the packets
(15 in this example), the app's debug output shows 15 processed packets,
the nfqueue stats show no drops, and tc -s -d qdisc show dev eth0 shows no
drops in the interface queue. But tcpdump has caught only 9 packets, on both
the remote and the local host.
The app's source code is attached. Maybe I'm doing something wrong in it?
Thanks
[-- Attachment #2: main.c --]
[-- Type: text/x-csrc, Size: 5324 bytes --]
#include <libnetfilter_queue/libnetfilter_queue.h>
#include <netinet/in.h>
#include <linux/netfilter.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <sys/time.h>
#include <time.h>
#include <math.h>
#include <pthread.h>
#include <sys/queue.h>
#define QUEUE_NUM 2
#define BL 65536
struct queued_pkt {
char *qp_payload;
int qp_pktlen;
int qp_id;
struct timeval qp_recv;
STAILQ_ENTRY(queued_pkt) entries;
};
struct queue_data {
struct nfq_q_handle* q_handle;
int q_id;
pthread_t q_thread;
pthread_mutex_t q_mutex;
pthread_cond_t q_condvar;
int q_delay;
int q_jitter;
double q_mu;
double q_sigma;
double q_xsi;
STAILQ_HEAD(,queued_pkt) q_head;
};
void* worker_thread (void*);
int callback (struct nfq_q_handle*,
struct nfgenmsg*,
struct nfq_data*, void*);
int main() {
int rv, fd, i;
struct nfq_handle *h;
struct queue_data *queues, *q;
char *buf;
queues = (struct queue_data*) calloc (QUEUE_NUM, sizeof (struct queue_data));
buf = (char*) malloc (BL);
if (!(h = nfq_open())) {
perror ("open handle");
exit (1);
}
if (nfq_unbind_pf (h, AF_INET) < 0) {
perror ("unbind NFQUEUE");
nfq_close (h);
exit (1);
}
if (nfq_bind_pf (h, AF_INET) < 0) {
perror ("bind nfnetlink");
nfq_close (h);
exit (1);
}
for (i=0; i<QUEUE_NUM; i++) {
q = queues + i;
STAILQ_INIT (&(q->q_head));
q->q_id = i;
pthread_mutex_init (&(q->q_mutex), NULL);
pthread_cond_init (&(q->q_condvar), NULL);
q->q_xsi = .25;
q->q_delay = 100;
q->q_jitter = 10;
q->q_sigma = ((double) q->q_jitter / 1000.) * (1. - q->q_xsi) * sqrt (1. - 2. * q->q_xsi);
q->q_mu = ((double) q->q_delay / 1000.) - q->q_sigma / (1. - q->q_xsi);
fprintf (stderr, "Queue %d: xsi %.3f, sigma %.3f, mu %.3f\n", i, q->q_xsi, q->q_sigma, q->q_mu);
if (!(q->q_handle = nfq_create_queue (h, i, &callback, q))) {
perror ("create queue");
nfq_close (h);
exit (1);
}
if (nfq_set_mode (q->q_handle, NFQNL_COPY_META, 0) < 0) {
perror ("set mode");
nfq_destroy_queue (q->q_handle);
nfq_close (h);
exit (1);
}
nfq_set_queue_maxlen (q->q_handle, 20240);
pthread_create (&(q->q_thread), NULL, &worker_thread, (void*) q);
}
fd = nfq_fd (h);
while (1) {
rv = recv (fd, buf, BL, MSG_TRUNC);
if (rv < 0 && errno == EINTR) continue;
if (rv > BL) {
fprintf (stderr, "No space\n");
continue;
}
nfq_handle_packet (h, buf, rv);
}
//nfq_destroy_queue (qh);
//nfq_unbind_pf (h, AF_INET);
nfq_close (h);
free (buf);
return 0;
}
int callback (struct nfq_q_handle *qh,
struct nfgenmsg *nfmsg,
struct nfq_data *nfad, void *data) {
struct queue_data *queue = (struct queue_data*) data;
struct queued_pkt *pkt;
char *pl;
struct nfqnl_msg_packet_hdr *ph;
pkt = (struct queued_pkt*) calloc (1, sizeof (struct queued_pkt));
if (!(ph = nfq_get_msg_packet_hdr (nfad))) {
perror ("get hdr");
return 0;
}
pkt->qp_id = ntohl (ph->packet_id); /* packet id arrives in network byte order */
gettimeofday (&pkt->qp_recv, NULL);
/* if ((pkt->qp_pktlen = nfq_get_payload (nfad, &pl)) > 0) {
pkt->qp_payload = (char*) malloc (pkt->qp_pktlen);
memcpy (pkt->qp_payload, pl, pkt->qp_pktlen);
}*/
pthread_mutex_lock (&(queue->q_mutex));
STAILQ_INSERT_TAIL (&(queue->q_head), pkt, entries);
pthread_cond_signal (&(queue->q_condvar));
pthread_mutex_unlock (&(queue->q_mutex));
return 0;
}
void* worker_thread (void *data) {
struct queue_data *queue = (struct queue_data*) data;
struct queued_pkt *pkt;
struct timeval cur_time;
struct timespec deq_time;
double real_delay;
double cur_ts, deq_ts, recv_ts;
char *buf;
while (1) {
pthread_mutex_lock (&(queue->q_mutex));
while (STAILQ_EMPTY (&(queue->q_head)))
pthread_cond_wait (&(queue->q_condvar), &(queue->q_mutex));
/* Dequeue packet */
pkt = STAILQ_FIRST (&(queue->q_head));
STAILQ_REMOVE_HEAD (&(queue->q_head), entries);
pthread_mutex_unlock (&(queue->q_mutex));
real_delay = queue->q_mu + queue->q_sigma * (pow ((double) (random()) / (double) RAND_MAX, -1. * queue->q_xsi) - 1.) /
queue->q_xsi;
//real_delay = 100e-3;
recv_ts = (double) pkt->qp_recv.tv_sec + (double) pkt->qp_recv.tv_usec / 1000000.;
deq_ts = recv_ts + real_delay;
gettimeofday (&cur_time, NULL);
cur_ts = (double) cur_time.tv_sec + (double) cur_time.tv_usec / 1000000.;
real_delay = deq_ts - cur_ts;
printf ("Queue %d Packet ID %d, REC (%ld, %ld) delay %.3fus\n", queue->q_id, pkt->qp_id,
(long) pkt->qp_recv.tv_sec, (long) pkt->qp_recv.tv_usec, real_delay * 1000000.);
if (real_delay > 0) {
deq_time.tv_sec = (time_t) real_delay;
deq_time.tv_nsec = (long) ((real_delay - (double) deq_time.tv_sec) * 1000000000.);
nanosleep (&deq_time, NULL);
}
if (pkt->qp_pktlen) {
nfq_set_verdict (queue->q_handle, pkt->qp_id, NF_ACCEPT, pkt->qp_pktlen, pkt->qp_payload);
free (pkt->qp_payload);
} else
nfq_set_verdict (queue->q_handle, pkt->qp_id, NF_ACCEPT, 0, NULL);
free (pkt);
}
return 0;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: libnetfilter_queue question
2011-05-04 6:14 libnetfilter_queue question nowhere
@ 2011-05-04 18:13 ` Alessandro Vesely
2011-05-04 18:32 ` Nikolay S.
0 siblings, 1 reply; 10+ messages in thread
From: Alessandro Vesely @ 2011-05-04 18:13 UTC (permalink / raw)
To: netfilter
On 04.05.2011 08:14, nowhere wrote:
> 5. Worker threads accept packets, which (if I understand correctly)
> still reside in kernel netfilter queue
Part of them is copied to user space (no payload, only metadata,
according to your use of nfq_set_mode).
> Then I do the following to test the setup:
> iptables -t mangle -A POSTROUTING -p icmp -d 10.77.130.72 -j NFQUEUE
> --queue-num 1
>
> and then start ping. If i do normal ping, everything works like expected
>
> $ping 10.77.130.72
> PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
> 64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=97.0 ms
> 64 bytes from 10.77.130.72: icmp_req=2 ttl=64 time=97.1 ms
> 64 bytes from 10.77.130.72: icmp_req=3 ttl=64 time=97.6 ms
> 64 bytes from 10.77.130.72: icmp_req=4 ttl=64 time=93.6 ms
> 64 bytes from 10.77.130.72: icmp_req=5 ttl=64 time=101 ms
> 64 bytes from 10.77.130.72: icmp_req=6 ttl=64 time=94.8 ms
>
> Packets are passed to the target host, delay is applied. Stats from
> application and from iptables counters show consistent figures.
>
> But when I issue flood ping I see this:
> $ sudo ping 10.77.130.72 -i0
> PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
> 64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=111 ms
> 64 bytes from 10.77.130.72: icmp_req=8 ttl=64 time=118 ms
> 64 bytes from 10.77.130.72: icmp_req=9 ttl=64 time=114 ms
> 64 bytes from 10.77.130.72: icmp_req=10 ttl=64 time=104 ms
> 64 bytes from 10.77.130.72: icmp_req=11 ttl=64 time=93.5 ms
> 64 bytes from 10.77.130.72: icmp_req=12 ttl=64 time=93.9 ms
> 64 bytes from 10.77.130.72: icmp_req=13 ttl=64 time=94.3 ms
> 64 bytes from 10.77.130.72: icmp_req=14 ttl=64 time=101 ms
> 64 bytes from 10.77.130.72: icmp_req=15 ttl=64 time=96.8 ms
>
> There are 7 packets dropped at the beginning.
I assume you meant 6 (15 - 9)
> Several packets at the beginning get lost.
Are they always at the beginning, or does that depend on the distribution of
delays?
> iptables counters show, that NFQUEUE rule has processed all the packets
> (15 in this example), app debug output shows 15 processed packets,
They were seen, but it seems the verdict didn't arrive in time for 6 of them.
> nfqueue stat show no drops, tc -s -d qdisc show dev eth0 shows no drops
> in the interface queue. But tcpdump has caught only 9 packets on remote
> and on local hosts.
The relationship between the filter and local tcpdump is not always obvious,
in my experience. Perhaps your choice of table/chain makes it better. Anyway,
the remote host cannot get it wrong. Did you dump requests, responses, or both?
> There is app's source code here. Maybe, I'm doing something wrong in it?
I see nothing wrong in it. However, I'd print out occurrences of rv < 0
after recv() and look for errno == ENOBUFS in particular. It should report
lost packets.
hth
* Re: libnetfilter_queue question
2011-05-04 18:13 ` Alessandro Vesely
@ 2011-05-04 18:32 ` Nikolay S.
2011-05-05 9:12 ` Alessandro Vesely
0 siblings, 1 reply; 10+ messages in thread
From: Nikolay S. @ 2011-05-04 18:32 UTC (permalink / raw)
To: Alessandro Vesely; +Cc: netfilter
On Wed, 04/05/2011 at 20:13 +0200, Alessandro Vesely wrote:
> On 04.05.2011 08:14, nowhere wrote:
> > 5. Worker threads accept packets, which (if I understand correctly)
> > still reside in kernel netfilter queue
>
> Part of them are copied to user's space (no payload but only metadata,
> according to your use of nfq_set_mode).
>
> > Then I do the following to test the setup:
> > iptables -t mangle -A POSTROUTING -p icmp -d 10.77.130.72 -j NFQUEUE
> > --queue-num 1
> >
> > and then start ping. If i do normal ping, everything works like expected
> >
> > $ping 10.77.130.72
> > PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
> > 64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=97.0 ms
> > 64 bytes from 10.77.130.72: icmp_req=2 ttl=64 time=97.1 ms
> > 64 bytes from 10.77.130.72: icmp_req=3 ttl=64 time=97.6 ms
> > 64 bytes from 10.77.130.72: icmp_req=4 ttl=64 time=93.6 ms
> > 64 bytes from 10.77.130.72: icmp_req=5 ttl=64 time=101 ms
> > 64 bytes from 10.77.130.72: icmp_req=6 ttl=64 time=94.8 ms
> >
> > Packets are passed to the target host, delay is applied. Stats from
> > application and from iptables counters show consistent figures.
> >
> > But when I issue flood ping I see this:
> > $ sudo ping 10.77.130.72 -i0
> > PING 10.77.130.72 (10.77.130.72) 56(84) bytes of data.
> > 64 bytes from 10.77.130.72: icmp_req=1 ttl=64 time=111 ms
> > 64 bytes from 10.77.130.72: icmp_req=8 ttl=64 time=118 ms
> > 64 bytes from 10.77.130.72: icmp_req=9 ttl=64 time=114 ms
> > 64 bytes from 10.77.130.72: icmp_req=10 ttl=64 time=104 ms
> > 64 bytes from 10.77.130.72: icmp_req=11 ttl=64 time=93.5 ms
> > 64 bytes from 10.77.130.72: icmp_req=12 ttl=64 time=93.9 ms
> > 64 bytes from 10.77.130.72: icmp_req=13 ttl=64 time=94.3 ms
> > 64 bytes from 10.77.130.72: icmp_req=14 ttl=64 time=101 ms
> > 64 bytes from 10.77.130.72: icmp_req=15 ttl=64 time=96.8 ms
> >
> > There are 7 packets dropped at the beginning.
>
> I assume you meant 6 (15 - 9)
Yes :)
>
> > Several packets at the beginning get lost.
>
> Are they always at the beginning, or does that depend on the distribution of
> delays?
Indeed. The first packet is never dropped, then comes a series of drops
(the number of dropped packets depends on the sending rate; e.g. testing
with iperf at, say, 50 Mbit/s shows drops of ~800 packets), and after
that no drops at all. The distribution and its parameters do not matter
except for zeroes: if there is no artificial delay, no packets are
dropped.
>
> > iptables counters show, that NFQUEUE rule has processed all the packets
> > (15 in this example), app debug output shows 15 processed packets,
>
> They were seen, but it seems the verdict didn't arrive in time for 6 of them.
>
> > nfqueue stat show no drops, tc -s -d qdisc show dev eth0 shows no drops
> > in the interface queue. But tcpdump has caught only 9 packets on remote
> > and on local hosts.
>
> The relationship between filter and local tcpdump is not always obvious, IME.
> Perhaps, your choice of table/chain makes it better. Anyway, the remote
> host cannot get it wrong. Did you dump requests, responses, or both?
Requests only. Do you recommend queuing from another table/chain?
I tried OUTPUT in the filter table, but did not see any difference...
>
> > There is app's source code here. Maybe, I'm doing something wrong in it?
>
> I see nothing wrong in it. However, I'd print out occurrences of rv < 0
> after recv() and look for errno==ENOBUFS in particular. It should report
> lost packets
Yes, I did that (actually it was one of the first checks). There are no
situations where rv < 0.
> .
>
> hth
> --
> To unsubscribe from this list: send the line "unsubscribe netfilter" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: libnetfilter_queue question
2011-05-04 18:32 ` Nikolay S.
@ 2011-05-05 9:12 ` Alessandro Vesely
2011-05-05 9:24 ` nowhere
0 siblings, 1 reply; 10+ messages in thread
From: Alessandro Vesely @ 2011-05-05 9:12 UTC (permalink / raw)
To: netfilter
On 04.05.2011 20:32, Nikolay S. wrote:
> On Wed, 04/05/2011 at 20:13 +0200, Alessandro Vesely wrote:
>> On 04.05.2011 08:14, nowhere wrote:
>>> Several packets at the beginning get lost.
>>
>> Are they always at the beginning, or does that depend on the distribution of
>> delays?
>
> Indeed. The first packet is never dropped, then comes a series of drops
> (the number of dropped packets depends on the sending rate, i.e. testing
> with iperf on, say, 50 Mbit/s shows drops of ~800 packets) and after
> that no drops at all. Distribution and its parameters do not matter
> except for zeroes: if there is no artificial delay, no packets are
> dropped.
That looks pretty reproducible. I'll give your code a try when I get
back to my place.
>> I see nothing wrong in it. However, I'd print out occurrences of rv < 0
>> after recv() and look for errno==ENOBUFS in particular. It should report
>> lost packets
>
> Yes, I did it (actually this was one of the first checks). There are no
> situations when rv < 0.
Did you check return codes from nfq_set_verdict()? If that is 0, it must be
a bug. What versions of library and kernel are you using?
* Re: libnetfilter_queue question
2011-05-05 9:12 ` Alessandro Vesely
@ 2011-05-05 9:24 ` nowhere
2011-05-11 17:27 ` NFQUEUE loses packets between arrival and verdict Alessandro Vesely
0 siblings, 1 reply; 10+ messages in thread
From: nowhere @ 2011-05-05 9:24 UTC (permalink / raw)
To: Alessandro Vesely; +Cc: netfilter
On Thu, 05/05/2011 at 11:12 +0200, Alessandro Vesely wrote:
> On 04.05.2011 20:32, Nikolay S. wrote:
> > On Wed, 04/05/2011 at 20:13 +0200, Alessandro Vesely wrote:
> >> On 04.05.2011 08:14, nowhere wrote:
> >>> Several packets at the beginning get lost.
> >>
> >> Are they always at the beginning, or does that depend on the distribution of
> >> delays?
> >
> > Indeed. The first packet is never dropped, then comes a series of drops
> > (the number of dropped packets depends on the sending rate, i.e. testing
> > with iperf on, say, 50 Mbit/s shows drops of ~800 packets) and after
> > that no drops at all. Distribution and its parameters do not matter
> > except for zeroes: if there is no artificial delay, no packets are
> > dropped.
>
> Looks like pretty reproducible. I'll have a try with your code when I get
> back to my place.
>
> >> I see nothing wrong in it. However, I'd print out occurrences of rv < 0
> >> after recv() and look for errno==ENOBUFS in particular. It should report
> >> lost packets
> >
> > Yes, I did it (actually this was one of the first checks). There are no
> > situations when rv < 0.
>
> Did you check return codes from nfq_set_verdict()? If that is 0, it must be
> a bug. What versions of library and kernel are you using?
nfq_set_verdict() returns 32
I'm using Gentoo x86_64 v2.6.38-gentoo-r4 (2.6.38.5 + minor patches).
libnetfilter_queue is 0.0.17
>
* NFQUEUE loses packets between arrival and verdict
2011-05-05 9:24 ` nowhere
@ 2011-05-11 17:27 ` Alessandro Vesely
2011-05-11 22:56 ` Ed W
2011-05-12 9:40 ` nowhere
0 siblings, 2 replies; 10+ messages in thread
From: Alessandro Vesely @ 2011-05-11 17:27 UTC (permalink / raw)
To: nowhere; +Cc: netfilter
Finally I've found some time to try that. Sorry for the delay.
On 05/May/11 11:24, nowhere wrote:
>>> Indeed. The first packet is never dropped, then comes a series of drops
>>> (the number of dropped packets depends on the sending rate, i.e. testing
>>> with iperf on, say, 50 Mbit/s shows drops of ~800 packets) and after
>>> that no drops at all. Distribution and its parameters do not matter
>>> except for zeroes: if there is no artificial delay, no packets are
>>> dropped.
It seems enough to avoid delaying the call to nfq_set_verdict for the
first packet of a burst. As a shot in the dark: packets seem to get
lost if they arrive between the first one and the corresponding call
to nfq_set_verdict. Indeed, setting a fixed real_delay of 0.2,
ping -i 0.2 loses no packets, ping -i 0.19 loses just the second one,
and ping -i 0.09 loses icmp_reqs #2 and #3.
No error is returned, whether NETLINK_NO_ENOBUFS is set or not.
>> Did you check return codes from nfq_set_verdict()? If that is 0, it must be
>> a bug. I meant >= 0 here ----------------^
>
> nfq_set_verdict() returns 32
AFAIK the library does not queue data, so I'd guess the bug is in the
kernel. I hope someone else chimes in and explains more of this.
(I am changing the subject to try to draw attention.)
> I'm using Gentoo x86_64 v2.6.38-gentoo-r4 (2.6.38.5 + minor patches).
> libnetfilter_queue is 0.0.17
Same on Debian x86_64 2.6.32-something, and libnetfilter_queue 0.0.17
* Re: NFQUEUE loses packets between arrival and verdict
2011-05-11 17:27 ` NFQUEUE loses packets between arrival and verdict Alessandro Vesely
@ 2011-05-11 22:56 ` Ed W
2011-05-12 9:40 ` nowhere
1 sibling, 0 replies; 10+ messages in thread
From: Ed W @ 2011-05-11 22:56 UTC (permalink / raw)
To: Alessandro Vesely; +Cc: nowhere, netfilter
On 11/05/2011 18:27, Alessandro Vesely wrote:
> For a shot in the dark, packets seem to get
> lost if they arrive between the first one and the corresponding call
> to nfq_set_verdict.
I know absolutely zero about your problem, but the bit above reads
exactly like a buffer overflow somewhere?
Good luck
Ed W
* Re: NFQUEUE loses packets between arrival and verdict
2011-05-11 17:27 ` NFQUEUE loses packets between arrival and verdict Alessandro Vesely
2011-05-11 22:56 ` Ed W
@ 2011-05-12 9:40 ` nowhere
2011-05-12 18:03 ` NFQUEUE the plot is growing Alessandro Vesely
1 sibling, 1 reply; 10+ messages in thread
From: nowhere @ 2011-05-12 9:40 UTC (permalink / raw)
To: Alessandro Vesely; +Cc: netfilter
On Wed, 11/05/2011 at 19:27 +0200, Alessandro Vesely wrote:
> Finally I've found some time to try that. Sorry for the delay.
>
> On 05/May/11 11:24, nowhere wrote:
> >>> Indeed. The first packet is never dropped, then comes a series of drops
> >>> (the number of dropped packets depends on the sending rate, i.e. testing
> >>> with iperf on, say, 50 Mbit/s shows drops of ~800 packets) and after
> >>> that no drops at all. Distribution and its parameters do not matter
> >>> except for zeroes: if there is no artificial delay, no packets are
> >>> dropped.
>
> It seems enough to avoid delaying the call to nfq_set_verdict for the
> first packet of a burst. For a shot in the dark, packets seem to get
> lost if they arrive between the first one and the corresponding call
> to nfq_set_verdict. Indeed, setting a fixed real_delay of 0.2, with
> ping -i 0.2 it loses no packets, with ping -i 0.19 it loses just the
> second one, with ping -i 0.09 icmp_reqs #2 and #3.
>
> No error is returned, whether NETLINK_NO_ENOBUFS is set or not.
Well, it seems this is the case. If the nfqueue becomes empty, the first
enqueued packet must not be delayed. Adding a queue-length check and
skipping the delay for the first packet eliminates the drops.
But... the delay may be due to packet processing, and hence unavoidable.
I mean, this "workaround" is not really a workaround...
> >> Did you check return codes from nfq_set_verdict()? If that is 0, it must be
> >> a bug. I meant >= 0 here ----------------^
> >
> > nfq_set_verdict() returns 32
>
> AFAIK the library does not queue data, so I'd guess the bug is in the
> kernel. I hope someone else chimes in and explains some more of this.
> (I change the subject trying to draw attention.)
>
> > I'm using Gentoo x86_64 v2.6.38-gentoo-r4 (2.6.38.5 + minor patches).
> > libnetfilter_queue is 0.0.17
>
> Same on Debian x86_64 2.6.32-something, and libnetfilter_queue 0.0.17
* Re: NFQUEUE the plot is growing...
2011-05-12 9:40 ` nowhere
@ 2011-05-12 18:03 ` Alessandro Vesely
2011-05-13 18:25 ` Nikolay S.
0 siblings, 1 reply; 10+ messages in thread
From: Alessandro Vesely @ 2011-05-12 18:03 UTC (permalink / raw)
To: nowhere; +Cc: netfilter
[-- Attachment #1: Type: text/plain, Size: 1720 bytes --]
On 12/May/11 11:40, nowhere wrote:
>> It seems enough to avoid delaying the call to nfq_set_verdict for the
>> first packet of a burst. For a shot in the dark, packets seem to get
>> lost if they arrive between the first one and the corresponding call
>> to nfq_set_verdict. Indeed, setting a fixed real_delay of 0.2, with
>> ping -i 0.2 it loses no packets, with ping -i 0.19 it loses just the
>> second one, with ping -i 0.09 icmp_reqs #2 and #3.
>>
>> No error is returned, whether NETLINK_NO_ENOBUFS is set or not.
>
> Well, seems like this is the case. If nfqueue becomes empty, first
> enqueued packet must not be delayed.
I retract that; possibly I've been too hasty in blaming nfnetlink queue. I
made a simple variation of nfqnl_test.c, which I attach. It just
accepts the previous packet id. The "last" packet is obviously always
lost. Because of this bug(?), I also lose the second packet of a
sequence of pings, no matter the speed.
However, if I "ping -c 1" using two terminal windows, I correctly
receive all odd ids in one window and even ones in the other (except
the last packet). In this case, I delay every packet. Also, if I run a
sequence from one window and, immediately after it starts, run a single
ping in the other window, then both the single ping and the
sequence (except the last packet) go through correctly.
I don't understand how the kernel+filter system can distinguish
between a second packet coming as part of a sequence and a second
packet coming asynchronously, given that the packets are not inspected.
Nice puzzle, isn't it?
NB, I used iptables -t mangle -A POSTROUTING -p icmp -d 172.25.197.158
-j NFQUEUE --queue-num 13, as in
http://www.spinics.net/lists/netfilter/msg50829.html
[-- Attachment #2: main2.c --]
[-- Type: text/plain, Size: 3505 bytes --]
/*
$ ping -c 8 172.25.197.158
PING 172.25.197.158 (172.25.197.158) 56(84) bytes of data.
64 bytes from 172.25.197.158: icmp_req=1 ttl=128 time=1010 ms
64 bytes from 172.25.197.158: icmp_req=3 ttl=128 time=1000 ms
64 bytes from 172.25.197.158: icmp_req=4 ttl=128 time=1000 ms
64 bytes from 172.25.197.158: icmp_req=5 ttl=128 time=999 ms
64 bytes from 172.25.197.158: icmp_req=6 ttl=128 time=1000 ms
64 bytes from 172.25.197.158: icmp_req=7 ttl=128 time=1010 ms
--- 172.25.197.158 ping statistics ---
8 packets transmitted, 6 received, 25% packet loss, time 7020ms
rtt min/avg/max/mdev = 999.452/1003.574/1010.633/4.883 ms, pipe 2
# grep -E '^ *13' /proc/net/netfilter/nfnetlink_queue
13 13584 1 1 0 0 0 8 1
^queue_num,
^peer_pid,
^queue_total,
^copy_mode,
^copy_range,
^queue_dropped,
^queue_user_dropped,
^id_sequence,
^1
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <linux/types.h>
#include <linux/netfilter.h> /* for NF_ACCEPT */
#include <libnetfilter_queue/libnetfilter_queue.h>
#include <errno.h>
struct packet_data
{
u_int32_t id, good;
};
static int cb(struct nfq_q_handle *qh, struct nfgenmsg *nu,
struct nfq_data *nfa, void *data)
{
struct packet_data *pd = (struct packet_data*)data;
struct nfqnl_msg_packet_hdr *ph = nfq_get_msg_packet_hdr(nfa);
if (pd && ph)
{
if (pd->good) // since last time
nfq_set_verdict(qh, pd->id, NF_ACCEPT, 0, NULL);
pd->id = ntohl(ph->packet_id);
pd->good = 1;
printf("received packet %u\n", pd->id);
}
return 0;
(void)nu;
}
int main()
{
struct nfq_handle *h;
struct nfq_q_handle *qh;
struct packet_data pd;
int fd;
int rv;
char buf[4096] __attribute__ ((aligned));
printf("opening library handle\n");
h = nfq_open();
if (!h) {
fprintf(stderr, "error during nfq_open()\n");
exit(1);
}
printf("unbinding existing nf_queue handler for AF_INET (if any)\n");
if (nfq_unbind_pf(h, AF_INET) < 0) {
fprintf(stderr, "error during nfq_unbind_pf()\n");
exit(1);
}
printf("binding nfnetlink_queue as nf_queue handler for AF_INET\n");
if (nfq_bind_pf(h, AF_INET) < 0) {
fprintf(stderr, "error during nfq_bind_pf()\n");
exit(1);
}
pd.good = 0;
qh = nfq_create_queue(h, 13 /* <--queue number here */, &cb, &pd);
if (!qh) {
fprintf(stderr, "error during nfq_create_queue()\n");
exit(1);
}
if (nfq_set_mode(qh, NFQNL_COPY_META, 0) < 0) {
fprintf(stderr, "can't set packet_copy mode\n");
exit(1);
}
fd = nfq_fd(h);
for (;;) {
if ((rv = recv(fd, buf, sizeof(buf), 0)) >= 0) {
nfq_handle_packet(h, buf, rv);
continue;
}
/* if the computer is slower than the network the buffer
* may fill up. Depending on the application, this error
* may be ignored */
if (errno == ENOBUFS) {
printf("pkt lost!!\n");
continue;
}
printf("recv failed: errno=%d (%s)\n",
errno, strerror(errno));
}
printf("unbinding from queue 0\n");
nfq_destroy_queue(qh);
#ifdef INSANE
/* normally, applications SHOULD NOT issue this command, since
* it detaches other programs/sockets from AF_INET, too ! */
printf("unbinding from AF_INET\n");
nfq_unbind_pf(h, AF_INET);
#endif
printf("closing library handle\n");
nfq_close(h);
exit(0);
}
* Re: NFQUEUE the plot is growing...
2011-05-12 18:03 ` NFQUEUE the plot is growing Alessandro Vesely
@ 2011-05-13 18:25 ` Nikolay S.
0 siblings, 0 replies; 10+ messages in thread
From: Nikolay S. @ 2011-05-13 18:25 UTC (permalink / raw)
To: Alessandro Vesely; +Cc: netfilter
On Thu, 12/05/2011 at 20:03 +0200, Alessandro Vesely wrote:
> On 12/May/11 11:40, nowhere wrote:
> >> It seems enough to avoid delaying the call to nfq_set_verdict for the
> >> first packet of a burst. For a shot in the dark, packets seem to get
> >> lost if they arrive between the first one and the corresponding call
> >> to nfq_set_verdict. Indeed, setting a fixed real_delay of 0.2, with
> >> ping -i 0.2 it loses no packets, with ping -i 0.19 it loses just the
> >> second one, with ping -i 0.09 icmp_reqs #2 and #3.
> >>
> >> No error is returned, whether NETLINK_NO_ENOBUFS is set or not.
> >
> > Well, seems like this is the case. If nfqueue becomes empty, first
> > enqueued packet must not be delayed.
>
> I retract, possibly I've been too hasty blaming nfnetlink queue. I
> made a simple variation of nfqnl_test.c --which I attach. It just
> accepts the previous packet id. The "last" packet is obviously always
> lost. Because of this bug(?), I also lose the second packet of a
> sequence of pings, no matter the speed.
>
> However, if I "ping -c 1" using two terminal windows, I correctly
> receive all odd ids in one window and even ones in the other (except
> last pkt). In this case, I delay every packet. Also, if I run a
> sequence from a window, and, immediately after it starts, run a single
> ping using the other window, then both the single ping and the
> sequence (except last pkt) go correctly through.
>
> I don't understand how come the kernel+filter system can distinguish
> between a second packet coming as part of a sequence and a second
> packet coming asynchronously, given that packets are not inspected.
> Nice puzzle, isn't it?
>
>
> NB, I used iptables -t mangle -A POSTROUTING -p icmp -d 172.25.197.158
> -j NFQUEUE --queue-num 13, as in
> http://www.spinics.net/lists/netfilter/msg50829.html
Hi there,
There is a case that looks similar to ours:
http://marc.info/?l=netfilter-devel&m=129016166319433&w=2
So I tried putting the iptables rule in the raw table (AFAIK it is
traversed before conntrack, so delayed packets are not tracked before
entering the queue) -- and the problem is solved; there are no drops anymore.