public inbox for b.a.t.m.a.n@lists.open-mesh.org
* [B.A.T.M.A.N.] originator tq_avg oscilations
From: Andrew Lunn @ 2009-06-23 15:06 UTC
  To: B.A.T.M.A.N

Hi Folks

I've been playing with B.A.T.M.A.N advanced for a few weeks now. 

One of the scenarios where we might want to use it is nomadic
vehicles. Each vehicle has a wifi radio which is used to build a mesh
between the vehicles when the vehicles are parked together at a
location. I said the vehicles are nomadic. By that i mean they tend to
stay in one place for a while, and then move on. They can move
individually, or in groups. For the moment we are not interested in
meshing while on the move. 

I've run into a "problem" when one vehicle/node moves away from the
rest of the vehicles/nodes. It is taking a long time for the mesh to
realize the node has gone and rebuild the mesh. With a 500ms
orig_interval in our little test network, pings get lost for an
average of 26 seconds. The variation is great, sometimes it reroutes
in 10 seconds, sometimes in 50 seconds. 

So i want to improve this. The first thing i did was make some plots
of the tq_avg value, as shown in /proc/net/batman-adv/originators. I
look at one particular originator and plot the different tq_avg values
for the list of neighbors. Attached is a PNG image showing this.

I was surprised to see that it's unstable and oscillating. The
different tq_avg values mostly oscillate together, as shown in the
figure, but i've also seen cases when they oscillate 180 degrees out
of phase. In that case, the routing flips on the crossovers.

Is this normal? Is it expected behavior?

Has anybody worked on making re-routing around disappeared nodes
faster?

        Thanks
                Andrew


* Re: [B.A.T.M.A.N.] originator tq_avg oscilations
From: Marek Lindner @ 2009-06-23 20:00 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking


Hi,

> I've been playing with B.A.T.M.A.N advanced for a few weeks now.

welcome in the jungle.  :-)


> I've run into a "problem" when one vehicle/node moves away from the
> rest of the vehicles/nodes. It is taking a long time for the mesh to
> realize the node has gone and rebuild the mesh. With a 500ms
> orig_interval in our little test network, pings get lost for an
> average of 26 seconds. The variation is great, sometimes it reroutes
> in 10 seconds, sometimes in 50 seconds.

This is a known issue. It is caused by the current protocol design, and some 
code defects make its effect worse. 
We have a bunch of ideas for improving the situation, but we wanted to finish 
our current construction sites before opening a new one. Hence we released a 
stable layer 3 version, and the kernel module will follow soon, before we 
touch the routing code again.

If you are willing to test a few things to reduce the effect we can start right 
away. You could set TQ_GLOBAL_WINDOW_SIZE to 1 in order to deactivate the 
averaging of the TQ values. Also, some people reported that reducing the hop 
penalty speeds up rerouting as well.
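
A rough sketch of why the window size matters, under a simplified model rather
than the actual batman-adv code: each neighbor keeps its last N TQ samples, a
vanished neighbor gets one 0 sample per originator interval, and the average
is taken over the non-zero samples only. The names TqWindow and
intervals_until_zero, the averaging rule and the example TQ of 255 are all
assumptions made for illustration.

```python
from collections import deque


class TqWindow:
    """Fixed-size window of recent TQ samples for one neighbor (illustrative)."""

    def __init__(self, size):
        self.samples = deque([0] * size, maxlen=size)

    def push(self, tq):
        # Appending to a full deque silently drops the oldest sample.
        self.samples.append(tq)

    def average(self):
        # Assumption: average over non-zero samples only, 0 if there are none.
        nonzero = [s for s in self.samples if s != 0]
        return sum(nonzero) // len(nonzero) if nonzero else 0


def intervals_until_zero(size, tq=255):
    """Count the 0 samples needed before a full window averages to 0."""
    window = TqWindow(size)
    for _ in range(size):
        window.push(tq)
    steps = 0
    while window.average() != 0:
        window.push(0)
        steps += 1
    return steps
```

In this model a window of 10 needs 10 intervals to flush, which is 5 seconds
at a 500ms orig_interval, and a window of 5 needs 2.5 seconds. That only bounds
the averaging delay; it does not by itself account for the 10 to 50 second
outages reported above.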

Beyond that we have to modify the code, which is on my todo list, but the 
result will be experimental.


> I was surprised to see that it's unstable and oscillating. The
> different tq_avg values mostly oscillate together, as shown in the
> figure, but i've also seen cases when they oscillate 180 degrees out
> of phase. In that case, the routing flips on the crossovers.

The TQ is obtained by sending broadcasts, which get lost easily. If you have 
interference / collisions / etc. these values might fluctuate a bit, although 
your values look rather unusual.

Regards,
Marek




* Re: [B.A.T.M.A.N.] originator tq_avg oscilations
From: Andrew Lunn @ 2009-06-24  8:28 UTC
  To: B.A.T.M.A.N

> If you are willing to test a few things to reduce the effect we can
> start right away. You could set TQ_GLOBAL_WINDOW_SIZE to 1 in order
> to deactivate the averaging of the TQ values. Also, some people
> reported that reducing the hop penalty also will increase the speed.

I already tried changing TQ_GLOBAL_WINDOW_SIZE to 5 instead of 10 and
that helped. Changing it to one is interesting. I've not tried it yet,
but i would have thought some ring buffer was needed to handle the 0s
which are added when originator messages are received for other
neighbors. 
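
To illustrate the concern, here is a toy model. It assumes that an originator
message received via one neighbor writes its TQ into that neighbor's buffer
and a 0 into every other neighbor's buffer, and that the average is taken over
the non-zero entries. The function names and the TQ values are made up, and
this is not the batman-adv code.

```python
from collections import deque


def nonzero_avg(samples):
    """Average of the non-zero samples, or 0 if there are none (assumption)."""
    nonzero = [s for s in samples if s != 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0


def simulate(window_size, arrivals):
    """Replay OGM arrivals; return the best neighbor after each arrival.

    arrivals is a list of (neighbor, tq) pairs for a single originator.
    """
    neighbors = sorted({n for n, _ in arrivals})
    buffers = {n: deque([0] * window_size, maxlen=window_size)
               for n in neighbors}
    best = []
    for via, tq in arrivals:
        for n in neighbors:
            # The forwarding neighbor gets the TQ, everyone else gets a 0.
            buffers[n].append(tq if n == via else 0)
        best.append(max(neighbors, key=lambda n: nonzero_avg(buffers[n])))
    return best


# Neighbor A is clearly better than B, and their OGMs arrive alternately.
arrivals = [("A", 200), ("B", 120)] * 4
flapping = simulate(1, arrivals)   # window of 1
stable = simulate(4, arrivals)     # window of 4
```

With a window of 1 the single slot of A is overwritten by a 0 every time an
OGM arrives via B, so the best neighbor flips on every message. With a window
of 4 the better neighbor stays selected throughout.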

I already tried reducing the hop penalty. However testing showed it
behaved worse. This i don't understand. So i'm guessing my test setup
changed between my different tests. So i need to run the test again,
both the control and the modified hop penalty.

What ideas do you have for improving the algorithms? Depending on the
scale of work needed i might have some time to do some coding.

     Andrew


* Re: [B.A.T.M.A.N.] originator tq_avg oscilations
From: Marek Lindner @ 2009-06-24 14:38 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Wednesday 24 June 2009 16:28:45 Andrew Lunn wrote:
> I already tried changing TQ_GLOBAL_WINDOW_SIZE to 5 instead of 10 and
> that helped. Changing it to one is interesting. I've not tried it yet,
> but i would have thought some ring buffer was needed to handle the 0s
> which are added when originator messages are received for other
> neighbors.

Oh yeah, you're right about the 0s.


> What ideas do you have for improving the algorithms? Depending on the
> scale of work needed i might have some time to do some coding.

I'd suggest we discuss this in a more interactive way, such as chatting via 
IRC. What about the batman IRC channel?

Regards,
Marek



* Re: [B.A.T.M.A.N.] originator tq_avg oscilations
From: Linus Lüssing @ 2009-06-24 21:06 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking


> What ideas do you have for improving the algorithms? Depending on the
> scale of work needed i might have some time to do some coding.
I've been sending Simon and Marek some emails lately about some
ideas for dynamic originator intervals. Basically, the idea was
not to rely on external sources such as GPS (weather stations,
newspaper articles, fortune tellers, ...) to detect how dynamic
the environment is and to adjust the OGM interval accordingly.
Instead, the changes of the TQ values over time should be
sufficient to detect changing conditions.
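
One possible reading of that idea, as a hedged sketch only: the actual
proposal is not described in this thread, so the function name
next_ogm_interval, the threshold, the interval bounds and the halve/back-off
rule below are all assumptions chosen for illustration.

```python
def next_ogm_interval(tq_history, current_ms,
                      min_ms=100, max_ms=2000, threshold=10.0):
    """Pick the next OGM interval from how much recent TQ samples changed.

    tq_history is a list of recent TQ values for one link, oldest first.
    All numeric defaults are hypothetical.
    """
    if len(tq_history) < 2:
        # Not enough data to judge, keep the current interval.
        return current_ms
    deltas = [abs(b - a) for a, b in zip(tq_history, tq_history[1:])]
    mean_change = sum(deltas) / len(deltas)
    if mean_change > threshold:
        # Link quality is moving: send OGMs twice as often.
        return max(min_ms, current_ms // 2)
    # Link quality is steady: back off by a quarter.
    return min(max_ms, current_ms + current_ms // 4)
```

A steady history such as [200, 200, 201, 200] at 500ms would back off to
625ms, while a jumpy one such as [200, 150, 220, 90] would drop to 250ms.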

If you (or others) might be interested, I could translate those
ideas to English and post them on our wiki-page. I would love to
get some more feedback and enhancements to them.

Cheers, Linus


* Re: [B.A.T.M.A.N.] originator tq_avg oscilations
From: Andrew Lunn @ 2009-06-25 19:47 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Wed, Jun 24, 2009 at 11:06:31PM +0200, Linus Lüssing wrote:
> > What ideas do you have for improving the algorithms? Depending on the
> > scale of work needed i might have some time to do some coding.

> I've been sending Simon and Marek some emails lately about some
> ideas for dynamic originator intervals. Basically, the idea was
> not to rely on external sources such as GPS (weather stations,
> newspaper articles, fortune tellers, ...) to detect how dynamic
> the environment is and to adjust the OGM interval accordingly.
> Instead, the changes of the TQ values over time should be
> sufficient to detect changing conditions.

That probably does not help my situation when a node goes away. I'm
guessing a node in a vehicle will vanish from the mesh very quickly.
I don't currently know in our situation if it is normal to perform a
shutdown, or if it just drives off.

It is not a general solution, but i did wonder about adding support
for signalling that a node is shutting down. It could, say, broadcast a
BAT_BYE message, 5 times at 20ms intervals. Neighbors who receive such
a message would then take the originator straight out of their tables,
picking another neighbor for the next hop if possible.
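
A minimal sketch of both sides of that idea, to make it concrete. BAT_BYE does
not exist in the protocol; the message tuple, the routes layout and the
function names here are hypothetical, and the send callable stands in for
whatever would actually put a broadcast on the air.

```python
import time

BYE_REPEATS = 5          # repeat count from the proposal
BYE_INTERVAL_S = 0.020   # 20ms between repeats


def announce_shutdown(send, node_id, sleep=time.sleep):
    """Broadcast a goodbye several times in case some copies are lost."""
    for i in range(BYE_REPEATS):
        send(("BAT_BYE", node_id))
        if i < BYE_REPEATS - 1:
            sleep(BYE_INTERVAL_S)


def handle_bye(routes, departed):
    """Drop `departed` as an originator and as a candidate next hop.

    routes maps originator -> {neighbor: tq} and is modified in place.
    Returns originator -> best remaining next hop, or None if no
    neighbor is left for that originator.
    """
    routes.pop(departed, None)
    best = {}
    for orig, candidates in routes.items():
        candidates.pop(departed, None)
        best[orig] = max(candidates, key=candidates.get) if candidates else None
    return best


# Example: car3 leaves; car4 stays reachable via car2, car5 does not.
sent = []
announce_shutdown(sent.append, "car3", sleep=lambda s: None)
routes = {
    "car3": {"car3": 250},
    "car4": {"car3": 240, "car2": 180},
    "car5": {"car3": 200},
}
best = handle_bye(routes, "car3")
```

After handle_bye the table no longer mentions car3 at all, so traffic for
car4 moves to car2 immediately instead of waiting for the TQ average to decay.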

Getting the ordering right during shutdown could be interesting. We
need the bat0 interface to be put down before other interfaces it
depends on. So long as the init.d files have the correct priority this
should be possible. 

> If you (or others) might be interested, I could translate those
> ideas to English and post them on our wiki-page. I would love to
> get some more feedback and enhancements to them.

I would be interested. By the way, what language are they currently
in? I read German if that is any help.

    Andrew
