Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios

* Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios
@ 2025-12-16 15:37 Valent Turkovic
  2025-12-20 22:43 ` Valent@MeshPoint
  0 siblings, 1 reply; 5+ messages in thread
From: Valent Turkovic @ 2025-12-16 15:37 UTC (permalink / raw)
  To: b.a.t.m.a.n

Hi everyone,

My name is Valent Turkovic.

Between 2015 and 2018 I ran the MeshPoint project – a simple, rugged Wi-Fi hotspot designed to work in very tough conditions.

During the refugee crisis in Croatia we deployed these devices in camps and transit centers, providing internet connectivity for humanitarian organizations such as the Red Cross, UNICEF, IOM, Greenpeace, and many smaller NGOs. Through these deployments, more than 500,000 people were able to stay connected. The same system was also used in flood response and other emergency situations. The project received the “Best Humanitarian Tech of the Year” award at The Europas in 2016.

Unfortunately, financial constraints forced me to pause the project after 2018. It was entirely self-funded, and the prolonged stress eventually led to long-term health issues.

Over the years I stayed in contact with first responders and field teams from organizations such as WFP, UNICEF, the Red Cross, and various NGOs. The feedback has remained consistent: when disasters strike, whether earthquakes, floods, or large-scale displacement, teams still struggle to bring up reliable communications quickly. What they need most is a mesh network that works within minutes, not hours or days, and that continues operating on battery power when infrastructure is down.

I am fully aware that in active conflict zones Wi-Fi can be jammed or restricted, for example due to drone countermeasures. However, there are many other scenarios where Wi-Fi mesh remains extremely valuable: evacuation centers, field hospitals, temporary shelters, flood-affected villages, and coordination points for responders. In these environments, fast, robust, and easy-to-deploy networking makes a very real difference for coordination, family contact, and medical or logistical data sharing.

Because of this, I am now restarting the project as MeshPoint V2. The focus is updated hardware, improved battery life, and even simpler deployment, while keeping the original goal: crisis response and off-grid or underserved communities.

In the original MeshPoint we used Babel. This was largely driven by practical constraints at the time: our deployment tooling was based on Nodewatcher, which was Babel-only. Technically, Babel served us very well. It converged fast, was reliable, and worked nicely for small to medium-sized networks.

At the same time, I am well aware that many community networks and real-world mesh deployments successfully used batman-adv, often through Gluon or custom firmware builds. In larger, more dynamic, or highly mobile topologies typical for crisis scenarios, the layer-2 approach and seamless mobility properties of batman-adv are very attractive, especially when nodes are frequently moved, powered on and off, or replaced in the field.

For MeshPoint V2 I am evaluating batman-adv and would appreciate insights on the following aspects, specifically in the context of crisis and emergency deployments:

Behaviour at larger scale in real deployments
In crisis scenarios networks often start small but can grow quickly as more nodes are deployed by different teams or organizations. We are interested in how batman-adv behaves when scaling to hundreds or more nodes in non-ideal, real-world conditions, without centralized planning and with limited ability for on-site tuning.

Performance in sparse or highly mobile topologies
Nodes in the field are frequently moved, turned off, replaced, or temporarily isolated. Vehicles, backpacks, and mobile command posts constantly change network topology. We are looking for practical experience with how well batman-adv handles frequent topology changes, intermittent links, and sparse node placement without requiring constant manual intervention.

Suitability for battery-powered and intermittently connected nodes
Many nodes run on battery for long periods and may sleep, reboot, or disappear entirely when power is lost. Low overhead, predictable behaviour, and fast recovery after reconnect are essential. We are particularly interested in any known trade-offs between routing performance, control traffic, and power consumption in such environments.

If there is existing work, documented limitations, field experience, or design guidance relevant to these constraints, pointers would be greatly appreciated. The goal is to build a system that field teams can deploy and rely on under stress, without requiring deep networking expertise on site.

Thank you for your time, and thank you to everyone who has contributed to making mesh networking viable outside of labs and into real-world, high-stakes situations.

Best regards,
Valent Turkovic
https://www.meshpointone.com/

Technical specifications of the original MeshPoint (for reference):
https://www.meshpointone.com/technical-specifications/

* Re: Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios
  2025-12-16 15:37 Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios Valent Turkovic
@ 2025-12-20 22:43 ` Valent@MeshPoint
  2026-01-05  9:07   ` Simon Wunderlich
  0 siblings, 1 reply; 5+ messages in thread
From: Valent@MeshPoint @ 2025-12-20 22:43 UTC (permalink / raw)
  To: b.a.t.m.a.n

Hello,

I wanted to follow up on my previous message. I did not see any replies, 
so I hope it is ok to share one concrete finding from recent testing in 
case it helps the discussion.

To move beyond purely theoretical arguments, I have been running 
large-scale tests using meshnet-lab:
https://github.com/mwarning/meshnet-lab

The main reason for choosing it is that it allows replaying real-world 
community network topologies, including Freifunk graphs, instead of 
relying on synthetic grids or ideal meshes. This makes it easier to 
observe behaviour under sparse, asymmetric, and imperfect conditions 
that are closer to what actually gets deployed.

One interesting observation so far is that results can vary 
significantly depending on how nodes are brought up and how 
control-plane load interacts with the topology. In other words, the same 
protocol on the same topology can behave very differently depending on 
timing, churn, and scale effects, even when the underlying links are 
identical. This was not obvious to me before testing at this scale.

I am curious whether others here have used meshnet-lab or similar 
namespace-based emulation tools for batman-adv testing, and if so, 
whether your observations matched real deployments closely, or whether 
there are known caveats when interpreting the results.

Any feedback or pointers would be appreciated.

Best regards,
Valent


* Re: Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios
  2025-12-20 22:43 ` Valent@MeshPoint
@ 2026-01-05  9:07   ` Simon Wunderlich
  2026-01-05 12:12     ` Re[2]: " Valent@MeshPoint
  0 siblings, 1 reply; 5+ messages in thread
From: Simon Wunderlich @ 2026-01-05  9:07 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: Valent@MeshPoint

Hi Valent,

thank you for your interest and sorry for the late reply. The time before 
Christmas is usually a bit hectic ...

I would suggest looking into the Freifunk firmware "gluon" [1], including the 
batman-adv parameters chosen there. There are setups with a couple of hundred 
nodes, some of them sparsely connected across cities. Those setups have been 
used and tested for a long time on different types of hardware.

A few general suggestions for tuning for those scenarios are:

* Set a high multicast rate, at least 12 Mbit/s, perhaps 24 or more. You 
trade range for scalability.

* Choose a higher-than-default OGM interval, e.g. 5 seconds instead of 1 
second. This makes batman-adv react more slowly, but helps scaling with 
many nodes. Each node repeats every other node's OGM messages, which results 
in O(N^2) OGM messages per interval.

* If you don't need encryption (SAE), turn it off. SAE by default does a 
peer-to-peer handshake, which can kill a dense network with many participants 
in one place if everyone wants to handshake with everyone else at the same 
time.

There are a few more things (e.g. reducing basic rates) which you may find in 
the gluon firmware and other places. 
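
As a rough illustration of the O(N^2) point above, the following Python sketch
estimates OGM transmissions per second under a deliberately simplified
flooding model: every node originates one OGM per interval and every other
node rebroadcasts it exactly once. The function name and the model are
assumptions for illustration only. Real batman-adv aggregates OGMs and drops
many rebroadcasts, so treat the numbers as a pessimistic upper bound, not as a
measurement.

```python
def ogm_transmissions_per_second(num_nodes: int, ogm_interval_s: float) -> float:
    """Pessimistic estimate of mesh-wide OGM transmissions per second.

    Assumed model (not batman-adv's real forwarding rules): each node
    originates one OGM per interval, and each of the other nodes
    rebroadcasts that OGM exactly once.
    """
    if num_nodes < 0:
        raise ValueError("num_nodes must not be negative")
    if ogm_interval_s <= 0:
        raise ValueError("ogm_interval_s must be positive")
    originated = num_nodes                      # one OGM per node per interval
    rebroadcast = num_nodes * (num_nodes - 1)   # every other node repeats it once
    return (originated + rebroadcast) / ogm_interval_s


if __name__ == "__main__":
    # Compare the default 1 s interval with the suggested 5 s interval.
    for nodes in (20, 100, 300):
        for interval_s in (1.0, 5.0):
            rate = ogm_transmissions_per_second(nodes, interval_s)
            print(f"{nodes:4d} nodes, {interval_s:.0f} s interval: {rate:8.0f} OGM tx/s")
```

Under this model, moving from a 1 second to a 5 second interval cuts the
control traffic by a factor of five at any network size, while tripling the
node count multiplies it by nine.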

As you can see, some of those suggestions are more Wi-Fi layer specific than 
batman-adv specific, and would help with other protocols (e.g. Babel) as well. 
From my experience with network simulators/emulators, you can verify 
protocol-specific behavior (e.g. number of messages, failover time) to some 
extent. But testing Wi-Fi-specific scaling effects like failing SAE 
handshakes, effects of multicast rates, etc. is rather hard, even if you use 
emulators based on mac80211_hwsim or similar, which partially emulate 802.11. 
For those experiments, it's best to actually set up 10-20 devices ...

Cheers,
        Simon

[1] https://gluon.readthedocs.io/en/latest/

* Re[2]: Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios
  2026-01-05  9:07   ` Simon Wunderlich
@ 2026-01-05 12:12     ` Valent@MeshPoint
  2026-01-05 13:13       ` Simon Wunderlich
  0 siblings, 1 reply; 5+ messages in thread
From: Valent@MeshPoint @ 2026-01-05 12:12 UTC (permalink / raw)
  To: Simon Wunderlich, b.a.t.m.a.n

Hi Simon,

Thank you very much for the detailed reply and the practical 
suggestions. The timing is no problem at all – I completely understand 
how busy things get around the holidays.

I should mention some background: I was an active member of wlan 
slovenija and the Otvorena mreža (Open Network) project in Croatia. We 
are now restarting nodewatcher – our system for node monitoring and 
firmware generation for community networks in Croatia and Slovenia.

Our approach is a bit different from Gluon. Instead of a unified 
firmware image, nodewatcher generates custom firmware per node with all 
parameters pre-configured: subnets, channel assignments, interface 
roles, etc. This lets us handle complexity on the backend so that end 
users just flash the image and everything works – no wizard, no 
configuration choices that might confuse home users. We will certainly 
look at Gluon's technical choices for batman-adv tuning, but we prefer 
this 'keep it simple' deployment model.

Some history: when building the mesh networks in Slovenia and Croatia, 
we started with OLSR. It worked well initially, but once we crossed ~300 
nodes we hit serious scaling limits. Around that time (6-7 years ago) we 
were aware that Freifunk communities in Germany were also experiencing 
scaling issues with batman-adv, so we chose to migrate to Babel instead. 
Babel served us well and we never looked back.

Now, as we restart MeshPoint and consider protocol options again, I am 
genuinely curious:

1. How has batman-adv addressed the scaling problem over the past 7 
years? Since it operates at L2, there is inherently more broadcast 
traffic. Do larger Freifunk networks segment into smaller batman-adv 
domains connected via something else, or has the protocol itself 
improved to handle hundreds of nodes in a single domain?

2. Are there established patterns for combining batman-adv with overlay 
networks or L3 routing? For example, batman-adv for local mesh segments 
with BGP or Babel connecting segments at gateways?

3. For mobile crisis deployments where topology changes constantly, is 
pure batman-adv still recommended, or do experienced operators use 
hybrid approaches?
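
To make the trade-off behind question 1 concrete, here is a small Python 
sketch of how splitting one domain into several would change OGM load. It 
assumes the O(N^2) per-interval estimate quoted earlier in this thread and 
ignores any traffic between domains. The function names and the even split 
are hypothetical, and the sketch does not describe how Freifunk networks are 
actually organised.

```python
def ogm_tx_per_interval(num_nodes: int) -> int:
    """OGM transmissions per interval in one domain, assuming O(N^2) flooding."""
    return num_nodes * num_nodes


def split_evenly(total_nodes: int, num_domains: int) -> list[int]:
    """Distribute nodes over domains as evenly as possible."""
    if total_nodes < 0 or num_domains <= 0:
        raise ValueError("need total_nodes >= 0 and num_domains > 0")
    base, extra = divmod(total_nodes, num_domains)
    # The first `extra` domains get one additional node each.
    return [base + 1 if i < extra else base for i in range(num_domains)]


def segmented_ogm_tx_per_interval(total_nodes: int, num_domains: int) -> int:
    """Sum of per-domain OGM load; inter-domain routing traffic is ignored."""
    return sum(ogm_tx_per_interval(n) for n in split_evenly(total_nodes, num_domains))


if __name__ == "__main__":
    for domains in (1, 3, 6):
        load = segmented_ogm_tx_per_interval(300, domains)
        print(f"300 nodes in {domains} domain(s): {load} OGM tx per interval")
```

With an even split, the total under this model shrinks roughly in proportion 
to the number of domains, which is why segmentation is an obvious candidate 
to ask about.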

These are deeper architectural questions – I understand if the answers 
are 'it depends' or require longer discussion. Any pointers to 
documentation, mailing list threads, or real-world deployment writeups 
would be very helpful.

Thank you again for your time and the work you and the team have put 
into batman-adv over the years.

Best regards,
Valent


------ Original Message ------
>From "Simon Wunderlich" <sw@simonwunderlich.de>
To b.a.t.m.a.n@lists.open-mesh.org
Cc "Valent@MeshPoint" <valent@meshpointone.com>
Date 5.1.2026. 10:07:44
Subject Re: Restarting MeshPoint – seeking advice on routing for 
crisis/disaster scenarios

>Hi Valent,
>
>thank you for your interest and sorry for the late reply. The time before
>Christmas is usually a bit hectic ...
>
>I would suggest to look into the "gluon" Freifunk Firmware [1], including the
>batman-adv parameters made there. There are setups with a couple of hundred
>nodes, although some sparsely connected over cities. Those setups have been
>used and tested for a long time on different types of hardware.
>
>A few general suggestions for tuning for those scenarios are:
>
>* set up a high multicast rate, at least 12 MBit/s, perhaps 24 or more. You
>will trade scalability with range
>
>* choose a higher than default OGM interval, e.g. 5 seconds instead of 1
>second. This makes batman-adv reaction time slower, but helps scaling with
>many nodes. Each node would repeat any other nodes OGM messages, which results
>in O(N^2) OGM messages per interval.
>
>* if you don't need encryption (SAE), turn it off. SAE by default does a peer-
>to-peer handshake, which can kill a dense network with many participants in
>one place, if everyone wants to handshake with everyone else at the same time.
>
>There are a few more things (e.g. reducing basic rates) which you may find in
>the gluon firmware and other places.
>
>As you can see, some of those suggestions are more Wi-Fi layer specific than
>batman-adv specific, and would help with other protocols (e.g. babel) as well.
>From my experience with network simulators/emulators, you may verify protocol
>specific behavior (e.g. number of messages, failover time) to some extent. But
>testing Wi-Fi specific scaling effects like failing SAE handshakes, effects of
>multicast rates, etc is rather hard - even if you use emulators based on
>mac80211_hwsim or so which partially emulate 802.11. For those experiments,
>it's best to actually set up 10-20 devices ...
>
>Cheers,
>         Simon
>
>[1] https://gluon.readthedocs.io/en/latest/
>
>On Saturday, December 20, 2025 11:43:20 PM Central European Standard Time
>Valent@MeshPoint wrote:
>>  Hello,
>>
>>  I wanted to follow up on my previous message. I did not see any replies,
>>  so I hope it is ok to share one concrete finding from recent testing in
>>  case it helps the discussion.
>>
>>  To move beyond purely theoretical arguments, I have been running large
>>  scale tests using meshnet lab
>>  https://github.com/mwarning/meshnet-lab
>>
>>  The main reason for choosing it is that it allows replaying real world
>>  community network topologies, including Freifunk graphs, instead of
>>  relying on synthetic grids or ideal meshes. This makes it easier to
>>  observe behaviour under sparse, asymmetric, and imperfect conditions
>>  that are closer to what actually gets deployed.
>>
>>  One interesting observation so far is that results can vary
>>  significantly depending on how nodes are brought up and how control
>>  plane load interacts with the topology. In other words, the same
>>  protocol on the same topology can behave very differently depending on
>>  timing, churn, and scale effects, even when the underlying links are
>>  identical. This was not obvious to me before testing at this scale.
>>
>>  I am curious whether others here have used meshnet lab or similar
>>  namespace based emulation tools for BATMAN adv testing, and if so,
>>  whether your observations matched real deployments closely, or if there
>>  are known caveats when interpreting the results.
>>
>>  Any feedback or pointers would be appreciated.
>>
>>  Best regards,
>>  Valent
>>
>>
>>  ------ Original Message ------
>>
>>  >From "Valent Turkovic" <valent@meshpointone.com>
>>
>>  To b.a.t.m.a.n@lists.open-mesh.org
>>  Date 16.12.2025. 16:37:01
>>  Subject Restarting MeshPoint – seeking advice on routing for
>>  crisis/disaster scenarios
>>
>>  >Hi everyone,
>>  >
>>  >My name is Valent Turkovic.
>>  >
>>  >Between 2015 and 2018 I ran the MeshPoint project – a simple, rugged Wi-Fi
>>  >hotspot designed to work in very tough conditions.
>>  >
>>  >During the refugee crisis in Croatia we deployed these devices in camps and
>>  >transit centers, providing internet connectivity for humanitarian
>>  >organizations such as the Red Cross, UNICEF, IOM, Greenpeace, and many
>>  >smaller NGOs. Through these deployments, more than 500,000 people were
>>  >able to stay connected. The same system was also used in flood response
>>  >and other emergency situations. The project received the “Best
>>  >Humanitarian Tech of the Year” award at The Europas in 2016.
>>  >
>>  >Unfortunately, financial constraints forced me to pause the project after
>>  >2018. It was entirely self-funded, and the prolonged stress eventually led
>>  >to long-term health issues.
>>  >
>>  >Over the years I stayed in contact with first responders and field teams
>>  >from organizations such as WFP, UNICEF, the Red Cross, and various NGOs.
>>  >The feedback has remained consistent: when disasters strike, whether
>>  >earthquakes, floods, or large-scale displacement, teams still struggle to
>>  >bring up reliable communications quickly. What they need most is a mesh
>>  >network that works within minutes, not hours or days, and that continues
>>  >operating on battery power when infrastructure is down.
>>  >
>>  >I am fully aware that in active conflict zones Wi-Fi can be jammed or
>>  >restricted, for example due to drone countermeasures. However, there are
>>  >many other scenarios where Wi-Fi mesh remains extremely valuable:
>>  >evacuation centers, field hospitals, temporary shelters, flood-affected
>>  >villages, and coordination points for responders. In these environments,
>>  >fast, robust, and easy-to-deploy networking makes a very real difference
>>  >for coordination, family contact, and medical or logistical data sharing.
>>  >
>>  >Because of this, I am now restarting the project as MeshPoint V2. The focus
>>  >is updated hardware, improved battery life, and even simpler deployment,
>>  >while keeping the original goal: crisis response and off-grid or
>>  >underserved communities.
>>  >
>>  >In the original MeshPoint we used Babel. This was largely driven by
>>  >practical constraints at the time: our deployment tooling was based on
>>  >Nodewatcher, which was Babel-only. Technically, Babel served us very well.
>>  >It converged fast, was reliable, and worked nicely for small to
>>  >medium-sized networks.
>>  >
>>  >At the same time, I am well aware that many community networks and
>>  >real-world mesh deployments successfully used batman-adv, often through
>>  >Gluon or custom firmware builds. In larger, more dynamic, or highly mobile
>>  >topologies typical for crisis scenarios, the layer-2 approach and seamless
>>  >mobility properties of batman-adv are very attractive, especially when
>>  >nodes are frequently moved, powered on and off, or replaced in the field.
>>  >
>>  >For MeshPoint V2 I am evaluating batman-adv and would appreciate insights
>>  >on the following aspects, specifically in the context of crisis and
>>  >emergency deployments:
>>  >
>>  >Behaviour at larger scale in real deployments
>>  >In crisis scenarios networks often start small but can grow quickly as more
>>  >nodes are deployed by different teams or organizations. We are interested
>>  >in how batman-adv behaves when scaling to hundreds or more nodes in
>>  >non-ideal, real-world conditions, without centralized planning and with
>>  >limited ability for on-site tuning.
>>  >
>>  >Performance in sparse or highly mobile topologies
>>  >Nodes in the field are frequently moved, turned off, replaced, or
>>  >temporarily isolated. Vehicles, backpacks, and mobile command posts
>>  >constantly change network topology. We are looking for practical
>>  >experience with how well batman-adv handles frequent topology changes,
>>  >intermittent links, and sparse node placement without requiring constant
>>  >manual intervention.
>>  >
>>  >Suitability for battery-powered and intermittently connected nodes
>>  >Many nodes run on battery for long periods and may sleep, reboot, or
>>  >disappear entirely when power is lost. Low overhead, predictable
>>  >behaviour, and fast recovery after reconnect are essential. We are
>>  >particularly interested in any known trade-offs between routing
>>  >performance, control traffic, and power consumption in such environments.
>>  >
>>  >If there is existing work, documented limitations, field experience, or
>>  >design guidance relevant to these constraints, pointers would be greatly
>>  >appreciated. The goal is to build a system that field teams can deploy and
>>  >rely on under stress, without requiring deep networking expertise on site.
>>  >
>>  >Thank you for your time, and thank you to everyone who has contributed to
>>  >making mesh networking viable outside of labs and into real-world,
>>  >high-stakes situations.
>>  >
>>  >Best regards,
>>  >Valent Turkovic
>>  >https://www.meshpointone.com/
>>  >
>>  >Technical specifications of the original MeshPoint (for reference):
>>  >https://www.meshpointone.com/technical-specifications/
>
>
>
>


* Re: Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios
  2026-01-05 12:12     ` Re[2]: " Valent@MeshPoint
@ 2026-01-05 13:13       ` Simon Wunderlich
  0 siblings, 0 replies; 5+ messages in thread
From: Simon Wunderlich @ 2026-01-05 13:13 UTC (permalink / raw)
  To: b.a.t.m.a.n, Valent@MeshPoint

On Monday, January 5, 2026 1:12:43 PM Central European Standard Time 
Valent@MeshPoint wrote:
> Hi Simon,
> 
> Thank you very much for the detailed reply and the practical
> suggestions. The timing is no problem at all – I completely understand
> how busy things get around the holidays.
> 
> I should mention some background: I was an active member of wlan
> slovenija and the Otvorena mreža (Open Network) project in Croatia. We
> are now restarting nodewatcher – our system for node monitoring and
> firmware generation for community networks in Croatia and Slovenia.

Ah nice! I remember the name nodewatcher from WLAN Slovenija, although I must
admit I haven't used nodewatcher so far. I think Mitar and others were talking
about it in the past ... Since Battlemesh in Croatia is coming up in 2026,
now is a good time to revive it. :) We might have met at Battlemesh v8 in
Maribor, but I must admit, my memory is a bit blurry ...

> 
> Our approach is a bit different from Gluon. Instead of a unified
> firmware image, nodewatcher generates custom firmware per node with all
> parameters pre-configured: subnets, channel assignments, interface
> roles, etc. This lets us handle complexity on the backend so that end
> users just flash the image and everything works – no wizard, no
> configuration choices that might confuse home users. We will certainly
> look at Gluon's technical choices for batman-adv tuning, but we prefer
> this 'keep it simple' deployment model.

OK, that's interesting. Gluon is also made for simplicity: you can get away
with just flashing the firmware and no further configuration. Typically, though,
you may want to enter the setup mode to set the coordinates and a name for your
AP. IPs/subnets are automatically assigned from a centralized VPN server.
Either way, keeping it simple is definitely a good idea.

> 
> Some history: when building the mesh networks in Slovenia and Croatia,
> we started with OLSR. It worked well initially, but once we crossed ~300
> nodes we hit serious scaling limits. Around that time (6-7 years ago) we
> were aware that Freifunk communities in Germany were also experiencing
> scaling issues with batman-adv, so we chose to migrate to Babel instead.
> Babel served us well and we never looked back.
> 
> Now, as we restart MeshPoint and consider protocol options again, I am
> genuinely curious:
> 
> 1. How has batman-adv addressed the scaling problem over the past 7
> years? Since it operates at L2, there is inherently more broadcast
> traffic. Do larger Freifunk networks segment into smaller batman-adv
> domains connected via something else, or has the protocol itself
> improved to handle hundreds of nodes in a single domain?

There are a few mechanisms in batman-adv, such as DAT (distributed ARP tables)
and multicast extensions, to keep traffic at an acceptable level. There are also
some changes to send broadcasts only once on VPN links, or to omit
rebroadcasts entirely on those links.

However, the general scaling issues still apply. Therefore, Gluon applies a
few parameters as outlined in my last mail. Hundreds of nodes are possible
(we've operated ~300 in our local Freifunk community), but with those numbers
there is quite a lot of "background" noise. Since we had a few slow DSL links
towards our VPN servers, we saw a bottleneck there already ...

In our Freifunk community, we segment into smaller domains (per city/town),
and I think others do the same - we are targeting up to ~100-150 nodes per
segment.
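
A rough back-of-the-envelope sketch of why segmenting and a longer OGM
interval help. It assumes the worst case described in the earlier mail: every
node originates one OGM per interval and every node rebroadcasts each other
node's OGM once, so about N^2 transmissions per interval. The function name
and the example sizes are made up for illustration; real networks are sparser
and batman-adv aggregates OGMs, so treat the numbers as an upper bound only.

```python
def ogm_transmissions_per_second(nodes: int, ogm_interval_s: float) -> float:
    """Worst-case OGM transmissions per second in a single mesh domain.

    Assumption (not measured): each of the `nodes` nodes sends its own OGM
    and rebroadcasts every other node's OGM once per interval, i.e. roughly
    nodes * nodes transmissions per interval.
    """
    if nodes < 0:
        raise ValueError("nodes must not be negative")
    if ogm_interval_s <= 0:
        raise ValueError("ogm_interval_s must be positive")
    return nodes * nodes / ogm_interval_s


if __name__ == "__main__":
    # One flat domain of 300 nodes, default 1 s interval vs. 5 s interval.
    print(ogm_transmissions_per_second(300, 1.0))  # 90000.0
    print(ogm_transmissions_per_second(300, 5.0))  # 18000.0
    # The same 300 nodes split into three segments of 100 nodes each:
    # every segment only carries its own OGM load.
    print(ogm_transmissions_per_second(100, 5.0))  # 2000.0
```

In this simplified model, splitting 300 nodes into three segments of 100 cuts
the per-segment OGM load by a factor of nine, independent of the interval.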

Gluon also applies various firewall rules to keep service discovery traffic
(e.g. Avahi, multicast DNS) off the mesh network, since these protocols are
often way too chatty with a few hundred or even thousands of users.
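
To make the filtering idea concrete, here is a minimal sketch of the matching
logic only. It is not Gluon's actual rule set (which lives in firewall rules,
not Python); the function name is hypothetical. It relies on the standard
mDNS parameters: UDP port 5353 and the multicast groups 224.0.0.251 (IPv4)
and ff02::fb (IPv6).

```python
import ipaddress

MDNS_PORT = 5353
MDNS_GROUPS = {
    ipaddress.ip_address("224.0.0.251"),  # IPv4 mDNS multicast group
    ipaddress.ip_address("ff02::fb"),     # IPv6 link-local mDNS group
}


def is_mdns(dst_ip: str, dst_port: int, protocol: str = "udp") -> bool:
    """Return True if a packet looks like multicast DNS traffic.

    Hypothetical helper: a mesh-facing filter would drop packets for which
    this returns True instead of flooding them through the whole domain.
    """
    return (
        protocol.lower() == "udp"
        and dst_port == MDNS_PORT
        and ipaddress.ip_address(dst_ip) in MDNS_GROUPS
    )


if __name__ == "__main__":
    print(is_mdns("224.0.0.251", 5353))  # True
    print(is_mdns("ff02::fb", 5353))     # True
    print(is_mdns("224.0.0.251", 53))    # False
```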

> 2. Are there established patterns for combining batman-adv with overlay
> networks or L3 routing? For example, batman-adv for local mesh segments
> with BGP or Babel connecting segments at gateways?

We use a couple of VPN servers which run DHCP with a different subnet per
segment (e.g. per town/city). Subnets are connected with each other using
layer 3 routing on those VPN servers. We use bird for the IP routing part,
and GRE tunnels to connect to our upstream Internet.
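
As an illustration of the "one subnet per segment" idea, the sketch below
carves equally sized subnets out of one supernet and assigns them to
segments. The supernet 10.0.0.0/16, the /20 size, and the segment names are
placeholders chosen for the example, not the values used in our network.

```python
import ipaddress


def allocate_segment_subnets(supernet: str, segments: list[str], new_prefix: int):
    """Assign one subnet of size `new_prefix` to each segment name, in order.

    Raises ValueError if the supernet cannot hold that many subnets.
    """
    candidates = ipaddress.ip_network(supernet).subnets(new_prefix=new_prefix)
    plan = {}
    for name in segments:
        try:
            plan[name] = next(candidates)
        except StopIteration:
            raise ValueError(f"{supernet} has no free /{new_prefix} for {name}") from None
    return plan


if __name__ == "__main__":
    # Hypothetical segments; each would get its own DHCP range on the VPN
    # servers, with layer 3 routing between the subnets.
    plan = allocate_segment_subnets("10.0.0.0/16", ["town-a", "town-b", "town-c"], 20)
    for name, subnet in plan.items():
        print(name, subnet, subnet.num_addresses)
```

A /20 per segment leaves room for a few thousand client addresses, which is
comfortable for segments of ~100-150 nodes; pick the size to match the
expected number of clients.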

> 
> 3. For mobile crisis deployments where topology changes constantly, is
> pure batman-adv still recommended, or do experienced operators use
> hybrid approaches?

That I can't answer, maybe someone else has experience. :) Setting up the 
layer 3 network requires quite a bit of engineering I would say.

> 
> These are deeper architectural questions – I understand if the answers
> are 'it depends' or require longer discussion. Any pointers to
> documentation, mailing list threads, or real-world deployment writeups
> would be very helpful.
> 
> Thank you again for your time and the work you and the team have put
> into batman-adv over the years.

These answers were mostly about Gluon and our local Freifunk network
(Freifunk Vogtland [1]). Since this is open source, you can easily review and
perhaps adopt some parts for your case. Other Gluon and Freifunk networks may
operate differently. Perhaps there are other mailing list readers who want to
chime in with other projects. :)

Cheers,
       Simon

[1] https://github.com/FreifunkVogtland

> 
> Best regards,
> Valent
> 
> 
> ------ Original Message ------
> From "Simon Wunderlich" <sw@simonwunderlich.de>
> To b.a.t.m.a.n@lists.open-mesh.org
> Cc "Valent@MeshPoint" <valent@meshpointone.com>
> Date 5.1.2026. 10:07:44
> Subject Re: Restarting MeshPoint – seeking advice on routing for
> crisis/disaster scenarios
> 
> >Hi Valent,
> >
> >thank you for your interest and sorry for the late reply. The time before
> >Christmas is usually a bit hectic ...
> >
> >I would suggest to look into the "gluon" Freifunk Firmware [1], including
> >the batman-adv parameters made there. There are setups with a couple of
> >hundred nodes, although some sparsely connected over cities. Those setups
> >have been used and tested for a long time on different types of hardware.
> >
> >A few general suggestions for tuning for those scenarios are:
> >
> >* set up a high multicast rate, at least 12 MBit/s, perhaps 24 or more. You
> >will trade scalability with range
> >
> >* choose a higher than default OGM interval, e.g. 5 seconds instead of 1
> >second. This makes batman-adv reaction time slower, but helps scaling with
> >many nodes. Each node would repeat any other nodes OGM messages, which
> >results in O(N^2) OGM messages per interval.
> >
> >* if you don't need encryption (SAE), turn it off. SAE by default does a
> >peer- to-peer handshake, which can kill a dense network with many
> >participants in one place, if everyone wants to handshake with everyone
> >else at the same time.
> >
> >There are a few more things (e.g. reducing basic rates) which you may find
> >in the gluon firmware and other places.
> >
> >As you can see, some of those suggestions are more Wi-Fi layer specific
> >than batman-adv specific, and would help with other protocols (e.g. babel)
> >as well. From my experience with network simulators/emulators, you may
> >verify protocol specific behavior (e.g. number of messages, failover time)
> >to some extent. But testing Wi-Fi specific scaling effects like failing
> >SAE handshakes, effects of multicast rates, etc is rather hard - even if
> >you use emulators based on mac80211_hwsim or so which partially emulate
> >802.11. For those experiments, it's best to actually set up 10-20 devices
> >...
> >
> >Cheers,
> >
> >         Simon
> >
> >[1] https://gluon.readthedocs.io/en/latest/
> >
> >On Saturday, December 20, 2025 11:43:20 PM Central European Standard Time
> >
> >Valent@MeshPoint wrote:
> >>  Hello,
> >>  
> >>  I wanted to follow up on my previous message. I did not see any replies,
> >>  so I hope it is ok to share one concrete finding from recent testing in
> >>  case it helps the discussion.
> >>  
> >>  To move beyond purely theoretical arguments, I have been running large
> >>  scale tests using meshnet lab
> >>  https://github.com/mwarning/meshnet-lab
> >>  
> >>  The main reason for choosing it is that it allows replaying real world
> >>  community network topologies, including Freifunk graphs, instead of
> >>  relying on synthetic grids or ideal meshes. This makes it easier to
> >>  observe behaviour under sparse, asymmetric, and imperfect conditions
> >>  that are closer to what actually gets deployed.
> >>  
> >>  One interesting observation so far is that results can vary
> >>  significantly depending on how nodes are brought up and how control
> >>  plane load interacts with the topology. In other words, the same
> >>  protocol on the same topology can behave very differently depending on
> >>  timing, churn, and scale effects, even when the underlying links are
> >>  identical. This was not obvious to me before testing at this scale.
> >>  
> >>  I am curious whether others here have used meshnet lab or similar
> >>  namespace based emulation tools for BATMAN adv testing, and if so,
> >>  whether your observations matched real deployments closely, or if there
> >>  are known caveats when interpreting the results.
> >>  
> >>  Any feedback or pointers would be appreciated.
> >>  
> >>  Best regards,
> >>  Valent
> >>  
> >>  
> >>  ------ Original Message ------
> >>  
> >>  >From "Valent Turkovic" <valent@meshpointone.com>
> >>  
> >>  To b.a.t.m.a.n@lists.open-mesh.org
> >>  Date 16.12.2025. 16:37:01
> >>  Subject Restarting MeshPoint – seeking advice on routing for
> >>  crisis/disaster scenarios
> >>  
> >>  >Hi everyone,
> >>  >
> >>  >My name is Valent Turkovic.
> >>  >
> >>  >Between 2015 and 2018 I ran the MeshPoint project – a simple, rugged
> >>  >Wi-Fi
> >>  >hotspot designed to work in very tough conditions.
> >>  >
> >>  >During the refugee crisis in Croatia we deployed these devices in camps
> >>  >and
> >>  >transit centers, providing internet connectivity for humanitarian
> >>  >organizations such as the Red Cross, UNICEF, IOM, Greenpeace, and many
> >>  >smaller NGOs. Through these deployments, more than 500,000 people were
> >>  >able to stay connected. The same system was also used in flood response
> >>  >and other emergency situations. The project received the “Best
> >>  >Humanitarian Tech of the Year” award at The Europas in 2016.
> >>  >
> >>  >Unfortunately, financial constraints forced me to pause the project
> >>  >after
> >>  >2018. It was entirely self-funded, and the prolonged stress eventually
> >>  >led
> >>  >to long-term health issues.
> >>  >
> >>  >Over the years I stayed in contact with first responders and field
> >>  >teams
> >>  >from organizations such as WFP, UNICEF, the Red Cross, and various
> >>  >NGOs.
> >>  >The feedback has remained consistent: when disasters strike, whether
> >>  >earthquakes, floods, or large-scale displacement, teams still struggle
> >>  >to
> >>  >bring up reliable communications quickly. What they need most is a mesh
> >>  >network that works within minutes, not hours or days, and that
> >>  >continues
> >>  >operating on battery power when infrastructure is down.
> >>  >
> >>  >I am fully aware that in active conflict zones Wi-Fi can be jammed or
> >>  >restricted, for example due to drone countermeasures. However, there
> >>  >are
> >>  >many other scenarios where Wi-Fi mesh remains extremely valuable:
> >>  >evacuation centers, field hospitals, temporary shelters, flood-affected
> >>  >villages, and coordination points for responders. In these
> >>  >environments,
> >>  >fast, robust, and easy-to-deploy networking makes a very real
> >>  >difference
> >>  >for coordination, family contact, and medical or logistical data
> >>  >sharing.
> >>  >
> >>  >Because of this, I am now restarting the project as MeshPoint V2. The
> >>  >focus
> >>  >is updated hardware, improved battery life, and even simpler
> >>  >deployment,
> >>  >while keeping the original goal: crisis response and off-grid or
> >>  >underserved communities.
> >>  >
> >>  >In the original MeshPoint we used Babel. This was largely driven by
> >>  >practical constraints at the time: our deployment tooling was based on
> >>  >Nodewatcher, which was Babel-only. Technically, Babel served us very
> >>  >well.
> >>  >It converged fast, was reliable, and worked nicely for small to
> >>  >medium-sized networks.
> >>  >
> >>  >At the same time, I am well aware that many community networks and
> >>  >real-world mesh deployments successfully used batman-adv, often through
> >>  >Gluon or custom firmware builds. In larger, more dynamic, or highly
> >>  >mobile
> >>  >topologies typical for crisis scenarios, the layer-2 approach and
> >>  >seamless
> >>  >mobility properties of batman-adv are very attractive, especially when
> >>  >nodes are frequently moved, powered on and off, or replaced in the
> >>  >field.
> >>  >
> >>  >For MeshPoint V2 I am evaluating batman-adv and would appreciate
> >>  >insights
> >>  >on the following aspects, specifically in the context of crisis and
> >>  >emergency deployments:
> >>  >
> >>  >Behaviour at larger scale in real deployments
> >>  >In crisis scenarios networks often start small but can grow quickly as
> >>  >more
> >>  >nodes are deployed by different teams or organizations. We are
> >>  >interested
> >>  >in how batman-adv behaves when scaling to hundreds or more nodes in
> >>  >non-ideal, real-world conditions, without centralized planning and with
> >>  >limited ability for on-site tuning.
> >>  >
> >>  >Performance in sparse or highly mobile topologies
> >>  >Nodes in the field are frequently moved, turned off, replaced, or
> >>  >temporarily isolated. Vehicles, backpacks, and mobile command posts
> >>  >constantly change network topology. We are looking for practical
> >>  >experience with how well batman-adv handles frequent topology changes,
> >>  >intermittent links, and sparse node placement without requiring
> >>  >constant
> >>  >manual intervention.
> >>  >
> >>  >Suitability for battery-powered and intermittently connected nodes
> >>  >Many nodes run on battery for long periods and may sleep, reboot, or
> >>  >disappear entirely when power is lost. Low overhead, predictable
> >>  >behaviour, and fast recovery after reconnect are essential. We are
> >>  >particularly interested in any known trade-offs between routing
> >>  >performance, control traffic, and power consumption in such
> >>  >environments.
> >>  >
> >>  >If there is existing work, documented limitations, field experience, or
> >>  >design guidance relevant to these constraints, pointers would be
> >>  >greatly
> >>  >appreciated. The goal is to build a system that field teams can deploy
> >>  >and
> >>  >rely on under stress, without requiring deep networking expertise on
> >>  >site.
> >>  >
> >>  >Thank you for your time, and thank you to everyone who has contributed
> >>  >to
> >>  >making mesh networking viable outside of labs and into real-world,
> >>  >high-stakes situations.
> >>  >
> >>  >Best regards,
> >>  >Valent Turkovic
> >>  >https://www.meshpointone.com/
> >>  >
> >>  >Technical specifications of the original MeshPoint (for reference):
> >>  >https://www.meshpointone.com/technical-specifications/






end of thread, other threads:[~2026-01-05 16:08 UTC | newest]

Thread overview: 5+ messages
2025-12-16 15:37 Restarting MeshPoint – seeking advice on routing for crisis/disaster scenarios Valent Turkovic
2025-12-20 22:43 ` Valent@MeshPoint
2026-01-05  9:07   ` Simon Wunderlich
2026-01-05 12:12     ` Re[2]: " Valent@MeshPoint
2026-01-05 13:13       ` Simon Wunderlich
