_clock_offset() call on each address.
I used the same 25000 timeout setting as hcitool.c, because I found
from experience that setting it any lower makes the requests behave
erratically. (Why is that? Is there a natural hardware timeout that
can't be changed?)
The main problem, however, is that sometimes I get back wrong
names. There is a LOT of Bluetooth traffic in this building
(particularly tons of Mac minis) that frequently responds to the
scanner, and sometimes a Mac mini's name comes back in place of the
real name of the device I am trying to query. This doesn't happen all
the time, but it tends to happen when a lot of name queries are
running simultaneously.
I had seen a similar thread here by Avaited about a patch he had
prepared to fix this problem; he said he had found the cause in
hci.c. As the patch hasn't surfaced yet, and assuming that this was
my problem too, I decided to try the following modification to hci.c.
I made the bdaddr pointer non-const in all functions used by
hci_read_remote_name() and made the small change below, on the
assumption that if the wrong name came into rn.name, the wrong bdaddr
must be stored in rn.bdaddr as well:
// copy the bdaddr the received name actually belongs to back to the caller
bacpy(bdaddr, &rn.bdaddr);
I placed this line right above the strncpy(name, (char *) rn.name, len);
line near the end of the function. After doing this, things seemed to
improve.
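An alternative to overwriting the caller's bdaddr would be to reject
any reply whose address differs from the one that was requested. The
following is only a sketch of that idea, not the hci.c code: addr6_t,
name_reply_t and copy_name_if_match() are hypothetical stand-ins I made
up so that the example compiles without the BlueZ headers, and the
248-byte name field is an assumption about the reply layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 6-byte Bluetooth device address. */
typedef struct { uint8_t b[6]; } addr6_t;

/* Hypothetical stand-in for the "remote name complete" reply (rn). */
typedef struct {
    uint8_t status;      /* 0 means success */
    addr6_t bdaddr;      /* address the name belongs to */
    uint8_t name[248];   /* assumed size; may lack a terminating NUL */
} name_reply_t;

/*
 * Copy the name out of a reply only if the reply is for the address
 * that was asked about. Returns 0 on success, -1 on an error status
 * or an address mismatch. On failure, out is left untouched.
 */
int copy_name_if_match(const addr6_t *expected, const name_reply_t *reply,
                       char *out, size_t len)
{
    size_t n = 0;

    assert(expected != NULL && reply != NULL && out != NULL);

    if (len == 0 || reply->status != 0)
        return -1;

    /* Reject a name that arrived for some other device. */
    if (memcmp(expected->b, reply->bdaddr.b, sizeof(expected->b)) != 0)
        return -1;

    /* Bounded copy that always NUL-terminates the output. */
    while (n < len - 1 && n < sizeof(reply->name) && reply->name[n] != '\0') {
        out[n] = (char) reply->name[n];
        n++;
    }
    out[n] = '\0';
    return 0;
}
```

With a check like this, a mismatched reply shows up as a failed lookup
that can be retried, instead of a wrong name being stored against the
requested address.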
However, I noticed that usually after a few hours of running the
scanner software, hcid would accumulate CPU time:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1838 0.8 0.3 1960 784 ? Ss 04:01 2:55 /usr/sbin/hcid
When this happened, the name errors came back even with my patch, and
eventually the dongles would stop scanning altogether.
We've tried everything from resetting the dongles right after a name
request times out to having the sniffer boxes reboot after a few
hours. Rebooting is something we really don't want to do, as it
prevents us from gathering data at certain times of the day.
As far as I can see, the code shouldn't cause any problems, though I
find it strange that we have to reset the dongles at all. Even with
the reset they eventually freeze up anyway; it's just a matter of
time. The dongles we are using are Belkin F8T013. We are trying to get
some CSR dongles instead, but for now the Belkins are what we have
available.
Is there a problem with using multiple dongles the way our software
does? We would like to keep the multiple-dongle approach so that the
names come in faster, but if the names keep coming in incorrectly, we
can't rely on the name data. Is there a bug at the kernel level where
name requests could get mixed up if too many of them are happening at
the same time with multiple dongles?
Is it a Broadcom chipset-specific problem?
As an update: today there were a lot of new people in the building,
which resulted in lots of new data. Unfortunately the name problem
happened again, and it was even worse. Almost every name that came in
was attached to a different btaddr, so our database ended up storing
incorrect names against other btaddrs. Running ps aux at the terminal,
I found that this time hcid wasn't even high in CPU time, which tells
me that Avaited's suspicion may still be right: are multiple
hci_read_remote_name requests in general the source of the names
getting mixed up?
I apologize that this message has gotten so long, but I've been
working on this problem for weeks and am hoping that someone can clue
me in as to whether I'm doing something wrong, and/or whether there is
indeed a BlueZ bug in this area.
Thanks in advance,