Emergency Communications

Playing with speech synthesis

This afternoon, I was pondering how I might do text-to-speech and still have the result sound somewhat natural. For what use case? Well, two that come to mind…

The first being for doing “strapper call” announcements at horse endurance rides. A horse endurance ride is where competitors and their horses traverse a long (sometimes as long as 320km) trail through a wilderness area. Usually these rides (particularly the long ones) are broken up into separate stages or “legs”.

Upon arrival back at base, the competitor has a limited amount of time to get the horse’s vital signs into acceptable ranges before they must present to the vet. If the horse’s temperature or heart rate is too high, they are “vetted out”.

When the competitor reaches the final check-point, ideally you want to let that competitor’s support team know they’re on their way back to base so they can be there to meet the competitor and begin their work with the horse.

Historically, this was done over a PA system; however, this isn’t always possible for the people at base to achieve. So having an automated mechanism to do this would be great. In recent times, Brisbane WICEN has been developing a public display that people can see real-time results on, and this also doubles as a strapper-call display.

Getting the information to that display is something of a work-in-progress, but it’s recognised that if you miss the message popping up on the display, there’s no repeat. A better solution would be to “read out” the message. Then you don’t have to be watching the screen; you can go about your business. This could be done over a PA system, or, since one location has an extensive WiFi network, streamed via Icecast.

But how do you get the text into speech?

Enter flite

flite is a minimalist speech synthesizer from the Festival project. Out of the box it includes 3 voices, mostly male American voices. (I think the rms one might be Richard M. Stallman, but I could be wrong on that!) There’s a couple of demos there that can be run direct from the command line.

So, for the sake of argument, let’s try something simple. I’ll use the slt voice (a US female voice) and just get the program to read out what might otherwise be read out during a horse ride event:

$ flite_cmu_us_slt -t 'strapper call for the 160 kilometer event competitor numbers 123 and 234' slt-strapper-nopunctuation-digits.wav
slt-strapper-nopunctuation-digits.ogg

Not bad, but not that great either. Specifically, the speech is probably a little quick. The question is, how do you control this? Turns out there’s a bit of hidden functionality.

There is an option marked -ssml which tells flite to interpret the text as SSML. However, if you try it, you may find it does little to improve matters; I don’t think flite actually implements much of it.

Things are improved if we spell everything out. So if you instead replace the digits with words, you do get a better result:

$ flite_cmu_us_slt -t 'strapper call for the one hundred and sixty kilometer event competitor number one two three and two three four' slt-strapper-nopunctuation-words.wav
slt-strapper-nopunctuation-words.ogg

Definitely better. It could use some pauses. Now, we don’t have very fine-grained control over those pauses, but we can introduce some punctuation to have some control nonetheless.

$ flite_cmu_us_slt -t 'strapper call.  for the one hundred and sixty kilometer event.  competitor number one two three and two three four' slt-strapper-punctuation.wav
slt-strapper-punctuation.ogg

Much better. Of course it still sounds somewhat robotic though. I’m not sure how to adjust the cadence on the whole, but presumably we can just feed the text in piece-wise, render those to individual .wav files, then stitch them together with the pauses we want.
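Generating that spelled-out, punctuated text is easy to automate. Below is a minimal sketch in Python; the function names and the EVENT_NAMES table are my own inventions for illustration, not part of flite or of the display software.

```python
# Look-up table of spoken event names (hypothetical; extend as needed).
EVENT_NAMES = {160: "one hundred and sixty kilometer"}
DIGIT_WORDS = ("zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine")


def spell_digits(number):
    """Spell a competitor number digit by digit: 123 -> 'one two three'."""
    return " ".join(DIGIT_WORDS[int(digit)] for digit in str(number))


def strapper_call(event_km, competitors):
    """Build the announcement text, using full stops to force pauses."""
    if len(competitors) == 1:
        noun = "competitor number"
    else:
        noun = "competitor numbers"
    numbers = " and ".join(spell_digits(n) for n in competitors)
    return "strapper call.  for the {} event.  {} {}".format(
        EVENT_NAMES[event_km], noun, numbers)
```

Calling strapper_call(160, [123, 234]) yields the same sort of text as was fed to flite by hand above.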

How about other changes though? If you look at flite --help, there are feature options which can control the synthesis. There’s no real documentation on what these do; what I’ve found so far came from grep-ing through the flite source code. Tip: do a grep for feat_set_, and you’ll see a whole heap.

Controlling pitch

There are two parameters for the pitch: int_f0_target_mean controls the “centre” frequency of the speech in Hertz, and int_f0_target_stddev controls the deviation. For the slt voice, …mean seems to sit around 160Hz and the deviation is about 20Hz.

So we can say, set the frequency to 90Hz and get a lower tone:

$ flite_cmu_us_slt --setf int_f0_target_mean=90 -t 'strapper call' slt-strapper-mean-90.wav
slt-strapper-mean-90.ogg

… or 200Hz for a higher one:

$ flite_cmu_us_slt --setf int_f0_target_mean=200 -t 'strapper call' slt-strapper-mean-200.wav
slt-strapper-mean-200.ogg

… or we can change the variance:

$ flite_cmu_us_slt --setf int_f0_target_stddev=0.0 -t 'strapper call' slt-strapper-stddev-0.wav
$ flite_cmu_us_slt --setf int_f0_target_stddev=70.0 -t 'strapper call' slt-strapper-stddev-70.wav
slt-strapper-stddev-0.ogg
slt-strapper-stddev-70.ogg

We can’t change these values during a block of speech, but presumably we can cut up the text we want to render, render each piece at the frequency/variance we want, then stitch those together.
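As a rough sketch of that idea, a small Python helper could build one flite command line per piece of text. The helper names here are hypothetical; the only things taken from flite itself are the flite_cmu_us_slt binary and the --setf and -t options used above.

```python
import subprocess


def flite_command(text, wav_path, voice_binary="flite_cmu_us_slt", **features):
    """Return the argument list for rendering one piece of text.

    Keyword arguments become --setf NAME=VALUE options, for example
    int_f0_target_mean=90.
    """
    command = [voice_binary]
    for name, value in sorted(features.items()):
        command += ["--setf", "{}={}".format(name, value)]
    command += ["-t", text, wav_path]
    return command


def render(text, wav_path, **features):
    """Run flite (assumed to be on the PATH) to produce wav_path."""
    subprocess.check_call(flite_command(text, wav_path, **features))
```

Each piece then gets its own render() call with whatever pitch settings suit it, leaving one .wav file per piece to be joined afterwards.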

Controlling rate

So I mentioned we can control the rate, somewhat coarsely using usual punctuation devices. We can also change the rate overall by setting duration_stretch. This basically is a control of how “long” we want to stretch out the pronunciation of words.

$ flite_cmu_us_slt --setf duration_stretch=0.5 -t 'strapper call' slt-strapper-stretch-05.wav
$ flite_cmu_us_slt --setf duration_stretch=0.7 -t 'strapper call' slt-strapper-stretch-07.wav
$ flite_cmu_us_slt --setf duration_stretch=1.0 -t 'strapper call' slt-strapper-stretch-10.wav
$ flite_cmu_us_slt --setf duration_stretch=1.3 -t 'strapper call' slt-strapper-stretch-13.wav
$ flite_cmu_us_slt --setf duration_stretch=2.0 -t 'strapper call' slt-strapper-stretch-20.wav
slt-strapper-stretch-05.ogg
slt-strapper-stretch-07.ogg
slt-strapper-stretch-10.ogg
slt-strapper-stretch-13.ogg
slt-strapper-stretch-20.ogg

Putting it together

So it looks as if all the pieces are there, we just need to stitch them together.

$ flite_cmu_us_slt --setf duration_stretch=1.2 --setf int_f0_target_stddev=50.0 --setf int_f0_target_mean=180.0 -t 'strapper call' slt-strapper-call.wav
$ flite_cmu_us_slt --setf duration_stretch=1.1 --setf int_f0_target_stddev=30.0 --setf int_f0_target_mean=180.0 -t 'for the, one hundred, and sixty kilometer event' slt-160km-event.wav
$ flite_cmu_us_slt --setf duration_stretch=1.4 --setf int_f0_target_stddev=40.0 --setf int_f0_target_mean=180.0 -t 'competitors, one two three, and, two three four' slt-competitors.wav
Above files stitched together in Audacity

Here, I manually imported all three files into Audacity, arranged them, then exported the result, but there’s no reason why the same could not be achieved by a program; I’m just inserting pauses after all.

There are tools for manipulating RIFF waveform files in most languages, and generating silence is not rocket science. The voice itself could be fine-tuned, but that’s simply a matter of tweaking settings. Generating the text is basically a look-up table feeding into snprintf (or its equivalent in your programming language of choice).
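To illustrate, here is a minimal sketch using Python’s standard wave module to join rendered files with a fixed pause between them. The function name and the half-second default are my own choices, and it assumes every input shares the same sample format (zero bytes are only true silence for signed 16-bit PCM, which is what I’d expect from flite).

```python
import wave


def stitch(wav_paths, out_path, pause_seconds=0.5):
    """Concatenate WAV files, inserting silence between each pair."""
    fmt = None
    chunks = []
    for path in wav_paths:
        with wave.open(path, "rb") as src:
            this_fmt = (src.getnchannels(), src.getsampwidth(),
                        src.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                raise ValueError("sample format mismatch in " + path)
            chunks.append(src.readframes(src.getnframes()))
    if fmt is None:
        raise ValueError("no input files given")

    channels, width, rate = fmt
    # One frame is (channels * width) bytes; zero bytes are silence
    # for signed PCM samples.
    silence = b"\x00" * (int(pause_seconds * rate) * channels * width)

    with wave.open(out_path, "wb") as dst:
        dst.setnchannels(channels)
        dst.setsampwidth(width)
        dst.setframerate(rate)
        dst.writeframes(silence.join(chunks))
```

Different pause lengths between particular pieces would just mean building the silence per gap instead of using one fixed value.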

It’d be nice to implement a wrapper around flite that took the full SSML or JSML text and rendered it out as speech, but this gets pretty close without writing much code at all. Definitely worth continuing with.

DC Power Distribution

So, lately I’ve been helping out with running the base at a few horse rides up at Imbil. This involves amongst other things, running three radios, a base computer, laptops, and other paraphernalia.

The whole kit needs to run off an unregulated 12V DC supply, consisting of two 105Ah AGM batteries which have solar and mains back-up. The outlet for this is an Anderson SB50 connector, fairly standard for caravans.

The catch is that this is temporary: no permanent linkages, since we need to be able to disconnect and pack everything away when not in use. One bugbear is having enough DC outlets for everything, especially of the 30A Anderson Power Pole variety, since most of our radios use those.

The monitor for the base computer uses a cigarette lighter adapter, while the base computer itself (an Intel NUC) has a cable terminated with a 30A power pole. There’s also a WiFi router which has a micro-USB power input — thankfully the monitor’s adaptor embeds a USB power outlet, so we can run it off that.

We need two amateur radios (one for voice comms, one for packet), and a CB set for communications with the ride organisers (who are otherwise not licensed to use amateur bands). We may also see a move to commercial frequencies, so that’s potentially another radio or two.

I started thinking about ways we could make a modular power distribution system.

The thought was, if we made PDU boxes where the inlet and outlet were nice big SB50s, configured so that they would mate when the boxes were joined up, we could have a flexible PDU system where we just clip it together like Lego bricks.

This is a work in progress, but I figured I’d post what I have so far.

Power outlets on the distribution box, yet to be wired up.

I still need to do the internal wiring, but above is basically what I was thinking of. There’s room for up to 6 consumers via the 30A power pole connections along one side, each with its own 20A breaker. (The connectors are rated at 45A.)

Originally I was aiming for 6 cigarette lighter sockets, but after receiving the parts, I realised that wouldn’t fit. Two seems to work okay, and we can always make a second box and slap that on the end. Each has a 15A breaker.

Protecting the upstream power source is a 50A breaker. So the total drawn by the down-stream port plus all outlets on the box itself may not exceed 50A.

The upstream and downstream ports are positioned so that boxes can just be butted up against each-other for the connectors to mate. I’ve got to fine-tune the positioning a bit, and right now the connectors are also on an angle, but this hopefully shows the concept…

The idea for maintenance is the box will fold out. I’m not sure yet whether the connection between all the outputs on the lid will be via a bus bar or using individual cables going to the tie point inside the box. Those 30A outlets are just begging for a single cable to visit each one, bus-bar style. I also have to figure out how I’ll connect to the cigarette lighter sockets.

Hopefully I’ll get this done before the next ride event.

6LoWHAM: Working towards connected-mode operation in aioax25

The past few months have been quiet for this project, largely because Brisbane WICEN has had my spare time soaked up with an RFID system they are developing for tracking horse rides through the Imbil State Forest for the Stirling’s Crossing Endurance Club.

Ultimately, when we have some decent successes, I’ll probably be reporting more on this on WICEN’s website. Suffice to say, it’s very much a work-in-progress, but it has proved a valuable testing ground for aioax25. The messaging system being used is basically just plain APRS messaging, with digipeating thrown in as well.

Since I’ve had a moment to breathe, I’ve started filling out the features in aioax25, starting with connected-mode operation. The thinking is this might be useful for sending larger payloads. APRS messages are limited to a 67 character message size with only a subset of ASCII being permitted.

Thankfully that subset includes all of the Base64 character set, so I’m able to do things like tunnel NTP packets and CBOR blobs through it, so that stations out in the field can pull down configuration settings and the current time.

As for the RFID EPCs, we’re sending those in the canonical hexadecimal format, which works, but the EPC occupies most of the payload size. At 1200 bits per second, this does slow things down quite a bit. We get a slight improvement if we encode the EPCs as Base64. We’d get a 100% efficiency increase over hexadecimal (half the bytes on air) if we could send it as binary bytes instead. Sending a CBOR blob that way would be very efficient.
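To put rough numbers on that, here is a small Python sketch comparing the three encodings. It assumes a 96-bit (12-byte) EPC purely for illustration; the function name is hypothetical.

```python
import base64
import binascii


def encoded_sizes(epc):
    """Return the on-air size in bytes of an EPC under each encoding."""
    return {
        "hex": len(binascii.hexlify(epc)),      # 2 characters per byte
        "base64": len(base64.b64encode(epc)),   # 4 characters per 3 bytes
        "binary": len(epc),                     # raw bytes, connected mode
    }


# Assumed example: a 12-byte EPC with made-up contents.
sizes = encoded_sizes(bytes(range(12)))
```

For a 12-byte EPC that works out to 24 characters as hexadecimal, 16 as Base64 and 12 as raw binary.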

The thinking is that the nodes find each-other via APRS, then once they’ve discovered a path, they can switch to connected mode to send bulk transfers back to base.

Thus, I’ve been digging into connected mode operation. AX.25 2.2 is not the most well-written spec I’ve read. In fact, it is down-right confusing in places. It mixes up little-endian and big-endian fields, certain bits have different meanings in different contexts, and it uses concepts which are “foreign” to someone like myself who’s used to TCP/IP.

Right now I’m making progress: there’s an untested implementation in the connected-mode branch. I’m writing unit test cases based on what I understand the behaviour to be, but somehow I think this is going to need trials with some actual AX.25 implementations such as Direwolf, the Linux kernel stack, G8BPQ stack and the implementation on my Kantronics KPC3 and my Kenwood TH-D72A.

Some things I’m trying to get an answer to:

  • In the address fields at the start of a frame, you have what I’ve been calling the ch bit.
    On digipeater addresses, it’s called H and it is used to indicate that a frame has been digipeated by that digipeater.
    When seen in the source or destination addresses, it is called C, and it describes whether the frame is a “command” frame, or a “response” frame.

    An AX.25 2.x “command” frame sets the destination address’s C bit to 1, and the source address’s C bit to 0, whilst a “response” frame in AX.25 does the opposite (destination C is 0, source C is 1).

    In prior AX.25 versions, they were set identically. Question is, which is which? Is a frame a “command” when both bits are set to 1s and a “response” if both C bits are 0s? (Thankfully, I think my chances of meeting an AX.25 1.x station are very small!)
  • In the Control field, there’s a bit marked P/F (for Poll/Final), and I’ve called it pf in my code. Sometimes this field gets called “Poll”, sometimes it gets called “Final”. It’s not clear on what occasions it gets called “Poll” and when it is called “Final”. It isn’t as simple as assuming that pf=1 means poll and pf=0 means final. Which is which? Who knows?
  • AX.25 2.0 allowed up to 8 digipeaters, but AX.25 2.2 limits it to 2. AX.25 2.2 is supposed to be backward compatible, so what happens when it receives a frame from a digipeater that is more than 2 digipeater hops away? (I’m basically pretending the limitation doesn’t exist right now, so aioax25 will handle 8 digipeaters in AX.25 2.2 mode)
  • The table of PID values (figure 3.2 in the AX.25 2.2 spec) mentions several protocols, including “Link Quality Protocol”. What is that, and where is the spec for it?
  • Is there an “experimental” PID that can be used that says “this is a L3 protocol that is a work in progress” so I don’t blow up someone’s station with traffic they can’t understand? The spec says contact the ARRL, which I have done, we’ll see where that gets me.
  • What do APRS stations do with a PID they don’t recognise? (Hopefully ignore it!)
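The command/response convention from the first point above can be sketched in a few lines of Python. This is not aioax25’s actual code, just my reading of the rule; the function names are hypothetical, and the “legacy” label for matching bits deliberately leaves the open question unanswered.

```python
def c_bit(ssid_octet):
    """Extract the C (or H) bit: the top bit of an address's SSID octet."""
    return bool(ssid_octet & 0x80)


def frame_type(dest_c, source_c):
    """Classify a frame from the C bits of its destination and source."""
    if dest_c and not source_c:
        return "command"    # AX.25 2.x: destination C=1, source C=0
    if source_c and not dest_c:
        return "response"   # AX.25 2.x: destination C=0, source C=1
    # Both bits equal: a pre-2.0 station. Which value means what is
    # exactly the question asked above, so no guess is made here.
    return "legacy"
```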

Right at this point, the Direwolf sources have proven quite handy. Already I am now aware of a potential gotcha with the AX.25 2.0 implementation on the Kantronics KPC3+ and the Kenwood TM-D710.

I suspect my hand-held (Kenwood TH-D72A) might do the same thing as the TM-D710, but given JVC-Kenwood have pulled out of the Australian market, I’m more likely to just say F### you Kenwood and ignore the problem since these can do KISS mode, bypassing the buggy AX.25 implementation on a potentially resource-constrained device.

NET/ROM is going to be a whole different ball-game, and yes, that’s on the road map. Long-term, I’d like 6LoWHAM stations to be able to co-exist peacefully with other stations. Much like you can connect to a NET/ROM node using traditional AX.25, then issue connect commands to jump from there to any AX.25 or NET/ROM station; I intend to offer the same “feature” on a 6LoWHAM station — you’ll be able to spin up a service that accepts AX.25 and NET/ROM connections, and allows you to hit any AX.25, NET/ROM or 6LoWHAM station.

I might park the project for a bit, and get back onto the WICEN stuff, as what we have in aioax25 is doing okay, and there’s lots of work to be done on the base software that’ll keep me busy right up to when the horse rides re-start in 2020.

Solar Cluster: Second compute node operational again

So, a few months back I had the failure of one of my storage nodes. Since I need 3 storage nodes to operate, but can get away with a single compute node, I did a board-shuffle. I just evacuated lithium of all its virtual machines, slapped the SSD, HDD and cover from hydrogen in/on it, and it became the new storage node.

Actually I took the opportunity to upgrade to 2TB HDDs at the same time, as well as adding two new storage nodes (Intel NUCs). I then ordered a new motherboard to get lithium back up again. Again, there was an opportunity to upgrade, so ~$1500 later I ordered a SuperMicro A2SDi-16C-HLN4F. 16 cores, and full-size DDR4 DIMMs, so much easier to get bits for. It also takes M.2 SATA.

The new board arrived a few weeks ago, but I was heavily snowed under with activities surrounding Brisbane Area WICEN Group and their efforts to assist the Stirling’s Crossing Endurance Club running the Tom Quilty 2019. So it got shoved to the side with the RAM I had purchased to be dealt with another day.

I found time on Monday to assemble the hardware, then had fun and games with the UEFI firmware on this board. Put simply, the legacy BIOS support on this board is totally and utterly broken. The UEFI shell is also riddled with bugs (e.g. ifconfig help describes how to bring up an interface via DHCP or statically, but doing so fails). And of course, PXE is not PXE when UEFI is involved.

I ended up using Ubuntu’s GRUB binary and netboot image to boot-strap the machine, after which I could copy my Gentoo install back in. I now have the machine back in the rack, and whilst I haven’t deployed any VMs to it yet, I will do so soon. I did however, give it a burn-in test updating the kernel:

  LD [M]  security/keys/encrypted-keys/encrypted-keys.ko
  MKPIGGY arch/x86/boot/compressed/piggy.S
  AS      arch/x86/boot/compressed/piggy.o
  LD      arch/x86/boot/compressed/vmlinux
ld: arch/x86/boot/compressed/head_64.o: warning: relocation in read-only section `.head.text'
ld: warning: creating a DT_TEXTREL in object.
  ZOFFSET arch/x86/boot/zoffset.h
  OBJCOPY arch/x86/boot/vmlinux.bin
  AS      arch/x86/boot/header.o
  LD      arch/x86/boot/setup.elf
  OBJCOPY arch/x86/boot/setup.bin
  BUILD   arch/x86/boot/bzImage
Setup is 16444 bytes (padded to 16896 bytes).
System is 6273 kB
CRC ca5d7cb3
Kernel: arch/x86/boot/bzImage is ready  (#1)

real    7m7.727s
user    62m6.396s
sys     5m8.970s
lithium /usr/src/linux-stable # git describe
v5.1.11

7m for make -j 17 to build a current Linux kernel is not bad at all!

6LoWHAM: AX.25 Python library and addressing thoughts

Lately, I had a need for a library that would talk to a KISS TNC and allow me to exchange UI frames over an AX.25 network.

This is part of a project being undertaken by Brisbane Area WICEN Group. We’ve been tasked with the job of reporting scans from RFID tag readers back to base… and naturally we’ll be using the AX.25 network we’re already familiar with. The plan is to use APRS messaging (to keep things simple) to submit the location, time and hardware address of each RFID read.

For this, I needed something I also need for this project: a tool to encode and decode the UI frames. I had initially thought of just using LinBPQ or similar to provide the interface to AX.25, but in the end, it was easier for me to write my own simple AX.25 stack from scratch.

aioax25 obviously is nowhere near a replacement for other AX.25 stacks in that it only encodes and decodes frames, but it’s a first step in that journey. This library is written for Python 3.4 and up using the asyncio module and pyserial. At the moment I have used it to somewhat crudely send and receive APRS messages, and so with a bit of work, it’ll suffice for the WICEN project.

That does mean I’m not shackled in terms of what bits I can set in my AX.25 headers. One limitation I have with my mapping of 6LoWHAM addresses to AX.25 addresses is that I cannot represent all characters or the “group” bit.

This led to the limitation that if I defined a group called VK4BWI-0, that group may not have a participant with the call-sign of VK4BWI-0 because I would not be able to differentiate group messages from direct messages.

By writing my own AX.25 stack, I potentially can side-step that limitation: I can utilise the reserved bits in a call-sign/SSID to represent this information. I avoided their use before because the interfaces I planned on using did not expose them, but doing it myself means they’re directly accessible. The AX.25 protocol documentation states:

The bits marked “r” are reserved bits. They may be used in an agreed-upon manner in individual networks. When not implemented, they should be set to one.

https://www.tapr.org/pub_ax25.html

Now, the question is, if I set one to 0, would it reach the far end as a 0? If so, this could be a stand-in for the group bit — stored inverted so that a 1 represents a unicast destination and 0 represents a group.
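For reference, here is a minimal Python sketch of how one AX.25 address field is laid out, with the two reserved bits exposed. This is not aioax25’s API; the function and parameter names are hypothetical, and which of the two reserved bits would carry the group flag is an arbitrary choice on my part.

```python
def encode_address(callsign, ssid, ch=False, reserved=0b11, last=False):
    """Encode one 7-octet AX.25 address field.

    The call-sign is space-padded to 6 characters and each character is
    shifted left one bit.  The final octet is laid out as C/H, R, R,
    4-bit SSID, extension bit.  Reserved bits default to 1, as the spec
    asks for when they are not implemented.
    """
    call = callsign.upper().ljust(6)
    if len(call) > 6:
        raise ValueError("call-sign longer than 6 characters")
    if not 0 <= ssid <= 15:
        raise ValueError("SSID must be in the range 0-15")
    octets = bytearray(ord(char) << 1 for char in call)
    octets.append((0x80 if ch else 0) | ((reserved & 0b11) << 5)
                  | (ssid << 1) | (1 if last else 0))
    return bytes(octets)


# Unicast: both reserved bits left at 1.
unicast = encode_address("VK4MSL", 9)
# Hypothetical group flag: clear the lower reserved bit.
group = encode_address("VK4BWI", 0, reserved=0b10)
```

Whether a cleared bit survives a trip through digipeaters and TNCs is still the open question.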

The other option is to just prepend the left-over bits to the start of the message payload. This has the bonus that I can encode the full-callsign even if that call-sign does not fit in a standard AX.25 message.

So a message sent to VK4FACE-6 (let’s pretend F-calls can use packet for the sake of an example) would be sent to AX.25 SSID VK4FAC-6, and the first few bytes would encode the missing E and the group/unicast bit. If the station VK4FAC were also on frequency, the software stack at their end would need to filter based on those initial payload bytes.

We support 8-character call-signs, so we need to represent 2 left-over characters plus a group bit. Adding space for two more characters for the source call-sign (which may not be a group), we require about 3 bytes.

At this point we might as well use 4, store the extra bytes as 7-bit ASCII, with the spare MSBs of each byte encoding the group bit and one spare bit. An extra 8 bits is bugger all really even at 1200 baud.

Obviously, NET/ROM has no knowledge of this. Stations that are on the other side of a non-6LoWHAM digipeater need to explicitly source-route their hops to reach the rest of a mesh network, and the nodes the other side need to “remember” this source route.

This latter scheme also won’t work for connected mode, as there’s no scope to shoehorn those bytes in the information field and still remain AX.25 compatible — it will only work for 6LoWHAM UI frames.

Anyway, it’s food for thought.

6LoWHAM: Dissecting packets in Python

Recently, I’ve been looking at the problem of how to retrieve IPv6 traffic from the network stack of my workstation and manipulate it for transmission over AX.25.

My last experiments focussed on the TUN/TAP interface in Linux. Using this interface, I could create a virtual network interface that piped its traffic to a file descriptor in a program written in C.

One advantage of using the C language for this is that, as binding to the TAP interface requires root privileges, the binary could be installed setuid root. Thus, any time it started, it would be running as root. From there, it could do what it needed, then drop privileges back to a regular user.

The program would just run as a child process… when there was traffic received from the kernel, it would just spit that out to stdout. If my parent application had something to send, it would feed that into stdin.

6lhagent is an implementation of that idea. It’s pretty rough, but it seems to work. It uses a simple protocol to frame the Ethernet packets so that it can maintain synchronisation with the parent process. All frames are ACKed or NAKed, depending on whether they were understood or not. The protocol is analogous to KISS or SLIP in concept. The framing is very different to these protocols, but the concept is that of frames delimited by a byte sequence, with occurrences of the special byte sequences replaced with place-holders to prevent the parser getting confused.
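For anyone unfamiliar with the concept, here is what that style of framing looks like using SLIP’s byte values (from RFC 1055). To be clear, this is SLIP shown as an analogy in Python, not the framing 6lhagent actually uses.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(payload):
    """Delimit payload with END bytes, escaping any special bytes."""
    out = bytearray([END])
    for octet in payload:
        if octet == END:
            out += bytes([ESC, ESC_END])
        elif octet == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(octet)
    out.append(END)
    return bytes(out)


def slip_decode(frame):
    """Reverse slip_encode for a single frame."""
    out = bytearray()
    octets = iter(frame.strip(bytes([END])))
    for octet in octets:
        if octet != ESC:
            out.append(octet)
            continue
        escaped = next(octets, None)
        if escaped == ESC_END:
            out.append(END)
        elif escaped == ESC_ESC:
            out.append(ESC)
        else:
            raise ValueError("invalid escape sequence")
    return bytes(out)
```

Because the delimiter can never appear inside an encoded frame, a parser that loses its place only has to wait for the next delimiter to get back in sync.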

I then wrote this Python script which uses the asyncio IO loop to run 6lhagent and dump the packets it receives:

$ python3 demo/dumper.py 
Interface data: b'V\xc7\x05\\yA\x05\x00\x00\x00\x00\xca\x04tap0'
Interface: MAC=[86, 199, 5, 92, 121, 65] MTU=1280 IDX=202 NAME=tap0
Ethernet traffic: b'33330000001656c7055c794186dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00f5ec0000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\xf5\xec\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'33330000001656c7055c794186dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00f5ec0000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\xf5\xec\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'3333ff5c794156c7055c794186dd6000000000203aff00000000000000000000000000000000ff0200000000000000000001ff5c79418700bebb00000000fe8000000000000054c705fffe5c79410e01a02d5c9a6698'
From: 33:33:ff:5c:79:41
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: ::
To:   ff02::1:ff5c:7941
Length: 32, Next header: 58, Hop Limit: 255
ICMP Type 135, Code 0, Checksum bebb
Data: b'\x00\x00\x00\x00\xfe\x80\x00\x00'
Payload: b'\x00\x00\x00\x00T\xc7\x05\xff\xfe\\yA\x0e\x01\xa0-\\\x9af\x98'
Ethernet traffic: b'33330000001656c7055c794186dd6000000000240001fe8000000000000054c705fffe5c7941ff0200000000000000000000000000163a000502000001008f0025070000000104000000ff0200000000000000000001ff5c7941'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: fe80::54c7:5ff:fe5c:7941
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00%\x07\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xff\\yA'
Ethernet traffic: b'33330000001656c7055c794186dd6000000000240001fe8000000000000054c705fffe5c7941ff0200000000000000000000000000163a000502000001008f009cab0000000104000000ff0200000000000000000000000000fb'
From: 33:33:00:00:00:16
To:   56:c7:05:5c:79:41
Protocol: 86dd
IPv6: Priority 0, Flow 000000
From: fe80::54c7:5ff:fe5c:7941
To:   ff02::16
Length: 36, Next header: 0, Hop Limit: 1
Payload: b':\x00\x05\x02\x00\x00\x01\x00\x8f\x00\x9c\xab\x00\x00\x00\x01\x04\x00\x00\x00\xff\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xfb'

The thinking is that the bulk of the proof-of-concept will be done in Python. My reasoning for this is that it’s usually easier to prototype in a higher-level language than in C, and in this application, speed is not important. At best our network interface will be running at 9600 baud — Python will keep up just fine. Most of it will be at 1200 baud.

The Python code will do some packet filtering (e.g. filtering out the multicast NS messages, which are a no-no in RFC-6775) and add options where required. It’ll also be responsible for rate-limiting the firehose-like output of the tap interface from the host so the AX.25 network doesn’t get flooded.

The proof of concept is coming together. Next steps are to implement an IPv6 stack of sorts in Python to dissect the datagrams.
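As a starting point, the fixed parts of an Ethernet II frame carrying IPv6 can be pulled apart with nothing more than struct and ipaddress. This is a minimal sketch rather than the code in dumper.py; the function name and dictionary keys are my own.

```python
import ipaddress
import struct


def format_mac(octets):
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join("{:02x}".format(octet) for octet in octets)


def dissect(frame):
    """Decode the Ethernet II header and, if present, the IPv6 header."""
    # On the wire the order is: destination MAC, source MAC, EtherType.
    dest, source, ethertype = struct.unpack("!6s6sH", frame[:14])
    fields = {
        "dest_mac": format_mac(dest),
        "source_mac": format_mac(source),
        "ethertype": ethertype,
    }
    if ethertype != 0x86DD or len(frame) < 54:
        return fields  # not IPv6, or too short for a full IPv6 header

    word0, length, next_header, hop_limit = struct.unpack(
        "!IHBB", frame[14:22])
    fields.update({
        "version": word0 >> 28,
        "traffic_class": (word0 >> 20) & 0xFF,
        "flow_label": word0 & 0xFFFFF,
        "payload_length": length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(frame[22:38])),
        "dest": str(ipaddress.IPv6Address(frame[38:54])),
        "payload": frame[54:],
    })
    return fields
```

Extension headers (such as the hop-by-hop header seen in the dump, where the next header is 0) would need to be walked separately to reach the ICMPv6 message inside.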

6LoWHAM: IP Addressing

For 6LoWHAM, it could work that we just use the link-local address space to directly communicate between stations and leave it at that.

If I want to send a message to VK4BWI-5 from my station VK4MSL-9, I could just fire off a packet to fe80::6894:49ff:feae:7318 directed to my 6LoWHAM interface and be done with it. This then requires one of two things:

  1. that VK4BWI-5 can directly communicate with me
  2. that the intermediate stations know to forward my message on to that station

(1) is easy enough. (2) raises the question of “what is local”?

Supposing that this protocol took off, and suddenly the WIA decides to earmark special frequencies on a few bands for 6LoWHAM, with a fairly complete network stretching up the eastern seaboard of Australia. If my station sends a router solicitation from my home QTH in Brisbane, does someone in Melbourne really care to hear it? I’d wager this is a recipe for a very clogged packet network!

In Thread, the “link local” scope only gets you as far as the nodes that can directly hear you. It does mean that protocols like mDNS, which rely on the “link-local” multicast scope aren’t going to reach all nodes, but it also means that far flung nodes don’t need to listen to all the low-level chatter. For communications between nodes, an “on-mesh” prefix is used, and for mesh-wide multicast, a “realm-local” prefix of ff03::/64 is defined.

In truth, it’s highly unlikely that we’d have “one” single network. More likely it’ll be a mesh of interconnected networks with trunk links going via some other band (or perhaps VPNs over the Internet). For that to work, we can’t rely on just link-local networking, we actually need a routable network address for the mesh.

The Thread “mesh local” prefix is actually defined by the network’s extended IEEE-802.15.4 PAN ID, which is a 64-bit number that you define when setting up the network. Thread simply takes the most significant 40 bits of this, slaps fd in front and pads it out with zeros to 64-bits. The PAN ID 0x0123456789abcdef forms the subnet fd01:2345:6789::/64. This can be seen in the OpenThread sources.
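That derivation is simple enough to express in a few lines of Python. This is a sketch of the rule as described here, not a copy of the OpenThread code, and the function name is hypothetical.

```python
import ipaddress


def thread_mesh_local_prefix(ext_pan_id):
    """Derive the /64 mesh-local prefix from a 64-bit extended PAN ID."""
    top40 = (ext_pan_id >> 24) & 0xFFFFFFFFFF      # most significant 40 bits
    prefix64 = (0xFD << 56) | (top40 << 16)        # fd, global ID, zero subnet
    return ipaddress.IPv6Network((prefix64 << 64, 64))


prefix = thread_mesh_local_prefix(0x0123456789ABCDEF)
```

For the example PAN ID, str(prefix) gives fd01:2345:6789::/64.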

This wastes 16-bits of address space normally reserved for the ULA subnet ID and throws away 24-bits of the PAN ID. For our network, we don’t need 16-bits worth of subnets, we just need one. We also don’t have a PAN ID in AX.25.

The thinking is, we’ll use a “group” address. This will be a regular AX.25 SSID, which will translate to a MAC which has the group bit set. (Exactly how I’ll differentiate between a station SSID and a group SSID I’m not sure. Probably will look at the destination IP, if it’s multicast then the group bit gets set.)

Supposing we were to use this for the International Rally of Queensland (an event which is now defunct), we might create a 6LoWHAM network with a group address of “IROQ19”. The MAC address used for group-wide communications would be 03:01:cd:e5:a9:f8.

We can derive a prefix from this MAC address. A ULA normally consists of a 7-bit ULA prefix, a 1-bit “global/local” bit, a 40-bit global ID, and a 16-bit subnet ID.

The ULA prefix is fc00::/7. The global/local bit is always set to 1 (local) because no one has come up with a way that ULAs can be globally administered. 40 bits is a bit tight; we could truncate our MAC to 40 bits and ignore the subnet ID like Thread does, which gives us a subnet of fd03:1cd:e5a9::/64.

The last 3 bits of the SSID though, are like a subnet ID. So if we move those 3 bits to set the last 3 bits of the prefix, we can make some use of that subnet ID, but still waste 13 bits with zeros.

Alternatively, we can consider the global ID and subnet ID to be one 56-bit field. We effectively shrink the subnet ID to 3 bits. That gives us a 53-bit global ID, which now fits the remaining 45-bits of our MAC and leaves us with 8 bits left over.

We can discard the lowest two bits in the first byte of the MAC as those (the group and local bits) will be the same for all groups, so that gives us another two bits. 10 bits isn’t a lot, but it’s enough to encode “AR” (amateur radio) in ITA-2, thus giving us a recognisable prefix for all 6LoWHAM networks. We wind up with the following:

┌─ULA─┐L┌──"AR"──┐┌───────────── Network Address ──────────────┐
1111110100010010100000000000000111001101111001011010100111111000
└──┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┼───┤
   f   d   1   2 : 8   0   0   1 : c   d   e   5 : a   9   f   8 /64
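
The bit-packing in that diagram can be sketched in C as below. This is only an illustration of the layout as drawn: the function name lowham_prefix_from_mac is invented, and the 10-bit tag value 0x04a is simply the bit pattern copied from under the "AR" label in the diagram, not something computed from an ITA-2 table.

```c
#include <stdint.h>

/* 10-bit tag copied verbatim from the bits under "AR" in the diagram
 * (0001001010). Treated here as an opaque constant. */
#define LOWHAM_TAG_BITS 0x04aULL

/*
 * Build the 64-bit prefix from a 48-bit group MAC (hypothetical helper).
 * Layout, most significant bit first:
 *   8 bits  0xfd (ULA prefix plus the "local" bit)
 *   10 bits tag
 *   46 bits MAC with the group and local bits of the first byte dropped
 */
uint64_t lowham_prefix_from_mac(const uint8_t mac[6])
{
    /* Drop the two low bits of the first byte, leaving 6 bits */
    uint64_t addr = (uint64_t)(mac[0] >> 2);

    /* Append the remaining five bytes unchanged: 6 + 40 = 46 bits */
    for (int i = 1; i < 6; i++)
        addr = (addr << 8) | mac[i];

    return ((uint64_t)0xfd << 56) | (LOWHAM_TAG_BITS << 46) | addr;
}
```

With the IROQ19 group MAC 03:01:cd:e5:a9:f8, this returns 0xfd128001cde5a9f8, matching fd12:8001:cde5:a9f8/64 above.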

This actually has me thinking whether the call-sign part of the SSID should be right-padded out to make the network address consistent. Maybe my SSID to MAC algorithm could do with a tweak there as it may make routing easier as it’ll put all those zeros to the right.

In Thread, the mesh-local prefix isn’t route-able beyond the mesh, there’s a separate prefix handed out by border routers for that. In our case, I don’t think there’s any point in complicating matters by having more than one route-able prefix for a mesh. If a station participates in two networks that share a frequency, then sure, that node may have an address on each network, but each network should share a common identity.

Thus in the contrived example of having a large network along the coastline: it’d be an “inter-network” of smaller meshes, linked together via router nodes which know how to hop between them. Those routes may be via point-to-point microwave links, HF, Internet tunnels, etc.

The subnets used for these other networks may be assigned a “context identifier” which is 4-bits. I’ll have to figure out if there’s a sane way to do that on a given network. Most 802.15.4 networks have a “PAN co-ordinator” which could be looking after that. Thread networks elect a “leader” node.

Given the small number of identifiers, and the low probability of this being used, this should be manually administered. Even without a context ID being assigned, one can still route between the subnets, just that the full IPv6 address needs to be given for the foreign node, so you incur a 16-byte penalty doing so. Thus the context IDs will probably be handed out for “popular routes”, with the mesh prefix being “context 0”.

I haven’t yet given thought to how this “context” would be disseminated over the mesh or kept updated. That is a can of worms for another day.

6LoWHAM: Exploring the TUN/TAP interface

One of the aims of 6LoWHAM was to provide a means to send IPv6 traffic between user applications and the AX.25 network.

In order to do this, the applications have to have some way of injecting their IP traffic. The canonical way this is done is through the operating system’s TCP/IP stack. This requires that we have an interface to the operating system kernel in order to receive that IP traffic destined for the airwaves.

Now, we could write a kernel driver for this, but it’s going the long way around to do it. Especially as we intend to interface to software that runs in userspace for the actual transmission. Our driver at best would be just taking the raw Ethernet frame, extracting the IP part, and forwarding that back to our program running in userspace.

There’s a driver that does that for us: TUN/TAP. This driver can either create a TUNnel device, which forwards IP datagrams, or a TAP device, which forwards Ethernet frames. We’ll focus on the TUN mode of this driver here.

The idea is this will create an IP tunnel, with one side exposing a network device to the kernel, and the other side being a file descriptor in a userspace application that just reads and writes raw IP datagrams. How it generates and processes those datagrams is entirely up to the software author. The most famous use for this device is VPNs: taking the IP datagram, encrypting it, then encapsulating it in another IP datagram (usually UDP) to be sent over the Internet to some other peer, which reverses the process and writes the original packet to its tunnel file descriptor.

In our case, we’ll be dissecting it a bit to extract the key fields, then applying our own “compression” defined in the 6LoWHAM specs, then forwarding it on to our AX.25 stack (probably LinBPQ or Direwolf) to be sent as an AX.25 UI frame.

The first step in this journey was actually figuring out what the packets look like on a tunnel device. I created this little program to explore the idea.

It just needs the usual C toolchain and libraries on a Linux system. I tested with Gentoo and Linux kernel 4.15. Building it is a simple make command. If you then run the resulting binary as root, you’ll find a tun0 device (or maybe some other number) created.

Bring the interface up, and you should start to see some traffic as the host tries to talk to its new (and very much mute) peer:

RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ make 
cc    -c -o linuxtun.o linuxtun.c
cc    -c -o main.o main.c
cc -o packetdumper linuxtun.o main.o
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ./packetdumper 
Password: 
^Z
[1]+  Stopped(SIGTSTP)        sudo ./packetdumper
RC=148 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ip link set dev tun0 up
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ fg
sudo ./packetdumper
Flags: 0x0000  Protocol: 0x86dd
  48:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 60 00 00 00 00 08 3a ff fe 80 00 00 00 00 00 00
  16: 5e be 89 41 7b 19 d5 60 ff 02 00 00 00 00 00 00
  32: 00 00 00 00 00 00 00 02 85 00 44 bd 00 00 00 00
Flags: 0x0000  Protocol: 0x86dd
  48:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 60 00 00 00 00 08 3a ff fe 80 00 00 00 00 00 00
  16: 5e be 89 41 7b 19 d5 60 ff 02 00 00 00 00 00 00
  32: 00 00 00 00 00 00 00 02 85 00 44 bd 00 00 00 00

I didn’t bother to decode the IP datagram further, but if you look at the Wikipedia IPv6 Packet article, it isn’t difficult to see what’s going on. In this case, we can see it’s an IPv6 packet both from the Protocol field (0x86dd is the Ethertype for IPv6), and from the first 4 bits of the frame payload.

The traffic class and flow label are both 0s here. The IPv6 payload length is just 8 bytes, so most of this is in fact IPv6 header data. Next header is type 0x3a (IPv6 ICMP) and the hop limit is 255. This is followed by the source address (my laptop’s link-local address fe80::5ebe:8941:7b19:d560) and the destination address (all link-local routers multicast address ff02::2).

The ICMPv6 message is the last 8 bytes; and in this case, its type is 0x85 (router solicitation), the code is 0x00, the two bytes after that are the checksum and the message (4 bytes) is all zeros.
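
For anyone wanting to pull those fields out in code rather than by eye, here is a minimal sketch of decoding the fixed 40-byte IPv6 header. The struct and function names are my own invention for illustration and are not part of packetdumper.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_HEADER_LEN 40

/* Fields of the fixed IPv6 header (hypothetical helper type) */
struct ipv6_summary {
    uint8_t        version;
    uint8_t        traffic_class;
    uint32_t       flow_label;
    uint16_t       payload_length;
    uint8_t        next_header;
    uint8_t        hop_limit;
    const uint8_t *src;      /* 16 bytes, points into the packet */
    const uint8_t *dst;      /* 16 bytes, points into the packet */
    const uint8_t *payload;  /* payload_length bytes */
};

/* Returns 0 on success, -1 if the buffer is not a complete IPv6 packet */
int ipv6_parse(const uint8_t *pkt, size_t len, struct ipv6_summary *out)
{
    if (len < IPV6_HEADER_LEN)
        return -1;

    out->version = pkt[0] >> 4;
    if (out->version != 6)
        return -1;

    /* Traffic class straddles bytes 0 and 1; flow label is the next 20 bits */
    out->traffic_class = (uint8_t)(((pkt[0] & 0x0f) << 4) | (pkt[1] >> 4));
    out->flow_label = ((uint32_t)(pkt[1] & 0x0f) << 16)
                    | ((uint32_t)pkt[2] << 8)
                    | pkt[3];
    out->payload_length = (uint16_t)((pkt[4] << 8) | pkt[5]);
    out->next_header = pkt[6];
    out->hop_limit = pkt[7];
    out->src = pkt + 8;
    out->dst = pkt + 24;
    out->payload = pkt + IPV6_HEADER_LEN;

    if ((size_t)out->payload_length + IPV6_HEADER_LEN > len)
        return -1;
    return 0;
}
```

Run against the 48 bytes dumped above, this yields a payload length of 8, next header 0x3a and hop limit 255, with the first payload byte being the ICMPv6 type 0x85.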

Quite how that address was chosen is something I’ll have to get to grips with. Yes, it’s SLAAC, but where did it get the hardware address from? That I’ll have to figure out.

The alternative is to use a TAP interface, which means I choose the MAC address, and thus can control what the SLAAC-derived address becomes. Ohh, and it goes without saying that the privacy extensions will be a big no no on the air: we’re relying on the fact that we can derive the IPv6 address from the SSID of the station both for technical reasons and to legally meet the requirements for stations to “identify” who they are and whom they are talking to. SLAAC privacy will make a mess of that.

So controlling this link-local address is a must. I guess next stop: let’s look at a tap device. I’ve just made some changes to explore the differences from the application end. There isn’t a lot of difference here.

RC=130 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ./packetdumper -tap
Password: 
^Z
[1]+  Stopped(SIGTSTP)        sudo ./packetdumper -tap
RC=148 stuartl@rikishi ~/projects/6lowham/packetdumper $ sudo ip link set tap0 up
RC=0 stuartl@rikishi ~/projects/6lowham/packetdumper $ fg
sudo ./packetdumper -tap
Flags: 0x0000  Protocol: 0x86dd
  90:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 33 33 00 00 00 16 ce 65 0c 34 48 34 86 dd 60 00
  16: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00
  32: 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 00 00
  48: 00 00 00 00 00 16 3a 00 05 02 00 00 01 00 8f 00
  64: 27 22 00 00 00 01 04 00 00 00 ff 02 00 00 00 00
  80: 00 00 00 00 00 01 ff 34 48 34
Flags: 0x0000  Protocol: 0x86dd
  86:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 33 33 ff 34 48 34 ce 65 0c 34 48 34 86 dd 60 00
  16: 00 00 00 20 3a ff 00 00 00 00 00 00 00 00 00 00
  32: 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 00 00
  48: 00 01 ff 34 48 34 87 00 af 03 00 00 00 00 fe 80
  64: 00 00 00 00 00 00 cc 65 0c ff fe 34 48 34 0e 01
  80: 61 78 48 c1 ac aa

The big difference is now we have an Ethernet header prepended. The proto field in the packet information now duplicates what we can see in the Ethernet frame header (bytes 12 and 13), and the IPv6 packet starts from byte 14.
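
In code, stepping over that header is trivial. The following is just a sketch with made-up names (eth_payload, ETH_HEADER_LEN), assuming an untagged Ethernet frame as seen in the dumps above, so no 802.1Q VLAN tag handling.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN 14   /* 6 bytes destination, 6 bytes source, 2 bytes Ethertype */

/*
 * Return a pointer to the payload of an untagged Ethernet frame and
 * store its Ethertype, or return NULL if the frame is too short.
 * (Hypothetical helper for illustration.)
 */
const uint8_t *eth_payload(const uint8_t *frame, size_t len, uint16_t *ethertype)
{
    if (len < ETH_HEADER_LEN)
        return NULL;

    /* Ethertype is big-endian in bytes 12 and 13 */
    *ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return frame + ETH_HEADER_LEN;
}
```

For both frames above, the Ethertype comes back as 0x86dd and the returned pointer lands on the 0x60 byte that begins the IPv6 header.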

I think this is the mode 6LoWHAM will use. It’s possible to set the MAC address on the created tap0 device to whatever 46 bits we like. Of the remaining two bits in the MAC address, one defines whether the address is global or local (we’ll set ours to “local”), and the other sets whether this is a multicast or unicast address. The SLAAC address will closely match this address with two differences:

  1. The MAC will have the bytes 0xff 0xfe inserted into the middle.
  2. The “global/local” bit is inverted. So for the 2001:db8::/64 prefix:
    • aa:bb:cc:dd:ee:ff becomes 2001:db8::a8bb:ccff:fedd:eeff
    • a8:bb:cc:dd:ee:ff becomes 2001:db8::aabb:ccff:fedd:eeff

That latter point had me confused at first, I thought it might’ve been that a bit got cleared, but instead it’s just inverted, so completely reversible.
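
The two steps above, and their reversal, can be sketched as follows. The function names mac_to_iid and iid_to_mac are mine, chosen for illustration.

```c
#include <stdint.h>

/* Derive the 64-bit SLAAC interface identifier from a 48-bit MAC */
void mac_to_iid(const uint8_t mac[6], uint8_t iid[8])
{
    iid[0] = mac[0] ^ 0x02;   /* invert the global/local bit */
    iid[1] = mac[1];
    iid[2] = mac[2];
    iid[3] = 0xff;            /* 0xff 0xfe inserted into the middle */
    iid[4] = 0xfe;
    iid[5] = mac[3];
    iid[6] = mac[4];
    iid[7] = mac[5];
}

/* Reverse the process: recover the MAC from the interface identifier.
 * Only meaningful if bytes 3 and 4 of the identifier are 0xff 0xfe. */
void iid_to_mac(const uint8_t iid[8], uint8_t mac[6])
{
    mac[0] = iid[0] ^ 0x02;   /* inverting again restores the original bit */
    mac[1] = iid[1];
    mac[2] = iid[2];
    mac[3] = iid[5];
    mac[4] = iid[6];
    mac[5] = iid[7];
}
```

As a cross-check against the tap0 dump above: the source MAC ce:65:0c:34:48:34 maps to cc65:0cff:fe34:4834, which is the interface identifier in the fe80:: address carried by the second packet.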

6LoWHAM: Route discovery thoughts

Thinking about the routing problem a little more… if I wanted to do a purely “native” routing scheme not involving Net/ROM routing update broadcasts, one has to wonder what such a system would look like.

Net/ROM L3 is really just intended to “bootstrap” things… there’s the prospect of using Net/ROM L4 for tunnelling TCP traffic, but really it’s the L3 part that interests me as a way of hopping between fragments of the mesh that may be linkable via a non-6LoWHAM capable digipeater.

Net/ROM’s periodic broadcasts are inefficient, and divulging a node’s entire routing table is not an ideal situation.  So what’s the alternative?  IPv6 nodes already send a “neighbour solicitation” (NS) packet when they don’t know the MAC address of a neighbour; this is a trigger for a “neighbour advertisement” (NA) response.

I’m thinking 6LoWHAM will send NAs periodically anyway.  ACMA rules require identifying every 10 minutes.  Since the NA will include the call-sign of the station (in bit-shifted ASCII), doing that every 10 minutes takes care of the ACMA requirement.  An IPv6 NA message is not a big payload.
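
For reference, “bit-shifted ASCII” here is the way AX.25 stores call-signs in its address fields: upper-case, padded to six characters with spaces, each byte shifted left by one bit. A minimal sketch of just that part follows; the function name is invented, the SSID byte is deliberately left out, and how the result would be carried inside the NA is not decided here.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode the call-sign portion of an AX.25 address: six characters,
 * upper-cased, space-padded, each shifted left by one bit.
 * Input longer than six characters is truncated.
 * (Hypothetical helper; the SSID byte is not handled.)
 */
void ax25_encode_callsign(const char *call, uint8_t out[6])
{
    size_t i = 0;

    for (; i < 6 && call[i] != '\0'; i++)
        out[i] = (uint8_t)(toupper((unsigned char)call[i]) << 1);

    /* Pad short call-signs with shifted spaces (0x40) */
    for (; i < 6; i++)
        out[i] = (uint8_t)(' ' << 1);
}
```

The placeholder call-sign N0CALL encodes to 9c 60 86 82 98 98.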

Given this will be sent to the ff02::1 multicast group, all nodes able to hear the beaconing station will receive it.  Unlike an IEEE 802.11 or 802.3 network though, not all nodes on the mesh will hear it.

The same is true of ND messages.  If the neighbour is in ear-shot and able to respond, it likely will, but that isn’t a guarantee.  Something in the link-local scope will likely be the answer, probably a daemon listening on a UDP port and sending to the ff02::1 group.

Unicast routing

When a station wishes to make contact with a station that’s not an immediate neighbour, I’m thinking of a broadcast similar to how APRS does things.  APRS uses special call-signs WIDEn-m, where the hop-limit is encoded in those messages.

A UDP message would be constructed asking “Who can reach X within N hops?” and sent to ff02::1 to some “well-known” port.

The first second is reserved for responses from nodes that know a route, either through Net/ROM, or maybe they’ve been in contact with that station before.  They respond something along the lines of “X via A,B,C, quality Q”, where A, B, C are digipeaters and Q is some link quality value.

Not sure how I’ll derive Q just yet.  Possibly based on packet loss… we’ll think of something.

If no responses are heard, the routers that heard the message re-broadcast it and listen for replies.  In the re-broadcast, each router appends its 48-bit 6LoWHAM address and a link quality to the message payload.  The hop limit would also get decremented.  That way, it can break cycles, and it gives a direct unicast path for the distant node to respond.

The same algorithm applies: wait a second for immediate responses, then any routers downstream append their addresses/link quality values, decrement the hop limit, and re-broadcast.

Again, any node that overhears the message (including the target node), may respond.  It does so via a direct unicast, sent using conventional AX.25 digipeating.  Any router en route that relays the message may also cache the result.  The “mesh” gets to learn of where everyone is as-required rather than by default with Net/ROM.

If the hop limit reaches zero, no further re-broadcasts are made, the message stops there.
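
To make the relay step a bit more concrete, here is a rough sketch of what a router might do on hearing one of these requests. Everything here is hypothetical: the message layout, the MAX_HOPS cap and the names are invented for illustration, and nothing is specified yet. It also assumes a router decrements the hop limit first and stays silent if that takes it to zero, which is one reading of the rule above.

```c
#include <stdint.h>
#include <string.h>

#define MAX_HOPS 8   /* assumed cap on how many routers a request records */

struct hop {
    uint8_t addr[6];   /* 48-bit 6LoWHAM address of the relaying router */
    uint8_t quality;   /* link quality as seen by that router */
};

struct route_request {
    uint8_t    target[6];   /* station being searched for */
    uint8_t    hop_limit;   /* remaining re-broadcasts permitted */
    uint8_t    hop_count;   /* entries used in hops[] */
    struct hop hops[MAX_HOPS];
};

/*
 * Decide whether to re-broadcast a request we overheard. Returns 1 and
 * appends our address if we should relay it, or 0 if it stops here.
 */
int route_request_relay(struct route_request *req,
                        const uint8_t self[6], uint8_t quality)
{
    /* Break cycles: never relay a request we have already relayed */
    for (uint8_t i = 0; i < req->hop_count; i++)
        if (memcmp(req->hops[i].addr, self, 6) == 0)
            return 0;

    if (req->hop_count >= MAX_HOPS)
        return 0;

    /* Decrement the hop limit; once it reaches zero the message stops */
    if (req->hop_limit > 0)
        req->hop_limit--;
    if (req->hop_limit == 0)
        return 0;

    memcpy(req->hops[req->hop_count].addr, self, 6);
    req->hops[req->hop_count].quality = quality;
    req->hop_count++;
    return 1;
}
```

The list of hops built up this way doubles as the return path: a node that knows the target can reverse it to get the digipeater list for its unicast reply.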

When the source node hears the replies, each reply resets a 100msec timer.  100msec after the last reply, it chooses three “best” routes, and sends an ICMPv6 neighbour solicitation (NS) message via each one to the target station.  The station replies to all three back via those routes with an ICMPv6 NA.  If a message is lost via one of those routes, that route is demoted in quality.

Once replies have arrived back at the source, it picks the best route based on the updated quality information, and begins communications via that route.

Multicast routing

This is more tricky.  I think the link-local should mean what it means on Thread… that is ff02::/16 just gets processed by immediate neighbours that are in direct RF range.

Realm-local (RFC-7346), ff03::/16 should be used for stuff that’s mesh-wide.  Those messages may be repeated by routers provided those routers have at least one subscriber for the given multicast group/port listening.

Multicast Listener Discovery looks to be the tool for that, although it could do with some 6LoWPAN-style optimisation.

I’m thinking the first time a router hears a datagram destined for a particular group, it should send a query out asking “who is listening” to the said group.

Following that first message, it should be up to the downstream node to inform the local routers that it intends to receive messages from a given group.  This should be periodic, maybe hourly, so that routers are not re-broadcasting messages for a node that has gone off-air.

Routers that have no listeners for a group do not rebroadcast that group’s traffic.  Similarly, if the hop limit has been exhausted, the messages do not get rebroadcast.
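
A router’s side of that could look something like the sketch below. The table layout, the sizes and the two-hour expiry (two missed hourly reports) are all assumptions made for the sake of the example. It tracks listeners per group rather than per node, so a report from any one listener keeps the group alive.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

#define MAX_GROUPS        16              /* assumed table size */
#define LISTENER_LIFETIME (2 * 60 * 60)   /* assumed: two missed hourly reports */

struct listener {
    uint8_t group[16];     /* IPv6 multicast group address */
    time_t  last_report;   /* when a listener last announced itself */
    int     in_use;
};

struct listener_table {
    struct listener slot[MAX_GROUPS];
};

/* Record that some downstream node wants this group. Returns -1 if full. */
int listener_report(struct listener_table *t, const uint8_t group[16], time_t now)
{
    struct listener *free_slot = NULL;

    for (int i = 0; i < MAX_GROUPS; i++) {
        struct listener *l = &t->slot[i];
        if (l->in_use && memcmp(l->group, group, 16) == 0) {
            l->last_report = now;   /* refresh an existing entry */
            return 0;
        }
        if (!l->in_use && free_slot == NULL)
            free_slot = l;
    }
    if (free_slot == NULL)
        return -1;

    memcpy(free_slot->group, group, 16);
    free_slot->last_report = now;
    free_slot->in_use = 1;
    return 0;
}

/* Rebroadcast only if hops remain and a listener reported recently */
int should_rebroadcast(const struct listener_table *t, const uint8_t group[16],
                       uint8_t hop_limit, time_t now)
{
    if (hop_limit == 0)
        return 0;

    for (int i = 0; i < MAX_GROUPS; i++) {
        const struct listener *l = &t->slot[i];
        if (l->in_use && memcmp(l->group, group, 16) == 0)
            return (now - l->last_report) <= LISTENER_LIFETIME;
    }
    return 0;
}
```

A zero-filled listener_table is an empty table, so memset() is enough to initialise it.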

6LoWHAM: Digging into LinBPQ

So today I was meant to be helping re-build a deck, but that got postponed to next weekend.  Thus, I had an extra free day I wasn’t counting on.

I wound up looking at LinBPQ in detail, to see if I can get it to run.  I downloaded the sources, and sure enough, they do compile on my x86-64 laptop, but does it work?  Not a chance.  Starts parsing the configuration file, then boompa, SEGFAULT.

I run the binary through gdb, and see this:

GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/stuartl/projects/6lowham/linbpq/linbpq...done.
(gdb) r
Starting program: /home/stuartl/projects/6lowham/linbpq/linbpq 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
G8BPQ AX25 Packet Switch System Version 6.0.17.1 November 2018
Copyright © 2001-2018 John Wiseman G8BPQ
Current Directory is /var/lib/linbpq

Configuration file Preprocessor.
Using Configuration file /var/lib/linbpq/bpq32.cfg
Conversion (probably) successful


Program received signal SIGSEGV, Segmentation fault.
0x00005555555f8a7b in Start () at cMain.c:1190
1190                    *(ptr3++) = *(ptr2++);
(gdb) bt full
#0  0x00005555555f8a7b in Start () at cMain.c:1190
        cfg = 0x555555b91c40
        ptr1 = 0x555555ba60c0
        PORT = 0x5555558f6aa0 
        FULLPORT = 0x558f7928
        NEXTPORT = 0x5555558f6de0 <DATAAREA+832>
        EXTPORT = 0x7ffff6eb7953 <_IO_file_overflow+291>
        APPL = 0x5555558f49e0 
        ROUTE = 0x559085e8
        DEST = 0x870b07e2ddd5f300
        CMD = 0x5555558d79e0 
        PortSlot = 2
        ptr2 = 0x555555ba6849 "K4MSL Test station \r"
        ptr3 = 0x55912549 
        ptr4 = 0x5555558d7183 <COMMANDS+1667> "         \003"
        CWPTR = 0x5555558f6b18 <DATAAREA+120>
        i = 0
        n = 119
        int3 = 1435466024
#1  0x000055555563e35c in main (argc=1, argv=0x7fffffffe518) at LinBPQ.c:598
        i = 1
        user = 0x0
        conn = 0x7ffff7ffa298
        STAT = {st_dev = 140737354131120, st_ino = 140737488347784, st_nlink = 140737488347780, st_mode = 4160741648, 
          st_uid = 32767, st_gid = 4143745959, __pad0 = 32767, st_rdev = 140737488348192, st_size = 140737488347784, 
          st_blksize = 1700966438, st_blocks = 26577600, st_atim = {tv_sec = 140737354113688, tv_nsec = 140737488348000}, 
          st_mtim = {tv_sec = 140737354113448, tv_nsec = 140737488347780}, st_ctim = {tv_sec = 140737488347984, 
            tv_nsec = 140737354131160}, __glibc_reserved = {1, 4150715120, 0}}
        PORTVEC = 0x7ffff7ffe6b0

Ookay then… so invalid pointers, what fun!  More to the point, have a close look at the addresses held in FULLPORT, ROUTE and ptr3: unlike their neighbours, each has been chopped down to 32 bits…  I’m beginning to understand why it was called BPQ32.

The culprit for this wound up being little gems like this:

			//	Round to word boundary (for ARM5 etc)

			int3 = (int)ptr3;
			int3 += 3;
			int3 &= 0xfffffffc;
			ptr3 = (UCHAR *)int3;

There were a few other instances of this, and variations on the theme too, but one way or the other, linbpq basically assumes that all pointers are 32-bits, and so are ints.
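
For comparison, the same rounding can be done without assuming anything about pointer width by going through uintptr_t from <stdint.h>. This is a sketch of the general technique rather than the exact change I made; align4 is a name invented here, and UCHAR is assumed to be a plain unsigned char as the cast in the snippet above suggests.

```c
#include <stdint.h>

typedef unsigned char UCHAR;   /* assumed to match LinBPQ's typedef */

/*
 * Round a pointer up to the next 4-byte boundary.
 * uintptr_t is wide enough to hold a pointer on the platform being
 * compiled for, so nothing gets truncated on 64-bit targets.
 */
UCHAR *align4(UCHAR *ptr)
{
    uintptr_t addr = (uintptr_t)ptr;

    addr = (addr + 3) & ~(uintptr_t)3;
    return (UCHAR *)addr;
}
```

The mask is built from a uintptr_t as well; a literal 0xfffffffc would clear the upper 32 bits of the address on a 64-bit machine and land us right back where we started.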

Four hours later, I finally had something that started, but there are probably lots of landmines for anyone running the binary to inadvertently stomp on.  The code is pointer-arithmetic city!  Much of the time, code is casting pointers to unsigned int, or back again.  If I submitted code like that at work, they’d have me hauled ’round the back of the building and shot!

I’m left wondering if it’s worth getting to understand, or should I shove it in a VM, write some code based on my understanding of the protocols, do some integration testing with it, then abandon LinBPQ for something I can have confidence in.

The use and re-use of certain variables makes me wonder if the code is actually a port from the DOS-based BPQCode which was likely written in 8086 assembler.  This would make a lot of sense as to why I’m seeing the sorts of software coding patterns I’m seeing in that code.  The logic seems to have been ported to C just enough to get it to compile and work like the assembly version.

Reasonable enough… but there’s a lot of technical debt there still waiting to be paid back.  On paper, there’s a lot of benefit in using LinBPQ as the back-end, and I am thankful that John Wiseman made the decision to release the code under the GPLv3 so that I can at least investigate the possibility of using that code here.

I’ve thrown what I’ve got up on Github for now, and there’s a Gentoo overlay for installing it.  Add the overlay and run emerge linbpq, and you should find yourself with an installation of LinBPQ that just needs some OpenRC scripts and some work with an editor on /var/lib/linbpq/bpq32.cfg to get going.

If I get further on the code front, I might look at some init scripts, both OpenRC and systemd ones, then I can produce a few Debian binaries so you can run apt-get install linbpq on your Raspberry Pi and have a packet station going quickly.