Projects

Solar Cluster: Bootstrapping Gentoo

Well, in my last post I discussed getting OpenADK to build a dev environment on the TS-7670.  I had gotten Gentoo’s Portage installed, and started building packages.

The original plan was to build everything into /tmp/seed, but that requires that all the dependencies are present in the chroot.  They aren’t.  In the end, I decided to go the ill-advised route of compiling Gentoo over the top of OpenADK.

This is an ugly way to do things, but it so far is bearing fruit.  Initially there were some hiccups, and I had to restore some binaries from my OpenADK build tree.  When Gentoo installed python-exec; that broke Portage and I found I had to unpack a Python 2.7 binary I had built earlier then use that to re-install Portage.  I could then continue.

Right now, it’s grinding away at gcc; which was my nemesis from the beginning.  This time though, it successfully built xgcc and xg++; which means it has compiled itself using the OpenADK-supplied gcc; and now is building itself using its self-built binaries.  I think it does two or three passes at this.

If it gets through this, there’s about 65 packages to go after that.  Mostly small ones.  I should be able to do a ROOT=/tmp/seed emerge -ek @system then tar up /tmp/seed and emerge catalyst.  I have some wrapper scripts around Catalyst that I developed back when I was responsible for doing the MIPS stages.  These have been tweaked to do musl builds, and were used to produce these x86 stages.  The same will work for ARMv5.

It might be another week of grinding away, but we should get there. 🙂

Solar Cluster: arm-unknown-linux-musleabi… saga part III

So, after a longish wait… my laptop finally coughed up an image with a C/C++ compiler and almost all the bits necessary to make Gentoo Portage tick.

Almost everything… wget built, but it segfaults on start-up.  No matter, it seems curl works.  We do have an issue though: Portage no longer supports customising the downloader like it used to, or at least I couldn’t see how to do it, it used to be settings in make.conf.

Thankfully, I know shell scripts, and can make my own wget using the working curl:

bash-4.4# cat > /usr/bin/wget
#!/bin/bash

OUT=
while [ $# -gt 0 ]; do
    case "$1" in
        -O) OUT="$2"; shift;;
        -t) shift;;
        -T) shift;;
        --passive-ftp) : ;;
        *) break ;;
    esac
    shift
done

set -ex
curl --progress-bar -o "${OUT}" "$1"

Okay, it’s a little (a lot) braindead, but it beats downloading the lot by hand!

I was able to get Gentoo installed by hand using these instructions.  I have an old 1TB HDD plugged into a USB dock, formatted with a 10GB swap partition and the rest btrfs.  Sure, it’s only USB 2.0, but I’d sooner just put up with some CPU overhead than wear out my eMMC.

Next step; ROOT=/tmp/seed emerge -ev system

Solar Cluster: arm-unknown-linux-musleabi… a saga

So, fun and games with the TS-7670.

At present, I have it up and running:

root@ts7670:~# uname -a
Linux ts7670 4.14.15-vrt-ts7670-00031-g1a006273f907-dirty #2 Sun Jan 28 20:21:08 EST 2018 armv5tejl GNU/Linux

That’s booted up into Debian Stretch right now.  debootstrap did its deed a few days ago on the eMMC, and I was able to boot up this new image.  Today I built a new kernel, and tweaked U-Boot to boot from eMMC.

Thus now the unit can boot without any MicroSD cards fitted.

There’s a lot of bit rot to address.  U-Boot was forked from some time in 2014.  I had a crack at rebasing the code onto current U-Boot, but there’s a lot of clean-up work to do just to get it to compile.  Even the kernel needed some fixes to get the newer devicetree sources to build.

As for getting Gentoo working… I have a cross-compiling toolchain that works.  With it, I’ve been able to compile about 99% of a seed stage needed for catalyst.  The 1% that eludes me, is GCC (compiled to run on ARMv5).  GCC 4.9.4 will try to build, but fails near the end… anything newer will barf complaining that my C++ compiler is not working.  Utter bollocks, both AMD64 and ARM toolchains have working C++ compilers, just it’s looking for a binary called “g++” rather than being specific about which one.  I suspect it wants the AMD64 g++, but then if I symlink that to /usr/bin/g++, it throws in ARM CFLAGS, and AMD64 g++ barfs on those.

I’ve explored other options.  I can compile GCC by hand without C++ support, and this works, but you can’t build modern GCC without a C++ compiler … and people wonder why I don’t like C++ on embedded!

buildroot was my next thought, but as it happens, they’ve stripped out the ability to compile a native GCC on the target.

crosstool-ng is the next logical choice, but I’ll have to fiddle with settings to get the compiler to build.

I’ve also had OpenADK suggested, which may be worth a look.  Other options are OpenEmbedded/Yocto, and Cross Linux from Scratch.  I think for the latter, cross is what I’ll get, this stuff can be infuriatingly difficult.

Solar Cluster: TS-7670 showed up today

So, I now have my little battery monitoring computer.  Shipping wound up being a little more than I was expecting… about US$80… but never mind.  It’s here, arrived safely:

HTLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLFLC
>> TS-BOOTROM - built Jan 26 2017 12:29:21
>> Copyright (c) 2013, Technologic Systems
LLCLLLLLLLFLCLLJUncompressing Linux... done, booting the kernel.
/ts/fastboot file present.  Booting to initramfs instead
Booted from eMMC in 3.15s
Initramfs Web Interface: http://ts7670-498476.local
Total RAM: 128MB
# exit
INIT: version 2.88 booting
[info] Using makefile-style concurrent boot in runlevel S.
[ ok ] Starting the hotplug events dispatcher: udevd.
[ ok ] Synthesizing the initial hotplug events...done.
[ ok ] Waiting for /dev to be fully populated...done.
[ ok ] Activating swap...done.
[....] Checking root file system...fsck from util-linux 2.20.1
e2fsck 1.42.5 (29-Jul-2012)
/dev/mmcblk2p2: clean, 48540/117600 files, 282972/469760 blocks
done.
[ ok ] Cleaning up temporary files... /tmp /lib/init/rw.
…
ts7670-498476 login: root
Linux ts7670-498476 2.6.35.3-571-gcca29a0+ #1 PREEMPT Mon Nov 27 11:05:10 PST 2017 armv5tejl
TS Root Image 2017-11-27

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

root@ts7670-498476:~# 

The on-board 2GB eMMC has a version of Debian Wheezy on it.  That’ll be going very soon.  For now, all I’ve done is pop the cover, shove a 8GB MicroSD card into one of the on-board slots, wired up a 12V power brick temporarily to the unit, hooked a USB cable into the console port (/dev/ttyAMA0 is wired up to an on-board CP2103 USB-serial chip) and verified that it is alive.

Next step will be to bootstrap Gentoo.  I could use standard ARMv5 stages, or I can build my own, which I might do.  I’ve done this before for mips64el n64 using glibc.  Modern glibc is a goliath on a machine with 128MB RAM though, so I’ll be looking at either µClibc/µClibc-ng or musl… most likely the latter.

That said, 20 years ago, we had the same computing power in a desktop. 🙂

I have a few options for interfacing to the power meters…

  • I²C, SPI, a number of GPIOs and a spare UART on a 2.54mm header inside the case.
  • Another spare UART on the footprint for the GPS module (which my unit does not have)
  • Two RS-232 serial ports with RTS/CTS control lines, exposed via RJ-45 jacks
  • Two CANbus ports on a single RJ-45 jack
  • RS-485 on a port marked “Modbus”

In theory, I could just skip the LPC810s and hook this up directly to the INA219Bs.  I’d have to double check what the TTL voltage is… Freescale love their 1.8V logic… but shifting that up to 3.3V or 5V is not hard.  The run is a little longer than I’m comfortable running I²C though.

The LPC810s don’t feature CANbus, so I think my original plan of doing Modbus is going to be the winner.  I can either do a single-ended UART using a resistor/diode in parallel to link RX and TX to the one UART line, or use RS-485.

I’m leaning towards the latter, if I decide to buy a little mains energy meter to monitor power, I can use the same RS-485 link to poll that.  I have some RS-485 transceivers coming for that.

For now though, I’ll at least get Debian Stretch going… this should not be difficult, as I’ll just use the images I’ve built for work to get things going.  I’m downloading a Jessie image now:

root@ts7670-498476:~# curl https://bne.vrt.com.au/technologicsys/ts7670d-jessie-4.4.1-20160226.dd.xz | xzcat | dd of=/dev/mmcblk0 
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  113M    0  544k    0     0   114k      0  0:16:48  0:00:04  0:16:44  116k

Once that is done, I can reboot, re-format the eMMC and get debootstrap going.  I might even publish an updated image while I’m at it.

Solar Cluster: arm-unknown-linux-musleabi… saga part II

So, last time I was trying to get Gentoo’s portage to cross-build gcc so that I’d have a C/C++ compiler in my ARMv5 musl environment.

It is literally the last piece of the puzzle.  Once compiled, that is the last step I need before I can throw the shiny new environment onto an ARMv5 VM (or real ARMv5 CPU), do an emerge -e world on it then tar the lot up and throw it at Catalyst.

Building an entire OS on a 454MHz ARMv5 machine with 128MB RAM does not faze me one bit… I used to do it regularly on a (Gateway-branded) Cobalt Qube II server appliance, which sports a 250MHz QED RM5231 and 128MB RAM.  The other compile workhorse I used in those days was an SGI O2; 300MHz RM5200, again 128MB RAM.

Yes, Linux and its userland has bulked up a bit in the last 10 years, but not so much so that a build on these is impossible.

Certainly, native building is easier than cross-compiling.  Cross-compilers have always been a voodoo art for me.  Getting one that will build a Linux kernel or U-Boot, usually isn’t too hard… but get userland involved and it gets complex.  Throw in C++ and complexity skyrockets!

I’m taking OpenADK for a spin now, and in concept, it’s exactly what I remember buildroot used to be.  It’s a tool for generating a fully fledged embedded Linux system with a wide package selection including development tools.  I also find that you have to hold your tongue just right to get stuff to compile.

Selecting a generic arm926ej-s; it succeeded to build a x86-64 hosted cross-toolchain once, but then silently refused to build anything else.  I told it instead to build for a Versatile PB with an arm926ej-s CPU… it failed to build the cross-toolchain, even though it pretty much is the exact same target.

A make cleandirs later, and it happily started building everything, but then hiccupped on permissions, so against my better judgement, I’m running it now with sudo, and things are progressing.  With some luck, I should have something that will give me a working native gcc/g++ for musl on ARMv5.

Toy Synthesizer: Second revision I/O board nearly complete

So last week, some more parts arrived for this project.  Crucially, some capacitors, and some MOSFETs.

Turns out, the MOSFETs I got last time were not I²PAK … I thought that’s what I ordered last time, but clearly not, because that’s what these are, and they’re bigger than the previous ones.

No matter, lean them back like dominoes and they still fit.  I’ve got a tube of 50 of them.

I was able to put the pull-down resistor in, and I basically fitted the MOSFETs along a track, scoring the track between the legs so they didn’t short out.  For the 12ohm resistor to the drain, I’m doing half-veroboard-PCB, half-point-to-point construction to link the drain pin (annoyingly in the centre) to the outside world.

I’ll need to rustle up a 2-pin KK to act as the power input, and that board is done.  I might add two in parallel on here so a short lead can link from here to the mainboard to supply +12V.  This will go on the right-hand side (lower photo) just past where that jumper connects.

After that, the next step is wiring up the buttons and switches.  The use of 4-pin connectors should greatly simplify the wiring since everything we need is on the one connector.

Solar Cluster: Battery monitor computer ordered

I’ve taken the plunge and gotten a TS-7670 ordered in a DIN-rail mount for monitoring the battery.  Not sure what the shipping will be from Arizona to here, but I somehow doubt I’m up for more than AU$300 for this thing.  The unit itself will cost AU$250.

Some will argue that a Raspberry Pi or BeagleBone would be cheaper, and that would be correct, however by the time you’ve added a DIN-rail mount case, an RS-485 control board and a 12V to 5V step-down power converter, you’d be around that figure anyway.  Plus, the Raspberry Pi doesn’t give you schematics.  The BeagleBone does, but is also a more sophisticated beast.

The plan is I’ll spin a version of Gentoo Linux on it… possibly using the musl C library to keep memory usage down as I’ve gone the base model with 128MB RAM.  I’ll re-spin the kernel and U-Boot patches I have for the latest release.

There will be two functions looked after:

  • Access to the IPMI/L2 management network
  • Polling of the two DC power meters (still to be fully designed) via Modbus

It can report to a VM running on one of the hosts.  I believe collectd has the necessary bits and pieces to do this.  Failing that, I’ve written code before that polls Modbus… I write such code for a day job.

Toy Synthesizer: I/O module, MkII

So some spare time today… I decide to construct a new I/O module to fix up the mistakes made with the previous iteration.  Mainly:

  • the TVS diodes… going for one with a higher clamp voltage so it doesn’t smoke when 12V is applied
  • switching to a 4-pin connector on the output side, with pins for 0V, GPIO, DRAIN and +12V
  • fixing the pin-out on the input side so it matches the PCB.
  • rather than having jumper leads to make the boards separable, we’ll make one monolithic board that plugs into all 8 channels simultaneously with one long connector.

For the TVS diodes, I ordered some TPD2E007 in SOT23… thinking those would be a reasonable size for hand-soldering.

00

Now… how the bloody hell am I going to solder these little tiddlers?  I had thought SOT-23 was about twice that size.  Never mind, can’t un-buy them.

The circuit is pretty much identical to what came before.  My MOSFETs and 4.7nF capacitors seem to have gone walkies, not sure where.  No doubt the arrival of replacements will summon them back.  I decided to use SMT for many parts on this build.

0805 resistors and veroboard aren’t a bad combo really, just have a sharp blade handy to cut the track where needed, and the resistor can straddle the gap made.

For the TVS diodes, the common pin is to ground, so I made a bus bar running vertically down the PCB and scoring the tracks either side.  The common pins could be soldered to that, and the two I/O pins would straddle the division between each track.  Aside from me getting some parts off-by-one at first, this went well.

The zener and schottky diodes of course, being through-hole, went on the other side of the PCB.

I still have to locate where my MOSFETs have gone, and I think I found some 12 ohm resistors (through-hole).  I can use some 0805 1k resistors for the MOSFET gates.  So that’s some MOSFETs and 4.7nF, probably 0805 size capacitors that need ordering in the new year.

Solar Cluster: 2 days and counting on solar…

So, I’m home now for the Christmas break… and the fan in my power supply decided it would take a Christmas break itself.

The power supply was purchased brand new in June… it still works as a power supply, but with the fan seized up, it represents an overheating risk.  Unfortunately, the only real options I have are the Xantrex charger, which cooked my last batteries, or a 12V 20A linear PSU I normally use for my radio station.  20A is just a touch light-on, given the DC-DC converter draws 25A.  It’ll be fine to provide a top-up, but I wouldn’t want to use it for charging up flat batteries.

Now, I can replace the faulty fan.  However, that PSU is under warranty still, so I figure, back it goes!

In the meantime, an experiment.  What happens if I just turn the mains off and rely on the batteries?  Well, so far, so good.  Saturday afternoon, the batteries were fully charged, I unplugged the mains supply.  Battery voltage around 13.8V.

Sunday morning, battery was down to 12.1V, with about 1A coming in off the panels around 7AM (so 6A being drained from batteries by the cluster).

By 10AM, the solar panels were in full swing, and a good 15A was being pumped in, with the cluster drawing no more than 8A.  The batteries finished the day around 13.1V.

This morning, batteries were slightly lower at 11.9V.   Just checking now, I’m seeing over 16A flowing in from the panels, and the battery is at 13.2V.

I’m in the process of building some power meters based on NXP LPC810s and TI INA219Bs.  I’m at two minds what to use to poll them, whether I use a Raspberry Pi I have spare and buy a case, PSU and some sort of serial interface for it… or whether I purchase a small industrial PC for the job.

The Technologic Systems TS-7670 is one that I am considering, given they’ll work over a wide range of voltages and temperatures, they have plenty of UARTs including RS-485 and RS-232, and while they ship with an old Linux kernel, yours truly has ported both U-Boot and the mainline Linux kernel.  Yes, it’s ARMv5, but it doesn’t need to be a speed demon to capture lots of data, and they work just fine for Barangaroo where they poll Modbus (via pymodbus) and M-bus (via python-mbus).

Solar Cluster: HA VM experiment using plain libvirt: no go

So, I have two compute nodes.  I’ll soon have 32GB RAM in each one, currently one has 32GB and the other has its original 8GB… with 5 8GB modules on the way.

I’ve tested these, and they work fine in the nodes I have, they’ll even work along side the Kingston modules I already have, so one storage node will have a mixture.  That RAM is expected to arrive on Monday.

Now, it’d be nice to have HA set up so that I can power down the still-to-be-upgraded compute node, and have everything automatically fire up on the other compute node.  OpenNebula supports this. BUT I have two instances that are being managed outside of OpenNebula that I need to handle: one being the core router, the other being OpenNebula itself.

My plan was to use corosync.  I have an identical libvirt config for both VMs, allowing me to move the VMs manually between the hosts.  VM Disk storage is using RBDs on Ceph.  Thus, HA by default.

As an experiment, I thought, what would happen if I fired up two instances of the VM that pointed to the same RBD image?  I was expecting one of two things to happen:

  • The image would be locked by the first started image, locking out the second.  One instance would boot, the other would fail to boot.
  • Both instances would boot… the split-brain scenario.

So, I created a libvirt domain on one node, slapped Ubuntu on there (I just wanted a basic OS for testing, so command line, nothing fancy).  As that was installing, I dumped out the “XML config” and imported that to the second node, but didn’t start it yet.

Once I had the new VM booted on node 1, I booted it on node 2.

To my horror, it started booting, and booted straight to a log-in prompt. Great, I had manually re-created the split-brain scenario I specifically hoped to avoid.  Thankfully, it is a throw-away VM specifically for testing this behaviour.  To be sure, I logged in on both, then hard-resetted one.  It boots to GRUB, then immediately GRUB goes into panic mode.  I hard reset the other VM, it boots past GRUB, but then systemd goes into panic mode.  This is expected: the two VMs are stomping on each others’ data oblivious to each others’ existence, a recipe for disaster.

So for this to work, I’m going to have to work on my fencing.  I need to ensure beyond all possible doubt, that the VM is running in one place and one place ONLY.

libvirt supports VM hooks to do this, and there’s an example here, however this thread seems to suggest this is not a reliable way of doing things.  RBD locking is what I hoped libvirt would do implicitly, but it seems not, and it appears that the locks are not removed when a client dies, which could lead to other problems.

A distributed lock manager would handle this, and this is something I need to research.  Possibilities include HashiCorp Consul, Apache ZooKeeper, CoreOS etcd and Redis, among others.  I can also try to come up with my own, perhaps built on PAXOS or Raft.

The state needs to only be kept in memory, persistence on disk is not required.  It’s safe to assume that if the cluster doesn’t know about a VM, it isn’t running anywhere else.  Once told of that VMs existence though, it should ensure only one instance runs at a time.

If a node loses contact with the remaining group, it should terminate everything it has, as it’s a fair bet, the others have noticed its absence and have re-started those instances already.

There’s lots to think about here, so I’ll leave this post at this point and ponder this some more.