solar-cluster

Solar Cluster: Adding Solar

So we’ve got a free weekend where there’ll be two of us to do a solar installation… thus the parts have now been ordered for that installation.

First priority will be to get the panels onto the roof and bring the feed back to where the cluster lives.  The power will come from 3 12V 120W solar panels that will be mounted on the roof over the back deck.  Theoretically these can push about 7A of current with a voltage of 17.6V.

We’ve got similar panels to these on the roof of a caravan, those ones give us about 6A of current when there’s bright sunlight.  The cluster when going flat-chat needs about 10A to run, so with three panels in broad daylight, we should be able to run the cluster and provide about 8A to top batteries up with.
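As a rough sanity check on those figures, here is a minimal sketch of the daylight current budget. It only uses the numbers quoted above (three panels, about 6A each as seen on the caravan, about 10A for the cluster) and deliberately ignores charger losses and MPPT behaviour, so treat the result as a ballpark rather than a prediction.

```python
# Rough daylight current budget using the figures quoted in the text.
PANELS = 3
AMPS_PER_PANEL = 6.0       # observed on the similar caravan panels in bright sun
CLUSTER_LOAD_AMPS = 10.0   # cluster running flat-chat


def surplus_amps(panels: int, amps_per_panel: float, load_amps: float) -> float:
    """Current left over for battery charging once the load is supplied."""
    return panels * amps_per_panel - load_amps


print(surplus_amps(PANELS, AMPS_PER_PANEL, CLUSTER_LOAD_AMPS))  # 8.0
```

Plugging in the theoretical 7A per panel instead gives 11A of surplus, so the 8A figure is the conservative one.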

We’ll be running individual feeds of 8-gauge DC cable from each panel down to a fused junction box under the roof on the back deck.  From there, it’ll be 6-gauge DC cable down to the cluster’s charge controller.

Now, we have a relay that switches between mains-sourced DC and the solar, and right now it’s hard-wired to be on when the mains supply is switched on.

I’m thinking that the simplest solution for now will be to use a comparator with some hysteresis.  That is, an analogue circuit.  When the solar voltage is greater than the switchmode DC power supply, we use solar.  We’ll need the hysteresis to ensure the relay doesn’t chatter when the solar voltage gets near the threshold.
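To illustrate why the hysteresis matters, here is a small Python model of the switching decision. It is only a sketch of the idea, not the analogue circuit itself: the 14.0V supply voltage, the 1.0V hysteresis band and the sample voltages are all made-up values chosen for the illustration.

```python
from typing import Iterable, List


def select_source(solar_volts: Iterable[float], psu_volts: float,
                  hysteresis: float, on_solar: bool = False) -> List[bool]:
    """For each sample, report whether solar is selected.

    Switch to solar above psu_volts + hysteresis/2, switch back to mains
    below psu_volts - hysteresis/2, and hold the last state in between.
    """
    upper = psu_volts + hysteresis / 2
    lower = psu_volts - hysteresis / 2
    states = []
    for volts in solar_volts:
        if volts > upper:
            on_solar = True
        elif volts < lower:
            on_solar = False
        states.append(on_solar)
    return states


def count_switches(states: List[bool], initial: bool = False) -> int:
    """Count how many times the relay would change over."""
    switches = 0
    previous = initial
    for state in states:
        if state != previous:
            switches += 1
        previous = state
    return switches


# Hypothetical solar voltage hovering around the threshold (cloud passing over).
SAMPLES = [13.0, 13.9, 14.1, 13.9, 14.1, 13.9, 15.0, 14.2, 13.9, 13.0]

print(count_switches(select_source(SAMPLES, psu_volts=14.0, hysteresis=0.0)))  # 6
print(count_switches(select_source(SAMPLES, psu_volts=14.0, hysteresis=1.0)))  # 2
```

With no hysteresis the relay changes over six times on this data; with a 1V band it changes over twice, once onto solar and once back to mains.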

The other factor here is that the solar voltage may get as high as 22V or so, thus resistor dividers will be needed both sides to ensure the inputs to the comparator are within safe limits.

The current consumption of this will be minimal, so a LM7809 will probably do the trick for DC power regulation to power the LM311.  If I divide all inputs by 3, 22V becomes ~7.3V, giving us plenty of head room.
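The divide-by-3 arithmetic can be checked with the usual resistor divider formula. The 20k/10k values below are an assumption for the sake of the example (any 2:1 ratio divides by 3), and the sketch only compares the result against the 9V rail. The comparator's permitted input range is narrower than the rail, so the datasheet figure should be checked before trusting the head room.

```python
def divider_output(v_in: float, r_top: float, r_bottom: float) -> float:
    """Voltage at the junction of a two-resistor divider."""
    return v_in * r_bottom / (r_top + r_bottom)


R_TOP = 20_000.0      # ohms, hypothetical value
R_BOTTOM = 10_000.0   # ohms, hypothetical value (2:1 ratio divides by 3)
SUPPLY_RAIL = 9.0     # volts, from the LM7809

for v_in in (12.0, 17.6, 22.0):
    v_out = divider_output(v_in, R_TOP, R_BOTTOM)
    print(f"{v_in:5.1f} V in -> {v_out:.2f} V at the comparator "
          f"({SUPPLY_RAIL - v_out:.2f} V below the rail)")
```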

I can then use the built-in NPN to drive a P-channel MOSFET that controls the relay.  The relay would connect between MOSFET drain and 0V, with the MOSFET source connecting to the switchmode PSU (this is where the relay connects now).

The solar controller also connects its control line to the MOSFET drain.  To it, the MOSFET represents the ignition switch on a vehicle: starting the engine would connect 12V to the relay and the solar controller control input, connecting the controller’s DC input to the vehicle battery and telling the controller to boost this voltage up for battery charging purposes.

By hooking it up in this manner, and tuning the hysteresis on the comparator, we should be able to handle automatic switch-over between mains power and solar with the minimum of components.

Solar Cluster: OpenNebula, DNS shenanigans and network documentation

OpenNebula is running now… I ended up re-loading my VM with Ubuntu Linux and throwing OpenNebula on that.  That works… and I can debug the issue with Gentoo later.

I still have to figure out corosync/heartbeat for two VMs, the one running OpenNebula, and the core router.  For now, the VMs are only set up to run on one node, but I can configure them on the other too… it’s then a matter of configuring libvirt to not start the instances at boot, and setting up the Linux-HA tools to figure out which node gets to fire up which VM.

The VM hosts are still running Gentoo however, and so far I’ve managed to get them to behave with OpenNebula.  A big part was disabling the authentication in libvirt, otherwise polkit generally made a mess of things from OpenNebula’s point of view.

That, and firewalld had to be told to open up ports for VNC/spice… I allocated 5900-6900… I doubt I’ll have that many VMs.

Last weekend I replaced the border router… previously this was a function of my aging web server, but now I have an ex-RAAF-base Advantech UNO-1150G industrial PC which is performing the routing function.  I tried to set it up with Gentoo, and while it worked, I found it wasn’t particularly stable due to limited memory (it only has 256MB RAM).  In the end, I managed to get OpenBSD 6.1/i386 running sweetly, so for now, it’s staying that way.

While the AMD Geode LX800 is no speed demon, a nice feature of this machine is it’s happy with any voltage between 9 and 32V.

The border router was also given the responsibility of managing the domain: I did this by installing ISC BIND9 from ports and copying across the config from Linux.  This seemed to be working, and so I left it.  Big mistake: it turns out BIND9 didn’t think it was authoritative, and so refused to handle AXFRs with my slaves.

I was using two different slave DNS providers, puck.nether.net and Roller Network, both at the time of subscription being freebies.  Turns out, when your DNS goes offline, puck.nether.net responds by disabling your domain then emailing you about it.  I received that email Friday morning… and so I wound up in a mad rush trying to figure out why BIND9 didn’t consider itself authoritative.

Since I was in a rush, I decided to tell the border router to just port-forward to the old server, which got things going until I could look into it properly.  It took a bit of tinkering with pf.conf, but eventually got that going, and the crisis was averted.  Re-enabling the domains on puck.nether.net worked, and they stayed enabled.

It was at that time I discovered that Roller Network had decided to make their slave DNS a paid offering.  Fair enough, these things do cost money… At first I thought, well, I’ll just pay for an account with them, until I realised their personal plans were US$5/month.  My workplace uses Vultr for hosting instances of their WideSky platform for customers… and aside from the odd hiccup, they’ve been fine.  US$5/month VPS which can run almost anything trumps US$5/month that only does secondary DNS, so out came the debit card for a new instance in their Sydney data centre.

Later I might use it to act as a caching front-end and as a secondary mail exchanger… but for now, it’s a DIY secondary DNS.  I used their ISO library to install an OpenBSD 6.1 server, and managed to nut out nsd to act as a secondary name server.

Getting that going this morning, I was able to figure out my DNS woes on the border router and got that running, so after removing the port forward entries, I was able to trigger my secondary DNS at Vultr to re-transfer the domain and debug it until I got it working.

With most of the physical stuff worked out, it was time to turn my attention to getting virtual instances working.  Up until now, everything running on the VM hosts was through hand-crafted VMs using libvirt directly.  This is painful and tedious… but for whatever reason, OpenNebula was not successfully deploying VMs.  It’d get part way, then barf trying to set up 802.1Q network interfaces.

In the end, I knew OpenNebula worked fine with bridges that were already defined… but I didn’t want to have to hand-configure each VLAN… so I turned to another automation tool in my toolkit… Ansible:

- hosts: compute
  tasks:
  - name: Configure networking
    template: src=compute-net.j2 dest=/etc/conf.d/net
# …
- hosts: compute
  tasks:
# …
  - name: Add symbolic links (instance VLAN interfaces)
    file: src=net.lo dest=/etc/init.d/net.bond0.{{item}} state=link
    with_sequence: start=128 end=193
  - name: Add symbolic links (instance VLAN bridges)
    file: src=net.lo dest=/etc/init.d/net.vlan{{item}} state=link
    with_sequence: start=128 end=193
# …
  - name: Make services start at boot (instance VLAN bridges)
    command: rc-update add net.vlan{{item}} default
    with_sequence: start=128 end=193 

That’s a snippet of the playbook… and it basically creates symbolic links from Gentoo’s net.lo for all the VLAN ports and bridges, then sets them up to start at boot.

In the compute-net.j2 file referenced above, I put in the following to enumerate all the configuration bits.

# Instance VLANs
{% for vlan in range(128,193) %}
config_vlan{{vlan}}="null"
config_bond0_{{vlan}}="null"
rc_net_vlan{{vlan}}_need="net.bond0.{{vlan}}"
{% endfor %}
# …
vlans_bond0="5 8 10{% for vlan in range(128,193) %} {{vlan}} {% endfor %}248 249 250 251 252"
vlans_bond1="253"
# …
# Instance VLANs
{% for vlan in range(128,193) %}
bridge_vlan{{vlan}}="bond0.{{vlan}}"
{% endfor %} 

The start and end ranges are a little off, but it saved a lot of work.
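To see exactly how the two ranges disagree: Ansible's with_sequence treats end as inclusive, while Jinja2's range() follows Python and stops one short. The sketch below mimics both in plain Python (the helper names are made up for the illustration) and shows which VLAN gets symlinks from the playbook but no configuration from the template.

```python
def with_sequence(start: int, end: int) -> list:
    """Mimic Ansible's with_sequence, whose end value is inclusive."""
    return list(range(start, end + 1))


def jinja_range(start: int, stop: int) -> list:
    """Mimic Jinja2's range(), whose stop value is exclusive (as in Python)."""
    return list(range(start, stop))


playbook_vlans = with_sequence(128, 193)   # start=128 end=193 in the playbook
template_vlans = jinja_range(128, 193)     # range(128,193) in compute-net.j2

unconfigured = sorted(set(playbook_vlans) - set(template_vlans))
print(len(playbook_vlans), len(template_vlans), unconfigured)  # 66 65 [193]
```

Making the two agree would mean either end=192 in the playbook or range(128,194) in the template, depending on which range was actually intended.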

This naturally took a while for OpenRC to bring up… but it worked. Going back to OpenNebula, I told it what bridges to use, and before long I had my first instance… an OpenBSD router to link my personal VLAN to the DMZ.

I spent a bit of time re-working my routing tables after that… in fact, my network is getting big enough now I have to write some details down.  I spent a few hours documenting the effort:

That’s page 1 of about 15… yes my hand is sore… but at least now should I get run over by a bus, others have a fighting chance doing anything with the network without my technical input.

Solar Cluster: OpenNebula on Gentoo/musl… no go?

So, I had a go at getting OpenNebula actually running on my little VM.  Earlier I had installed it to the /opt directory by hand, and today, I tried launching it from that directory.

To get the initial user set up, you have to create the ${ONE_LOCATION}/.one/one_auth (in my case, /opt/opennebula/.one/one_auth) with the username and password for your initial user separated by a colon.  The idea here is that this is used to initially create the user; you then change the password once you’re successfully logged in.
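For what it's worth, the file format is simple enough to script. Here is a minimal sketch that writes a one_auth file under a given ONE_LOCATION. The function name, the oneadmin/changeme credentials and the 0600 permissions are my own choices for the example, and the demonstration writes into a temporary directory rather than /opt/opennebula.

```python
import tempfile
from pathlib import Path


def write_one_auth(one_location: Path, username: str, password: str) -> Path:
    """Create <one_location>/.one/one_auth containing 'username:password'."""
    auth_dir = one_location / ".one"
    auth_dir.mkdir(parents=True, exist_ok=True)
    auth_file = auth_dir / "one_auth"
    auth_file.write_text(f"{username}:{password}\n")
    # Assumption: keep the credentials readable by the owner only.
    auth_file.chmod(0o600)
    return auth_file


# Demonstration against a throw-away directory with placeholder credentials.
with tempfile.TemporaryDirectory() as tmp:
    path = write_one_auth(Path(tmp), "oneadmin", "changeme")
    print(path.read_text().strip())  # oneadmin:changeme
```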

That got me a little further, but then it still fails… turns out it doesn’t like some options specified by default in the configuration file.   I commented out some options, and that got me a little further again.  oned started, but then went into lala land, accepting connections but then refusing to answer queries from other tools, leaving them to time out.

I’ve since managed to package OpenNebula into a Gentoo Ebuild, which I have now published in a dedicated overlay.  I was able to automate a lot of the install process this way, but I was still no closer.

On a hunch, I tried installing the same ebuild on my laptop.  Bingo… in mere moments, I was staring at the OpenNebula Sunstone UI in my browser, it was working.  The difference?  My laptop is running Gentoo with the standard glibc C library, not musl.  OpenNebula compiled just fine on musl, but perhaps differences in how musl does threads or some such (musl takes a hard-line POSIX approach) are causing a deadlock.

So, I’m rebuilding the VM using glibc now.  We shall see where that gets us.  At least now I have the install process automated. 🙂

Solar Cluster: OpenNebula Front-end setup

So, the front-end for OpenNebula will be a VM, that migrates between the two compute nodes in a HA arrangement.  Likewise with the core router, and border router, although I am also tossing up trying again with the little Advantech UNO-1150G I have laying around.

For now, I’ve not yet set up the HA part; I’ll come to that.  There are guides for using libvirt with corosync/heartbeat.  Most also call up DRBD as the block device for the VM, but we will not be using this, as our block device (RADOS Block Device) is already redundant.

To host OpenNebula, I’ll use Gentoo with musl-libc since that’ll shrink the footprint down just a little bit.  We’ll run it on a MariaDB back-end.

Since we’re using musl, you’ll want to install layman and the musl overlay as not all packages build against musl out-of-the-box.  Also install gentoolkit, as you’ll need to set USE flags, and euse makes this easy:

# emerge layman
# layman -L
# layman -a musl
# emerge gentoolkit

Now that some basic packages are installed, we need to install OpenNebula’s prerequisites. The documentation tells you that xmlrpc-c is amongst these. BUT, it doesn’t tell you that it needs support for abyss: and the scons build system they use will just give you a cryptic error saying it couldn’t find xmlrpc. The answer is not, as suggested, to specify the path to xmlrpc-c-config, which happens to be in ${PATH} anyway, as that will net the same result, and break things later when you fix the real issue.

# euse -p dev-libs/xmlrpc-c -E abyss

Now we can build the dependencies… this isn’t a full list, but includes everything that Gentoo ships in repositories, the remaining Ruby gems will have to be installed separately.

# emerge --ask dev-lang/ruby dev-db/sqlite dev-db/mariadb \
dev-ruby/sqlite3 dev-libs/xmlrpc-c dev-util/scons \
dev-ruby/json dev-ruby/sinatra dev-ruby/uuidtools \
dev-ruby/curb dev-ruby/nokogiri

With that done, create a user account for OpenNebula:

# useradd -d /opt/opennebula -m -r opennebula

Now you’re set to build OpenNebula itself:

# tar -xzvf opennebula-5.4.0.tar.gz
# cd opennebula-5.4.0
# scons mysql=yes

That’ll run for a bit, but should succeed. At the end:

# ./install -d /opt/opennebula -u opennebula -g opennebula

There’s about where I’m at now… the link in the README for further documentation is a broken link, here is where they keep their current documentation.

Solar Cluster: Networking

So, having got some instances going… I thought I better sort out the networking issues proper.  While it was working, I wanted to do a few things:

  1. Bring a dedicated link down from my room into the rack directly for redundancy
  2. Define some more VLANs
  3. Sort out the intermittent faults being reported by Ceph

I decided to tackle (1) first.  I have two 8-port Cisco SG-200 switches linked via a length of Cat5E that snakes its way from our study, through the ceiling cavity then comes up through a small hole in the floor of my room, near where two brush-tail possums call home.

I drilled a new hole next to where the existing cable entered, then came the fun of trying to feed the new cable alongside the old one.  First attempt had the cable nearly coil itself just inside the cavity.  I tried to make a tool to grab the end of it, but it was well and truly out of reach.  I ended up getting the job done by taping the cable to a section of fibreglass tubing, feeding that in, taping another section of tubing to that, feeding that in, etc… but then I ran out of tubing.

Luckily, a rummage around, and I found some rigid plastic that I was able to tape to the tubing, and that got me within a half-metre of my target.  Brilliant, except I forgot to put a leader cable through for next time didn’t I?

So more rummaging around for a length of suitable nylon rope, tape the rope to the Cat5E, haul the Cat5E out, then grab another length of rope and tape that to the end and use the nylon rope to haul everything back in.

The rope should be handy for when I come to install the solar panels.

I had one 16-way patch panel, so wound up terminating the rack-end with that, and just putting a RJ-45 on the end in my room and plugging that directly into the switch.  So on the shopping list will be some RJ-45 wall jacks.

The cable tester tells me I possibly have brown and white-brown switched, but never mind, I’ll be re-terminating it properly when I get the parts, and that pair isn’t used anyway.

The upshot: I now have a nice 1Gbps ring loop between the two SG-200s and the LGS326 in the rack.  No animals were harmed in the formation of this ring, although two possums were mildly inconvenienced.  (I call that payback for the times they’ve held the Marsupial Olympics at 2AM when I’m trying to sleep!)

Having gotten the physical layer sorted out, I was able to introduce the upstairs SG-200 to the new switch, then remove the single-port LAG I had defined on the downstairs SG-200.  A bit more tinkering, and I had a nice redundant set-up: setting my laptop to ping one of the instances in the cluster over WiFi, I could unplug my upstairs trunk, wait a few seconds, plug it back in, wait some more, unplug the downstairs trunk, wait some more again, then plug it back in again, and not lose a single ICMP packet.

I moved my two switches and my AP over to the new management VLAN I had set up, alongside the IPMI interfaces on the nodes.  The SG-200s were easy: aside from them insisting on one port being configured with a PVID equal to the management VLAN (I guess they want to ensure you don’t get locked out), it all went smoothly.

The AP though, a Cisco WAP4410N… not so easy.  In their wisdom, and unlike the SG-200s, the management VLAN settings page is separate from the IP interface page, so you can’t change both at the same time.  I wound up changing the VLAN, only to find I had locked myself out of it.  Much swearing at the cantankerous AP and wondering how could someone overlook such a fundamental requirement!  That, and the switch where the AP plugs in, helpfully didn’t add the management VLAN to the right port like I asked of it.

Once that was sorted out, I was able to configure an IP on the old subnet and move the AP across.

That just left dealing with the intermittent issues with Ceph.  My original intention with the cluster was to use 802.3ad so each node had two 2Gbps links.  Except: the LGS326-AU only supports 4 LAGs.  For me to do this, I need 10!

Thankfully, the bonding support in the Linux kernel has several other options available.  Switching from 802.3ad to balance-tlb resolved the issue.

slaves_bond0="enp0s20f0 enp0s20f1"
slaves_bond1="enp0s20f2 enp0s20f3"
config_bond0="null"
config_bond1="null"
config_enp0s20f0="null"
config_enp0s20f1="null"
config_enp0s20f2="null"
config_enp0s20f3="null"
rc_net_bond0_need="net.enp0s20f0 net.enp0s20f1"
rc_net_bond1_need="net.enp0s20f2 net.enp0s20f3"
mode_bond0="balance-tlb"
mode_bond1="balance-tlb"
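The LAG shortfall is simple arithmetic, sketched below. The node count of 5 comes from the power distribution notes elsewhere in this log, and the two bonds per node are bond0 and bond1 from the config above. As I understand the kernel bonding documentation, balance-tlb needs no special support on the switch, which is why it sidesteps the limit. The trade-off is that incoming traffic is received on only one slave of each bond at a time.

```python
NODES = 5             # nodes in the rack
BONDS_PER_NODE = 2    # bond0 and bond1 in the config above
SWITCH_MAX_LAGS = 4   # LGS326-AU limit quoted in the text


def lags_required(nodes: int, bonds_per_node: int) -> int:
    """Each 802.3ad bond needs its own LAG defined on the switch."""
    return nodes * bonds_per_node


needed = lags_required(NODES, BONDS_PER_NODE)
print(needed, needed <= SWITCH_MAX_LAGS)  # 10 False
```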

I am currently setting up a core router instance (with OpenBSD 6.1) and an OpenNebula instance (with Gentoo AMD64/musl libc).

Solar Cluster: First virtual instances running

So, since my last log, I’ve managed to tidy up the wiring on the cluster, making use of the plywood panel at the back to mount all my DC power electronics, and generally tidying everything up.

I had planned to use a SB50 connector to connect the cluster up to the power supply, so made provisions for this in the wiring harness. Turns out, this was not necessary, it was easier in the end to just pull apart the existing wiring and hard-wire the cluster up to the charger input.

So, I’ve now got a spare load socket hanging out the front, which will be handy if we wind up with unreliable mains power in the near future since it’s a convenient point to hook up 12V appliances.

There’s a solar power input there ready, and space to the left of that to build a little control circuit that monitors the solar voltage and switches in the mains if needed. For now though, the switching is done with a relay that’s hard-wired on.

Today though, I managed to get the Ceph clients set up on the two compute nodes, although virt-manager is buggy when it comes to RBD pools. In particular, adding an RBD storage pool doesn’t work as there’s no way to define authentication keys, and even if you have the pool defined, you find that trying to use images from that pool causes virt-manager to complain it can’t find the image on your local machine. (Well duh! This is a known issue.)

I was able to find an XML cheat-sheet for defining a domain in libvirt, which I was then able to use with Ceph’s documentation.

A typical instance looks like this:

<domain type='kvm'>
  <!-- name of your instance -->
  <name>instancename</name>
  <!-- a UUID for your instance, use `uuidgen` to generate one -->
  <uuid>00ec9b97-c49a-45f8-befe-f74ad6bde2fe</uuid>
  <memory>524288</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch="x86_64">hvm</type>
  </os>
  <clock sync="utc"/>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='network' device='disk'>
      <source protocol='rbd' name="poolname/image.vda">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='vda'/>
      <auth username='libvirt'>
        <!-- the UUID here is what libvirt allocated when you did
	    `virsh secret-define foo.xml`, use `virsh secret-list`
	    if you've forgotten what that is. -->
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <disk type='network' device='cdrom'>
      <source protocol='rbd' name="poolname/image.iso">
        <!-- the hostnames or IPs of your Ceph monitor nodes -->
        <host name="s0.internal.network" />
        <host name="s1.internal.network" />
        <host name="s2.internal.network" />
      </source>
      <target dev='hdd'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='23daf9f8-1e80-4e6d-97b6-7916aeb7cc62'/>
      </auth>
    </disk>
    <interface type='network'>
      <source network='default'/>
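      <!-- placeholder MAC: it must be a unicast address (even first octet);
           52:54:00 is the prefix commonly used for QEMU/KVM guests -->
      <mac address='52:54:00:44:55:66'/>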
    </interface>
    <graphics type='vnc' port='-1' keymap='en-us'/>
  </devices>
</domain>

Having defined the domain, you can then edit it at will in virt-manager. I was able to switch the network interface over to using virtio, plop it on a bridge so it was wired up to the correct VLAN and start the instance up.
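Since the two RBD disk stanzas above differ only in image name, device type and target, they are easy to generate. Here is a minimal sketch using Python's standard xml.etree module. The rbd_disk helper is a made-up name for this example, and the pool, image, monitor host names and secret UUID are the same placeholders used in the listing above.

```python
import xml.etree.ElementTree as ET
from typing import Sequence


def rbd_disk(pool: str, image: str, monitors: Sequence[str],
             secret_uuid: str, target_dev: str, device: str = "disk",
             username: str = "libvirt") -> ET.Element:
    """Build a libvirt <disk> element for an RBD-backed image."""
    disk = ET.Element("disk", {"type": "network", "device": device})
    source = ET.SubElement(
        disk, "source", {"protocol": "rbd", "name": f"{pool}/{image}"})
    for host in monitors:
        # One <host> entry per Ceph monitor node.
        ET.SubElement(source, "host", {"name": host})
    ET.SubElement(disk, "target", {"dev": target_dev})
    auth = ET.SubElement(disk, "auth", {"username": username})
    ET.SubElement(auth, "secret", {"type": "ceph", "uuid": secret_uuid})
    return disk


MONITORS = ["s0.internal.network", "s1.internal.network", "s2.internal.network"]
SECRET_UUID = "23daf9f8-1e80-4e6d-97b6-7916aeb7cc62"  # placeholder from above

disk = rbd_disk("poolname", "image.vda", MONITORS, SECRET_UUID, "vda")
cdrom = rbd_disk("poolname", "image.iso", MONITORS, SECRET_UUID, "hdd",
                 device="cdrom")
print(ET.tostring(disk, encoding="unicode"))
```

The resulting fragment can be pasted into the devices section of a domain definition, or saved to a file and attached to an existing domain with virsh attach-device.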

I’ve since managed to migrate 3 instances over, namely an estate database, Brisbane Area WICEN’s OwnCloud site, and my own blog.

These are sufficient to try the system out. I’m already finding these instances much more responsive, using raw Ceph even, than the original server.

My next move I think will be to see if I can get corosync/heartbeat to manage a HA VM instance. That is, if one of the compute nodes goes offline, the instance restarts on the other compute node.

Two services come to mind where HA is concerned: terminating the PPPoE link for our Internet, and a virtual management node for a higher-level system such as OpenNebula. OpenNebula really needs something semi-HA, since it really gets its knickers in a twist if the master node goes down. I also want my border router to be HA, since I won’t necessarily be around to migrate it to a different node.

Everything else, well I suspect OpenNebula can itself manage those, and long term the instances I just liberated today from my old box, will become instances within OpenNebula.

The other option is I dip my toe into OpenStack (again), since it is inherently HA by design, but it is also a royal pain to get working.

Solar Cluster: Rack installed in-situ

So, there’s some work still to be done, for example making some extension leads for the run between the battery link harness, load power distribution and the charger… and to generally tidy things up, but it is now up and running.

On the floor, is the 240V-12V power supply and the charger, which right now is hard-wired in boost mode. In the bottom of the rack are the two 105Ah 12V AGM batteries, in boxes with fuses and isolation switches.

The nodes and switching are inside the rack, and resting on top is the load power distribution board, which I’ll have to rewire to make things a little neater. A prospect is to mount some of this on the back.

I had a few introductions to make, introducing the existing pair of SG-200 switches to the newcomer and its VLANs, but now at least, I’m able to SSH into the nodes, access the IPMI BMC and generally configure the whole box and dice.

With the exception of the later upgrade to solar, and the aforementioned wiring harness clean-ups, the hardware-side of this dual hardware/software project, is largely complete, and this project now transitions to being a software project.

The plan from here:

  • Update the OSes… as all will be a little dated. (I might even blow away and re-load.)
  • Get Ceph storage up and running. It actually should be configured already, just a matter of getting DNS hostnames sorted out so they can find each other.
  • Investigating the block caching landscape: when I first started the project at work, it was a 3-horse race between Facebook’s FlashCache, bcache and dm-cache. Well, FlashCache is no more, replaced by EnhancedIO, and I’m not sure about the rest of the market. So this needs researching.
  • Management interfaces: at my workplace I tried Ganeti, OpenNebula and OpenStack. This again, needs re-visiting. OpenNebula has moved a long way from where it was and I haven’t looked at the others in a while. OpenStack had me running away screaming, but maybe things have improved.

Solar Cluster: Power distribution harnesses

So, having got the rack mostly together, it is time to figure out how to connect everything.

I was originally going to have just one battery and upgrade later… but when it was discovered that the battery chosen was rather sick, the decision was made that I’d purchase two new batteries. So rather than deferring the management of multiple batteries, I’d have to deal with it up-front.

Rule #1 with paralleling batteries: don’t do it unless you have to. In a perfect world, you can do it just fine, but reality doesn’t work that way. There’s always going to be an imbalance that upsets things. My saving grace is that my installation is fixed, not mobile.

I did look at alternatives, including diodes (too much forward voltage drop), MOSFET switching (complexity), relay switching (complexity again, plus contact wear), and DIY uniselectors. Since I’m on a tight deadline, I decided, stuff it, I’ll parallel them.

That brings me to rule #2 about paralleling batteries: keep everything as close to matched as possible. Both batteries were bought in the same order, and hopefully are from the same batch. Thus, characteristics should be very close. The key thing here, I want to keep cable lengths between the batteries, load and charger, all equal so that the resistances all balance out. That, and using short runs of thick cables to minimise resistance.

I came up with the following connection scheme:

You’ll have to forgive the poor image quality here. On reflection, photographing a whiteboard has always been challenging.

Both batteries are set up in an identical fashion: 40A fuse on the positive side, cable from the negative side, going to an Andersen SB50/10. (Or I might put the fuse on the negative side … haven’t decided fully yet, it’ll depend on how much of each colour wire I have.) The batteries themselves are Giant Power 105Ah 12V AGM batteries. These are about as heavy as I can safely manage, weighing about 30kg each.

The central harness is what I built this afternoon, as I don’t yet have the fuse holders for the two battery harnesses.

The idea is that the resistance between the charger and each battery should be about the same. Likewise, the resistance between the load and each battery should be about the same.
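To put a number on why the matching matters, here is a small sketch of how a load current splits between two parallel paths. It assumes both batteries sit at the same voltage and lumps each battery's harness and internal resistance into a single figure. The 10A load and the milliohm values are invented for the illustration, not measurements.

```python
from typing import Tuple


def share_current(total_amps: float, r1_milliohms: float,
                  r2_milliohms: float) -> Tuple[float, float]:
    """Split a load current between two parallel paths.

    The current divides in inverse proportion to each path's resistance.
    """
    i1 = total_amps * r2_milliohms / (r1_milliohms + r2_milliohms)
    return i1, total_amps - i1


# Matched harnesses: each battery carries half the load.
i1, i2 = share_current(10.0, 10.0, 10.0)
print(f"{i1:.1f} A / {i2:.1f} A")  # 5.0 A / 5.0 A

# One harness 50% more resistive: the other battery does more of the work.
i1, i2 = share_current(10.0, 10.0, 15.0)
print(f"{i1:.1f} A / {i2:.1f} A")  # 6.0 A / 4.0 A
```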

The load uses a distribution box and a bus bar. You’ve seen it before, but here’s how it’s wired up… pretty standard:

You might be able to make out the host names there too (periodic table naming scheme, why, because they’re Intel Atoms) … the 5 nodes are on the left and the two switches to the right of the distribution box. I have 3 spare positions.

In heavy black is the 0V bus bar.

This is what I’ve spent much of my time pondering. Part of this harness is already done as it was installed that way in the car; the bit that’s missing is the circuit to the left of the relay that actually drives it. Redarc intended that the ignition key switch would drive the relay, and I’ll be exploiting this feature.

Some time this week, I hope to make up the wiring harnesses for the two batteries, and get some charge into them as they’ve sat around for the past two months in their boxes steadily discharging, so I’d be better to get a charger onto them sooner rather than later.

The switch-over circuit can wait for now: just hard-wire it to the mains DC feed for now since there’s no solar yet. The principle of operation is that the comparator (an LM311) compares the solar voltage to a reference (derived from a 5V regulator) and kicks in when the voltage is high enough. (How high? No idea, maybe ~18V?). When that happens, it outputs a logic high signal that turns off the MOSFET. When too low, it pulls the MOSFET gate low, turning it on.

The MOSFET (a P-channel) provides the “ignition key switch” signal to the BCDC1225, fooling it into thinking it is connected to vehicle power, and the charger will boost as needed. The key being that the BCDC1225 makes the decision as to whether the battery needs charging, and how much charge.

By bolting together off-the-shelf parts, we should have something that I can source replacements for should the smoke escape, and there’s no high voltages to deal with.

Solar Cluster: Rack taking shape

Well, it’s been a while since I last updated this project. A lot of that has been due to general lethargy, real life and other pressures.

This equipment is being built amongst other things to host my websites, mail server, and as a learning tool for managing clustered computing resources. As such, yes, I’ll be putting it down as a work expense… and it was pointed out to me that it needed to be in operation before I could start claiming it on tax. So, with 30th June looming up soon, it was time I pulled my finger out and got it going.

At least running on mains. As for the solar bit, well we will be doing that too, my father recently sent me this email (line breaks for readability):

Subject: Why you're about to pay through the nose for power - ABC News
 (Australian Broadcasting Corporation)
To: Stuart Longland
From: David Longland
http://www.abc.net.au/news/2017-06-19/…
   …why-youre-about-to-pay-through-the-nose-for-power/8629090

Hi Stuart,

This is why I am keen to see your cluster up and running.  Our power 
bill is about $300 every 3 months, a lift in price by 20% represents 
$240pa hike.

Dad

Umm, yeah… good point. Our current little server represents a small portion of our base-load power… refrigeration being the other major component.

I ordered the rack and batteries a few months back, and both have been sitting here, still in the boxes they were shipped in, waiting for me to get to and put them together. My father got fed up of waiting and attacked the rack, putting it together one evening… and last night, we worked together on putting a back on the rack using 12mm plywood.

We also fitted the two switches, mounting the smaller one to the lid of the main switch using multiple layers of double-sided tape.

I wasn’t sure at first where the DIN rail would mount. I had intended to screw it to a piece of 2×4″ or similar, and screw that to the back plane. We couldn’t screw the DIN rail directly to the back plane because the nodes need to be introduced to the DIN rail at an angle, then brought level to attach them.

Considering the above, we initially thought we’d bolt it to the inner run of holes, but two problems presented themselves:

  1. The side panels actually covered over those holes: this was solved with a metal nibbling tool, cutting a slot where the hole is positioned.
  2. The DIN rail, when just mounted at each end, lacked the stability.

I measured the gap between the back panel and the DIN rail location: 45mm. We didn’t have anything that was that width which we could use as a mounting. We considered fashioning a bracket out of some metal strip, but bending it right could be a challenge without the right tools. (and my metalwork skills were never great.)

45mm + 3mm is 48mm… or 4× plywood pieces. We had plenty of off-cut from the back panel.

Using 4 pieces of the plywood glued together and clamped overnight, I made a mounting to which I could mount the DIN rail for the nodes to sit on. This afternoon, I drilled the pilot holes and fitted the screws for mounting that block, and screwed the DIN rail to it.

At the far ends, I made spacers from 3mm aluminium metal strap. The result is not perfect, but is much better than what we had before.

I’ve wired up the network cables, checking their lengths in case I needed to get longer ones. (They just fit… phew! $20 saved.) There is room down the bottom for the batteries to sit. I’ll make a small 10cm cable to link the management network up to the appropriate port on the main switch, then I just need to run cables to the upstairs and downstairs switches. (In fact, there’s one into the area already.)

On the power front… my earlier experiments had ascertained the suitability of the Xantrex charger that we had spare. The charger is a smart charger, and so does various equalisation and balancing cycles, thus gets mightily confused if you suddenly disconnect the battery from it by way of a MOSFET. A different solution presented itself though.

My father has a solar set-up in the back of his car… there’s a 12V 120W panel on the roof, and that provides power to a battery system which powers an amateur radio station and serves as an auxiliary battery. There’s a diode arrangement that allows charging from the vehicle battery system.

In an effort to try and upgrade it, he bought a Redarc BCDC1225 in-vehicle MPPT charger. This charger can accept power from either the 12V mains supply in a vehicle, or from a “12V” solar panel. The key here is that it relies on a changeover relay to switch between the two, and this is where it wasn’t quite suitable for my father’s needs: it assumed that if the vehicle ignition was on, you wanted to charge from the vehicle, not from solar.

He wanted it to switch to whichever source was more plentiful, and had thought the unit would drive the relay itself. Having read the manual, we now know the signal they tell you to connect to the relay coil is there to tell the charger which source it is plugged into, not for it to drive the relay.

The plan is therefore:

  • use a 240V→12V AC-DC switch-mode power supply to provide the “vehicle mains” DC input to the charger.
  • measure the voltage seen at the solar input with a comparator and switch over when it is above some pre-defined voltage (use hysteresis to ensure it doesn’t oscillate)
  • use the output to drive a P-channel MOSFET attached to the “vehicle mains”, which drives the relay.
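The comparator-with-hysteresis idea in the plan above can be modelled in a few lines of software. This is only a sketch of the behaviour, not the analogue circuit itself: the class name `SourceSelector` and both threshold voltages are assumptions picked for illustration, not values measured on this build.

```python
from dataclasses import dataclass


@dataclass
class SourceSelector:
    """Software model of a comparator with hysteresis (a Schmitt trigger)."""

    # Both thresholds are illustrative assumptions, not values from the build.
    switch_to_solar_v: float = 14.0  # select solar at or above this voltage
    switch_to_mains_v: float = 12.5  # fall back to mains at or below this
    on_solar: bool = False

    def update(self, solar_v: float) -> bool:
        """Feed one solar-input reading; return True while solar is selected."""
        if self.on_solar:
            if solar_v <= self.switch_to_mains_v:
                self.on_solar = False
        elif solar_v >= self.switch_to_solar_v:
            self.on_solar = True
        return self.on_solar


def run(readings, selector=None):
    """Apply a sequence of voltage readings and return the selected states."""
    if selector is None:
        selector = SourceSelector()
    return [selector.update(v) for v in readings]


if __name__ == "__main__":
    readings = [11.0, 13.9, 14.1, 13.5, 12.6, 12.4, 13.9, 14.2]
    for volts, solar in zip(readings, run(readings)):
        print(f"{volts:5.1f} V -> {'solar' if solar else 'mains'}")
```

The gap between the two thresholds is what stops the relay chattering: once solar is selected, the voltage has to sag all the way to the lower threshold before the relay drops back to mains.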

Solar Cluster: Selecting batteries and sources

I’ve been doing quite a bit of thinking on this. Solid-state works but suffers from voltage drop. Relays work, but either require the coil to be energised constantly (~1W load) or have to be latching relays, for which 30A units are hard to come by.

These look promising though. A latching relay is nice since I only need to pulse the coil, not hold it on indefinitely.

That got me thinking: what else can I use to switch power? The ideal for me is something that has practically no voltage drop and remembers its state without power. A latching relay fits this requirement. So does a uniselector or stepping switch. Those were commonplace in telephone exchanges years ago, but have since gone the way of the dodo as semiconductor technology replaced them.

The nice thing about a uniselector for my application, though, is that you can switch between N points, instead of just two like a regular relay. So if I buy a third battery, I can wire it up to the uniselector, and have it switch the compute load between the batteries. Likewise, I can connect a charger to the battery most in need of a charge. The MCU measures the battery voltages, picks the battery with the highest voltage to run the load, and the one with the lowest voltage to get a charge. Easy.
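As a rough sketch of that selection rule, assuming the MCU already has a list of measured battery voltages: the function name `pick_batteries` is hypothetical, and the tie-break (when every battery reads the same) is my own assumption rather than anything decided above.

```python
def pick_batteries(voltages):
    """Return (load_index, charge_index) for a list of battery voltages.

    The highest-voltage battery runs the load and the lowest-voltage
    battery gets the charger.
    """
    if len(voltages) < 2:
        raise ValueError("need at least two batteries")
    indices = range(len(voltages))
    load = max(indices, key=lambda i: voltages[i])
    charge = min(indices, key=lambda i: voltages[i])
    if load == charge:
        # Assumption: if all batteries read equal, charge the next one
        # along so the load and the charger never share a battery.
        charge = (load + 1) % len(voltages)
    return load, charge


if __name__ == "__main__":
    load, charge = pick_batteries([12.6, 12.1, 12.9])
    print(f"load from battery {load}, charge battery {charge}")
```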

That got me thinking… can I make a uniselector? Well of course I can! I basically need to make a rotary switch that can revolve around indefinitely. The shaft of the switch would then be turned by a DC motor.

The stator of an N-way switch would have N+1 pads: one of which is the “common”, with the other N being one per selection. The common pad would be a 180° arc; the others would each span 180°/N.
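To sanity-check the geometry, here is a small sketch that lays the pads out in degrees and works out which selection pad is bridged to the common for a given rotor angle. It assumes ideal pads with no insulating gap between them, with the common arc placed at 0° to 180°; the function names are hypothetical.

```python
def pad_layout(n):
    """Return (common_arc, selection_arcs) as (start_deg, end_deg) tuples."""
    if n < 1:
        raise ValueError("need at least one selection pad")
    width = 180.0 / n
    common = (0.0, 180.0)
    selections = [(180.0 + k * width, 180.0 + (k + 1) * width) for k in range(n)]
    return common, selections


def selected_pad(rotor_deg, n):
    """Index of the selection pad bridged to the common pad.

    The two brushes sit 180 degrees apart and are wired together, so one
    is always on the common arc while the other is on a selection pad.
    Only the rotor angle modulo 180 degrees matters.
    """
    width = 180.0 / n
    index = int((rotor_deg % 180.0) // width)
    return min(index, n - 1)  # guard against floating-point rounding


if __name__ == "__main__":
    print(pad_layout(3))
    for angle in (0, 70, 190, 359):
        print(angle, "->", selected_pad(angle, 3))
```

One consequence of this layout is that the switch steps through all N selections every half turn, since the two brushes simply swap roles after 180°.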

The rotor would feature two brushes 180° apart with a wire connecting them. It is free to move vertically, but must rotate with the shaft. A spring between a nut on the end of the shaft and the rotor applies tension to keep the rotor pressed firmly against the stator.

The interface between rotor and stator features some triangular grooves, so that when the rotor is turned, the grooves push it away from the stator, breaking contact. When the rotor passes a critical point, the spring pressing the rotor against these grooves makes the rotor “want” to continue turning until it hits the bottom of the groove, at which point it “sinks” down towards the stator and eventually makes contact again.

Visually, it looks like this:

A small microswitch mounted on the stator could tell us when touch-down takes place. If we use the normally-closed contact to power the motor, it will automatically stop the motor when the next position is reached. We then just need to override that open switch by applying a pulse to get things moving.
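The motor logic boils down to “run while not seated, or while the start pulse is applied”. Here’s a toy simulation of that, assuming time moves in discrete ticks and that it takes a fixed number of ticks to travel between positions; both of those, and the function names, are made up for illustration.

```python
def motor_powered(rotor_seated, start_pulse):
    """Motor runs via the normally-closed contact, or via the override pulse."""
    return (not rotor_seated) or start_pulse


def advance(position, n_positions, ticks_per_step=5, pulse_ticks=1):
    """Simulate one start pulse; return (new_position, ticks_motor_ran).

    ticks_per_step and pulse_ticks are illustrative assumptions.
    """
    progress = 0  # ticks of travel since the last detent; 0 means seated
    ticks = 0
    while motor_powered(progress == 0, ticks < pulse_ticks):
        progress = (progress + 1) % ticks_per_step
        if progress == 0:
            # Rotor has dropped into the next groove and made contact.
            position = (position + 1) % n_positions
        ticks += 1
    return position, ticks


if __name__ == "__main__":
    print(advance(0, 3))                 # short pulse: one step
    print(advance(0, 3, pulse_ticks=6))  # pulse too long: overshoots
```

The second call shows why the pulse wants to be short: if it is still applied when the rotor touches down, the motor carries on to the following position.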

Power is only needed when we want to change the selector switch. This should be simple enough to fabricate here out of plywood. I don’t have a 3D printer, but you could do it with one of those very easily.

The nature of this switch makes it a break-before-make switch, which has a downside when using it to select which battery to use: there’s a momentary break in power.

I can use diodes to carry the current temporarily: I run a high-current diode from each battery to the output via a current sensor. If the current sensor measures current flowing through the diodes whilst a battery is selected, then we know that battery is lower than one of the others by at least the diode voltage drop, and we should consider switching.
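That check can be sketched with an idealised model, ignoring wiring resistance and treating each diode as a fixed voltage drop. The 0.7V figure is an assumption for a generic silicon diode (a Schottky would be lower), and `diodes_conducting` is a hypothetical name.

```python
def diodes_conducting(voltages, selected, diode_drop_v=0.7):
    """Return indices of batteries whose diode would feed the output.

    The output sits at the selected battery's voltage, so another
    battery's diode only conducts if that battery is higher than the
    selected one by more than the diode drop (idealised model).
    """
    out_v = voltages[selected]
    return [
        i
        for i, v in enumerate(voltages)
        if i != selected and v - diode_drop_v > out_v
    ]


def should_consider_switching(voltages, selected, diode_drop_v=0.7):
    """True when diode current would be seen while a battery is selected."""
    return bool(diodes_conducting(voltages, selected, diode_drop_v))


if __name__ == "__main__":
    print(diodes_conducting([12.0, 12.9, 12.5], selected=0))
    print(should_consider_switching([12.8, 12.9, 12.5], selected=1))
```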