Public Syndication

Ahh… SEO Spam

Part of my day job involves being the technical contact for my employer’s website, which means we get lots of emails from people offering to put us on the “first page of Google”.

Hmm, last time I checked, the first page of Google was, strangely, Google. Somehow, I don’t think they outsource their SEO strategy to get there… they wrote the bloody code!

These emails go straight to Spamcop generally… and they send nastygrams to the people hosting the email servers they used. In some cases, I’ve taken the extraordinary step of blocking frequently abused hosts.

# Block Centrilogic and SmartMailer because they don't act on spam reports.
-A INPUT -s 173.240.14.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 199.43.203.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
# Block OVH because they don't act on spam reports.
# List taken from https://mxtoolbox.com/SuperTool.aspx?action=asn%3aAS16276&run=toolpage
-A INPUT -s 5.39.0.0/17 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 5.135.0.0/16 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 5.196.0.0/16 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.7.244.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.128.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.136.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.18.172.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.20.110.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.21.41.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.24.8.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.26.94.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.29.224.0/24 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.30.208.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
-A INPUT -s 8.33.96.0/21 -p tcp --dport 25 -j REJECT --reject-with icmp-host-prohibited
…

That is not an exhaustive list. Sorry to people who use OVH for hosting and were trying to contact VRT/CETA legitimately, but OVH have shown themselves to be grossly incompetent with regard to management of network abuse. Centrilogic/SmartMailer are more recent additions.
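For anyone maintaining a block list like this, the rules can be generated rather than typed by hand. The following is a minimal sketch, assuming the CIDR blocks have already been collected into a list; the function name `reject_rules` is made up for illustration, and the output simply mirrors the rule format shown above. Overlapping or adjacent blocks are merged first so the rule set stays as short as possible.

```python
import ipaddress


def reject_rules(cidrs, port=25):
    """Return one iptables-save style REJECT rule per network.

    Overlapping and adjacent networks are merged before the rules
    are generated. (Hypothetical helper, not part of any tool.)
    """
    networks = [ipaddress.ip_network(c) for c in cidrs]
    merged = ipaddress.collapse_addresses(networks)
    return [
        f"-A INPUT -s {net} -p tcp --dport {port} "
        "-j REJECT --reject-with icmp-host-prohibited"
        for net in merged
    ]


if __name__ == "__main__":
    sample = ["173.240.14.0/24", "199.43.203.0/24", "5.39.0.0/17"]
    print("\n".join(reject_rules(sample)))
```

The output can be pasted into the rules file in place of the hand-written lines.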

Of course, they keep trying, and thankfully, it takes longer for them to write the email than it does for me to deal with it. This doesn’t stop them claiming little gems like this:

Note: We are not spammers and are against spamming of any kind. If you are not interested then you can reply with a simple “NO”.

Errm, hate to disagree (actually no, in this case, I love disagreement)… but a few points:

You’re sending me unsolicited content…
… without my consent… (no, a listing in a domain registration or scraping from a website is not consent)
… that is advertising a paid-for service or otherwise something you’re hoping to make money from…
… by electronic messaging.

That by definition is an Unsolicited Commercial Email… aka SPAM. If you claim to be an Australian business, you better have a look at this. If your ISP is complaining that you are abusing their services by sending spam, then perhaps you need to realise the people you are contacting are not interested! You have your NO.

Toy Synthesizer: I/O module, MkII

So some spare time today… I decide to construct a new I/O module to fix up the mistakes made with the previous iteration. Mainly:

the TVS diodes… going for one with a higher clamp voltage so it doesn’t smoke when 12V is applied
switching to a 4-pin connector on the output side, with pins for 0V, GPIO, DRAIN and +12V
fixing the pin-out on the input side so it matches the PCB.
rather than having jumper leads to make the boards separable, we’ll make one monolithic board that plugs into all 8 channels simultaneously with one long connector.

For the TVS diodes, I ordered some TPD2E007 in SOT23… thinking those would be a reasonable size for hand-soldering.

Now… how the bloody hell am I going to solder these little tiddlers? I had thought SOT-23 was about twice that size. Never mind, can’t un-buy them.

The circuit is pretty much identical to what came before. My MOSFETs and 4.7nF capacitors seem to have gone walkies, not sure where. No doubt the arrival of replacements will summon them back. I decided to use SMT for many parts on this build.

0805 resistors and veroboard aren’t a bad combo really, just have a sharp blade handy to cut the track where needed, and the resistor can straddle the gap made.

For the TVS diodes, the common pin goes to ground, so I made a bus bar running vertically down the PCB and scored the tracks either side. The common pins could be soldered to that, and the two I/O pins would straddle the division between each track. Aside from me getting some parts off-by-one at first, this went well.

The zener and schottky diodes of course, being through-hole, went on the other side of the PCB.

I still have to locate where my MOSFETs have gone, and I think I found some 12 ohm resistors (through-hole). I can use some 0805 1k resistors for the MOSFET gates. So that’s some MOSFETs and 4.7nF, probably 0805 size capacitors that need ordering in the new year.

Solar Cluster: 2 days and counting on solar…

So, I’m home now for the Christmas break… and the fan in my power supply decided it would take a Christmas break itself.

The power supply was purchased brand new in June… it still works as a power supply, but with the fan seized up, it represents an overheating risk. Unfortunately, the only real options I have are the Xantrex charger, which cooked my last batteries, or a 12V 20A linear PSU I normally use for my radio station. 20A is just a touch light-on, given the DC-DC converter draws 25A. It’ll be fine to provide a top-up, but I wouldn’t want to use it for charging up flat batteries.

Now, I can replace the faulty fan. However, that PSU is under warranty still, so I figure, back it goes!

In the meantime, an experiment. What happens if I just turn the mains off and rely on the batteries? Well, so far, so good. Saturday afternoon, the batteries were fully charged, I unplugged the mains supply. Battery voltage around 13.8V.

Sunday morning, battery was down to 12.1V, with about 1A coming in off the panels around 7AM (so 6A being drained from batteries by the cluster).

By 10AM, the solar panels were in full swing, and a good 15A was being pumped in, with the cluster drawing no more than 8A. The batteries finished the day around 13.1V.

This morning, batteries were slightly lower at 11.9V. Just checking now, I’m seeing over 16A flowing in from the panels, and the battery is at 13.2V.

I’m in the process of building some power meters based on NXP LPC810s and TI INA219Bs. I’m in two minds about what to use to poll them: whether I use a Raspberry Pi I have spare and buy a case, PSU and some sort of serial interface for it… or whether I purchase a small industrial PC for the job.

The Technologic Systems TS-7670 is one that I am considering, given they’ll work over a wide range of voltages and temperatures, they have plenty of UARTs including RS-485 and RS-232, and while they ship with an old Linux kernel, yours truly has ported both U-Boot and the mainline Linux kernel. Yes, it’s ARMv5, but it doesn’t need to be a speed demon to capture lots of data, and they work just fine for Barangaroo where they poll Modbus (via pymodbus) and M-bus (via python-mbus).

2017/12/25 by Redhatter (VK4MSL) Computing Linux Development Projects Public Syndication Solar-powered Cloud Computing Thinktank 0

Solar Cluster: HA VM experiment using plain libvirt: no go

So, I have two compute nodes. I’ll soon have 32GB RAM in each one, currently one has 32GB and the other has its original 8GB… with 5 8GB modules on the way.

I’ve tested these, and they work fine in the nodes I have. They’ll even work alongside the Kingston modules I already have, so one storage node will have a mixture. That RAM is expected to arrive on Monday.

Now, it’d be nice to have HA set up so that I can power down the still-to-be-upgraded compute node, and have everything automatically fire up on the other compute node. OpenNebula supports this. BUT I have two instances that are being managed outside of OpenNebula that I need to handle: one being the core router, the other being OpenNebula itself.

My plan was to use corosync. I have an identical libvirt config for both VMs, allowing me to move the VMs manually between the hosts. VM Disk storage is using RBDs on Ceph. Thus, HA by default.

As an experiment, I thought, what would happen if I fired up two instances of the VM that pointed to the same RBD image? I was expecting one of two things to happen:

The image would be locked by the first started image, locking out the second. One instance would boot, the other would fail to boot.
Both instances would boot… the split-brain scenario.

So, I created a libvirt domain on one node, slapped Ubuntu on there (I just wanted a basic OS for testing, so command line, nothing fancy). As that was installing, I dumped out the “XML config” and imported that to the second node, but didn’t start it yet.

Once I had the new VM booted on node 1, I booted it on node 2.

To my horror, it started booting, and booted straight to a log-in prompt. Great, I had manually re-created the split-brain scenario I specifically hoped to avoid. Thankfully, it is a throw-away VM specifically for testing this behaviour. To be sure, I logged in on both, then hard-reset one. It booted to GRUB, then GRUB immediately went into panic mode. I hard-reset the other VM; it booted past GRUB, but then systemd went into panic mode. This is expected: the two VMs are stomping on each other’s data, oblivious to each other’s existence, a recipe for disaster.

So for this to work, I’m going to have to work on my fencing. I need to ensure beyond all possible doubt, that the VM is running in one place and one place ONLY.

libvirt supports VM hooks to do this, and there’s an example here, however this thread seems to suggest this is not a reliable way of doing things. RBD locking is what I hoped libvirt would do implicitly, but it seems not, and it appears that the locks are not removed when a client dies, which could lead to other problems.

A distributed lock manager would handle this, and this is something I need to research. Possibilities include HashiCorp Consul, Apache ZooKeeper, CoreOS etcd and Redis, among others. I can also try to come up with my own, perhaps built on PAXOS or Raft.

The state only needs to be kept in memory; persistence on disk is not required. It’s safe to assume that if the cluster doesn’t know about a VM, it isn’t running anywhere else. Once told of that VM’s existence though, it should ensure only one instance runs at a time.

If a node loses contact with the remaining group, it should terminate everything it has, as it’s a fair bet, the others have noticed its absence and have re-started those instances already.
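The rules above can be sketched as an in-memory lease table plus a quorum check. This is only an illustration of the semantics, not a distributed implementation: a real one would replicate the table through something like Raft or one of the services listed earlier. All names here (`VMLeaseTable`, `has_quorum`, the 10-second lease) are hypothetical.

```python
import time


class VMLeaseTable:
    """In-memory record of which node holds each VM (illustrative sketch)."""

    def __init__(self, lease_seconds=10.0, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.leases = {}  # VM name -> (owner node, expiry time)

    def acquire(self, vm, node):
        """Grant or renew a lease if the VM is unowned, expired, or already ours."""
        now = self.clock()
        owner, expiry = self.leases.get(vm, (None, 0.0))
        if owner not in (None, node) and expiry > now:
            return False  # another node holds a live lease: do not start the VM
        self.leases[vm] = (node, now + self.lease_seconds)
        return True

    def release(self, vm, node):
        """Drop the lease, but only if the caller actually owns it."""
        if self.leases.get(vm, (None, 0.0))[0] == node:
            del self.leases[vm]


def has_quorum(reachable_peers, cluster_size):
    """True if this node, counting itself, can see a strict majority."""
    return (reachable_peers + 1) * 2 > cluster_size
```

For this to be safe, a node that cannot renew its lease (or that fails `has_quorum`) has to stop its VMs before the lease expires, since another node is free to start them the moment it does.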

There’s lots to think about here, so I’ll leave this post at this point and ponder this some more.

2017/12/08 by Redhatter (VK4MSL) Computing Open Source OpenNebula Projects Public Syndication Solar-powered Cloud Computing Thinktank 0

Solar Cluster: Wondering why the effort in sourcing ECC RAM?

Seems I’ve vindicated my decision to chase ECC memory modules for my servers, in spite of ECC DDR3 SODIMMs being harder to find. (Pro tip, an 8GB ECC module will have an organisation of 1Gbit×72.)

Specifically mentioned, is that ECC memory is more resistant to these problems. (Thanks to Sebastian Pipping for forwarding this.)

2017/11/25 by Redhatter (VK4MSL) Computing Projects Public Syndication Solar-powered Cloud Computing 0

Solar Cluster: Solar back on…

So, this weekend I did plan to run from solar full time to see how it’d go.

Mother nature did not co-operate. I think there was about 2 hours of sunlight! This is what the 24 hour rain map looks like from the local weather radar (image credit: Bureau of Meteorology):

In the end, I opted to crimp SB50 connectors onto the old Redarc BCDC1225 and hook it up between the battery harness and the 40A power supply. It’s happily keeping the batteries sitting at about 13.2V, which is fine. The cluster ran for months off this very same power supply without issue: it’s when I introduced the solar panels that the problems started. With a separate controller doing the solar that has over-discharge protection to boot, we should be fine.

I also have mostly built-up some monitoring boards based on the TI INA219Bs hooked up to NXP LPC810s. I have not powered these up yet, plan is to try them out with a 1ohm resistor as the stand-in for the shunt and a 3V rail… develop the firmware for reporting voltage/current… then try 9V and check nothing smokes.
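As a rough sketch of what the reporting firmware has to compute, here is the raw-register arithmetic in Python. It assumes the register layout as I read the INA219 datasheet: the bus voltage sits in bits 15..3 of its register at 4 mV per count, and the shunt voltage register is a signed 16-bit value at 10 µV per count. The function names and the 1 ohm default are illustrative only.

```python
def bus_millivolts(raw_bus):
    """Bus voltage register: bits 15..3 hold the reading, 4 mV per count."""
    return (raw_bus >> 3) * 4


def shunt_microvolts(raw_shunt):
    """Shunt voltage register: 16-bit two's complement, 10 uV per count."""
    if raw_shunt & 0x8000:
        raw_shunt -= 0x10000
    return raw_shunt * 10


def current_milliamps(raw_shunt, shunt_ohms=1.0):
    """Ohm's law: microvolts / ohms gives microamps, then scale to mA."""
    return shunt_microvolts(raw_shunt) / shunt_ohms / 1000.0
```

With the 1 ohm stand-in, a shunt reading of 1000 counts is 10 mV across the resistor, so 10 mA.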

If all is well, then I’ll package them up and move them to the cluster. Not sure of protocols just yet. Modbus/RTU is tempting and is a protocol I’m familiar with at work and would work well for this application, given I just need to represent voltage and current… both of which can be scaled to fit 16-bit registers easily (voltage in mV, current in mA would be fine).
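To illustrate the scaling, here is a minimal sketch that packs a voltage and a current into two 16-bit registers. It assumes an unsigned register for millivolts (so up to 65.535V) and a signed one for milliamps (so ±32.767A), with values clamped at those limits; the register order and the function names are made up for the example.

```python
import struct


def to_registers(volts, amps):
    """Scale to integer mV / mA, clamp to 16 bits, pack as two registers."""
    millivolts = max(0, min(0xFFFF, round(volts * 1000)))
    milliamps = max(-0x8000, min(0x7FFF, round(amps * 1000)))
    # Modbus transmits each 16-bit register big-endian.
    return struct.pack(">Hh", millivolts, milliamps)


def from_registers(payload):
    """Inverse of to_registers: returns (volts, amps)."""
    millivolts, milliamps = struct.unpack(">Hh", payload)
    return millivolts / 1000.0, milliamps / 1000.0


if __name__ == "__main__":
    payload = to_registers(13.2, -7.0)
    print(payload.hex())            # 3390e4a8
    print(from_registers(payload))  # (13.2, -7.0)
```

A signed current register lets the same meter report charge and discharge, at the cost of halving the range.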

I just need some connectors to interface the boards to the outside world and testing will begin. I’ve ordered these and they’ll probably turn up some time this week.

2017/11/19 by Redhatter (VK4MSL) Computing Projects Public Syndication Solar-powered Cloud Computing 0

Solar Cluster: Suspicions with the charger…

So, at present I’ve been using a two-charger solution to keep the batteries at full voltage. On the solar side is the Powertech MP3735, which also does over-discharge protection. On the mains side, I’m using a Xantrex TC2012.

One thing I’ve observed is that the TC2012, despite being configured for AGM batteries, despite the handbook saying it charges AGM batteries to a maximum 14.3V, has a happy knack of applying quite high charging voltages to the batteries.

I’ve verified this… every meter I’ve put across it has reported it at one time or another, more than 15V across the terminals of the charger. I’m using SB50 connectors rated at 50A and short runs of 6G cable to the batteries. So a nice low-resistance path.

The literature I’ve read says 14.8V is the maximum. I think something has gone out of calibration!

This, and the fact that the previous set-up over-discharged the batteries twice, are the factors that led to the early failure of both batteries.

The two new batteries (Century C12-105DA) are now sitting in the battery cases replacing the two Giant Energy batteries, which will probably find themselves on a trip to the Upper Kedron recycling facility in the near future.

The Century batteries were chosen as I needed the replacements right now and couldn’t wait for shipping. This just happened to be what SuperCheap Auto at Keperra sell.

The Giant Energy batteries took a number of weeks to arrive: likely because the seller (who’s about 2 hours drive from me) had run out of stock and needed to order them in (from China). If things weren’t so critical, I might’ve given those batteries another shot, but I really didn’t have the time to order in new ones.

I have disconnected the Xantrex TC2012. I really am leery about using it, having had one bad experience with it now. The replacement batteries cost me $1000. I don’t want to be repeating the exercise.

I have a couple of options:

1. Ditch the idea of using mains power and just go solar.
2. Dig out the Redarc BCDC1225 charger I was using before and hook that up to a regulated power supply.
3. Source a new 20A mains charger to hook in parallel with the batteries.
4. Hook a dumb fixed-voltage power supply in parallel with the batteries.
5. Hook a dumb fixed-voltage power supply in parallel with the solar panel.

Option (1) sounds good, but what if there’s a run of cloudy days? This really is only an option once I get some supervisory monitoring going. I have the current shunts fitted and the TI INA219Bs for measuring those shunts arrived a week or so back, just haven’t had the time to put that into service. This will need engineering time.

Option (2) could be done right now… and let’s face it, its problem was switching from solar to mains. In this application, it’d be permanently wired up in boost mode. Moreover, it’s theoretically impossible to over-discharge the batteries now as the MP3735 should be looking after that.

Option (3) would need some research as to what would do the job. More money to spend, and no guarantee that the result will be any better than what I have now.

Option (4) I’m leery about, as there’s every possibility that the power supply could be overloaded by inrush current to the battery. I could rig up a PWM circuit in concert with the monitoring I’m planning on putting in, but this requires engineering time to figure out.

Option (5) I’m also leery about, not sure how the panels will react to having a DC supply in parallel to them. The MP3735 allegedly can take an input DC supply as low as 9V and boost that up, so might see a 13.8V switchmode PSU as a solar panel on a really cloudy day. I’m not sure though. I can experiment, plug it in and see how it reacts. Research gives mixed advice, with this Stack Exchange post saying yes and this Reddit thread suggesting no.

I know now that the cluster averages about 7A. In theory, I should have 30 hours capacity in the batteries I have now, if I get enough sun to keep them charged.
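The 30-hour figure works out as follows, assuming the two new batteries are 105Ah each (going by the C12-105DA model number) and wired in parallel. The `usable_fraction` parameter is an addition for illustration: in practice you would not run lead-acid batteries completely flat, so the real figure will be lower.

```python
def runtime_hours(battery_ah, battery_count, load_amps, usable_fraction=1.0):
    """Hours of runtime from a parallel battery bank at a constant load."""
    return battery_ah * battery_count * usable_fraction / load_amps


if __name__ == "__main__":
    print(runtime_hours(105, 2, 7))       # 30.0
    print(runtime_hours(105, 2, 7, 0.5))  # 15.0
```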

This, I think, will be a weekend experiment, and maybe something I’ll try this weekend. Right now, the cluster itself is running from my 40A switchmode PSU, and for now, it can stay there.

I’ll let the solar charger top the batteries up from the sun this week. With no load, the batteries should be nice and full, ready come Friday evening, when I can, gracefully, bring the cluster down and hook it up to the solar charger load output. If, at the end of the weekend, it’s looking healthy, I might be inclined to leave it that way.

2017/11/13 by Redhatter (VK4MSL) Computing Projects Public Syndication Solar-powered Cloud Computing 0

Solar Cluster: Remember kids, over-discharge protection matters!

So, yesterday I had this idea of building an IC storage unit to solve a problem I was facing with the storage of IC tubes, and to solve an identical problem faced at HSBNE.

It turned out to be a longish project, and by 11:30PM, I had gotten far, but still had a bit of work to do. Rather than slog it out overnight, I thought I’d head home and resume it the next day. Instead of carting the lot home, and back again, I decided to leave my bicycle trailer with all the project gear and my laptop, stashed at HSBNE’s wood shop.

By the time I had closed up the shop and gotten going, it was after midnight. That said, the hour of day was a blessing: there was practically no traffic, so I rode on the road most of the way, including the notorious Kingsford-Smith Drive. I made it home in record time: 1 hour, 20 minutes. A record that stood until later this morning coming the other way, doing the run in 1:10.

I was exhausted, and was thinking about bed, but wheeling the bicycle up the driveway and opening the garage door, I caught a whiff. What’s that smell? Sulphur??

Remember last post I had battery trouble, so isolated the crook battery and left the “good” one connected?

The charger was going flat chat, and the battery case was hot! I took no chances, I switched the charger off at the wall and yanked the connection to the battery wiring harness. I grabbed some chemical handling gloves and heaved the battery case out. Yep, that battery was steaming! Literally!

This was the last thing I wanted to deal with at nearly 2AM on a Sunday morning. I did have two new batteries, but hadn’t yet installed them. I swapped the one I had pulled out last fortnight, and put in one of the new ones. I wanted to give them a maintenance charge before letting them loose on the cluster.

The other dud battery posed a risk though, with the battery so hot and under high pressure, there was a good chance that it could rupture if it hadn’t already. A shower of sulphuric acid was not something I wanted.

I decided there was nothing running on the cluster that I needed until I got up later today, so left the whole kit off, figuring I’d wait for that battery to cool down.

5AM, I woke up, checked the battery, still warm. Playing it safe, I dusted off the 40A switchmode PSU I had originally used to power the Redarc controller, and plugged it directly into the cluster, bypassing the batteries and controller. That would at least get the system up.

This evening, I get home (getting a lift), and sure enough, the battery has cooled down, so I swap it out with another of the new batteries. One of the new batteries is charging from the mains now, and I’ll do the second tomorrow.

See if you can pick out which one is which…

2017/11/12 by Redhatter (VK4MSL) Computing Projects Public Syndication Solar-powered Cloud Computing 0

Using gdb as a serial log console

So, I’m doing some development on a Cortex M3-based device with access to only one serial port, and that serial port is doing double-duty as serial console and polling a Modbus energy meter. How do I get log messages out?

My code actually implements the stubs to direct stdout and stderr transparently to the serial port; however, this output has to go to /dev/null when the Modbus port is in use. That said, _write_r still gets called, so it is possible to set a breakpoint inside the _write_r function at the point where traffic is identified as being for the console.

As it happens, gdb can be told to not only break there, but to perform a series of actions. In my case serial.c:659 is the file and line number inside an if branch that handles the console code. Setting up gdb to print this data out requires the following statements:

(gdb) break serial.c:659
(gdb) commands
Type commands for breakpoint(s) 3, one per line.
End with a line saying just "end".
>set ((char*)buf)[cnt] = 0
>print (char*)buf
>continue
>end
(gdb) c

The result:

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=78) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$51 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:226 Registration sent\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=46) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$52 = 0x200068c0 "Received NTP time is Mon Nov  6 04:13:41 2017\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=2) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$53 = 0x200068c0 "\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=89) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$54 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:115 Registration timeout: 30 sec\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=83) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$55 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:130 Select source address:\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=53) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$56 = 0x200068c0 " ? fdde:ad00:beef:0:0:ff:fe00:a400 Pref=Y Valid=Y\r\n"

Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=15) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$57 = 0x200068c0 " ? Selected\r\n"

---Type <return> to continue, or q <return> to quit---
Breakpoint 3, _write_r (ptr=, fd=0, buf=0x200068c0, cnt=78) at /home/stuartl/vrt/projects/widesky/hub/hal/src/serial.c:659
659			if (serial_console_target.port) {
$58 = 0x200068c0 "/home/stuartl/vrt/projects/widesky/hub/hal/demo/main.c:226 Registration sent\r\n"

Not as nice as having a dedicated port, but better than nothing.
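One wrinkle with the breakpoint commands above is that `set ((char*)buf)[cnt] = 0` writes a NUL into target memory one byte past the data being logged. A possible alternative, sketched here and untested against this particular target, is to use gdb's Python scripting to read exactly `cnt` bytes without modifying anything. It assumes a gdb built with Python support and that `buf` and `cnt` are visible at `serial.c:659`; the class and helper names are made up.

```python
try:
    import gdb  # only importable when the script is sourced inside gdb
except ImportError:
    gdb = None


def format_console_write(raw):
    """Render captured bytes as text, keeping \\r and \\n visible as escapes."""
    text = raw.decode("ascii", errors="replace")
    return text.encode("unicode_escape").decode("ascii")


if gdb is not None:
    class ConsoleLogBreakpoint(gdb.Breakpoint):
        """Print the console buffer each time the breakpoint is hit."""

        def stop(self):
            buf = int(gdb.parse_and_eval("buf"))
            cnt = int(gdb.parse_and_eval("cnt"))
            raw = gdb.selected_inferior().read_memory(buf, cnt)
            print(format_console_write(bytes(raw)))
            return False  # returning False tells gdb to resume the target

    ConsoleLogBreakpoint("serial.c:659")
```

Saved to a file, this would be loaded with gdb's `source` command before continuing the target.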

2017/11/06 by Redhatter (VK4MSL) Computing Public Syndication 0

Solar Cluster: WTF

So… with the new controller we’re able to see how much current we’re getting from the solar. I note they omit the solar voltage, and I suspect the current is how much is coming out of the MPPT stage, but still, it’s more information than we had before.

With this, we noticed that on a good day, we were getting… 7A.

That’s about what we’d expect for one panel. What’s going on? Must be a wiring fault!

I’ll admit when I made the mounting for the solar controller, I didn’t account for the bend radius in the 6-gauge wire I was using, and found it was difficult to feed it into the controller properly. No worries, this morning at 4AM I powered everything off, took the solar controller off, drilled 6 new holes a bit lower down, fed the wires through and screwed them back in.

Whilst it was all off, I decided I’d individually charge the batteries. So, right-hand battery came first, I hook the mains charger directly up and let ‘er rip. Less than 30 minutes later, it was done.

So, disconnect that, hook up the left hand battery. 45 minutes later the charger’s still grinding away. WTF?

Feel the battery… it is hot! Double WTF?

It would appear that this particular battery is stuffed. I’ve got one good one though, so for now I pull the dud out and run with just the one.

I hook everything up, do some final checks, then power the lot back up.

Things seem to go well… I do my usual post-blackout dance of connecting my laptop up to the virtual instance management VLAN, waiting for the OpenNebula VM to fire up, then log into its interface (because we’re too kewl to have a command line tool to re-start an instance), see my router and gitea instances are “powered off”, and instruct the system to boot them.

They come up… I’m composing an email, hit send… “Could not resolve hostname”… WTF? Wander downstairs, I note the LED on the main switch flashing furiously (as it does on power-up) and a chorus of POST beeps tells me the cluster got hard-power-cycled. But why? Okay, it’s up now, back upstairs, connect to the VLAN, re-start everything again.

About to send that email again… boompa! Same error. Sure enough, my router is down. Wander downstairs, and as I get near, I hear the POST beeps again. Battery voltage is good, about 13.2V. WTF?

So, about to re-start everything, then I lose contact with my OpenNebula front-end. Okay, something is definitely up. Wander downstairs, and the hosts are booting again. On a hunch I flick the off-switch to the mains charger. Klunk, the whole lot goes off. There’s no connection to the battery, and so when the charger drops its power to check the battery voltage, it brings the whole lot down.

WTF once more? I jiggle some wires… no dice. Unplug, plug back in, power blinks on then off again. What is going on?

Finally, I pull right-hand battery out (the left-hand one is already out and cooling off, still very warm at this point), 13.2V between the negative terminal and positive on the battery, good… 13.2V between negative and the battery side of the isolator switch… unscrew the fuse holder… 13.2V between fuse holder terminal and the negative side… but 0V between negative side on battery and the positive terminal on the SB50 connector.

No apparent loose connections, so I grab one of my spares, swap it with the existing fuse. Screw the holder back together, plug the battery back in, and away it all goes.

This is the offending culprit. It’s a 40A 5AG fuse. Bought for its current carrying capacity, not for the “bling factor” (gold conductors).

If I put my multimeter in continuity test mode and hold a probe on each end cap, without moving the probes, I hear it go open-circuit, closed-circuit, open-circuit, closed-circuit. Fuses don’t normally do that.

I have a few spares of these thankfully, but I will be buying a couple more to replace the one that’s now dead. Ohh, and it looks like I’m up for another pair of batteries, and we will have a working spare 105Ah once I get the new ones in.

On the RAM front… the firm I bought the last lot through did get back to me, with some DDR3L ECC SO-DIMMs, again made by Kingston. Sounded close enough, they were 20c apiece more (AU$855 for 6 vs AU$864.50).

Given that it was likely this would be an increasing problem, I thought I’d at least buy enough to ensure every node had two matched sticks in, so I told them to increase the quantity to 9 and to let me know what I owe them.

At first they sent me the updated invoice with the total amount (AU$1293.20). No problems there. It took a bit of back-and-forth before I finally confirmed they had the previous amount I sent them. Great, so into the bank I trundle on Thursday morning with the updated invoice, and I pay the remainder (AU$428.70).

Friday, I get the email to say that product was no longer available. They instead suggested some Crucial modules which were $60 apiece cheaper. Well, when entering a gold mine, one must prepare themselves for the shaft.

Checking the link, I found it: these were non-ECC. 1Gbit×64, not 1Gbit×72 like I had ordered. In any case I was over it, I fired back an email telling them to cancel the order and return the money. I was in no mood for Internet shopper Russian Roulette.

It turns out I can buy the original sticks through other suppliers, just not in the quantities I’m after. So I might be able to buy one or two from a supplier, I can’t buy 9. Kingston have stopped making them and so what’s left is whatever companies have in stock.

So I’ll have to move to something else. It’d be worth buying one stick of the original type so I can pair it with one of the others, but no more than that. I’m in no mood to do this in a few years time when parts are likely to be even harder to source… so I think I’ll bite the bullet and go 16GB modules. Due to the limits on my debit card though, I’ll have to buy them two at a time (~$900AUD each go). The plan is:

Order in two 16GB modules and an 8GB module… take existing 8GB module out of one of the compute nodes and install the 16GB modules into that node. Install the brand new 8GB module and the recovered 8GB module into two of the storage nodes. One compute node now has 32GB RAM, and two storage nodes are now upgraded to 16GB each. Remaining compute node and storage node each have 8GB.
Order in two more 16GB modules… pull the existing 8GB module out of the other compute node, install the two 16GB modules. Then install the old 8GB module into the remaining storage node. All three storage nodes now have 16GB each, both compute nodes have 32GB each.
Order two more 16GB modules, install into one compute node, it now has 64GB.
Order in last two 16GB modules, install into the other compute node.

Yes, expensive, but sod it. Once I’ve done this, the two nodes doing all the work will be at their maximum capacity. The storage nodes are doing just fine with 8GB, so 16GB should mean there’s plenty of RAM for caching.
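As a sanity check on the plan, the four orders can be walked through in a few lines of Python. This assumes every node starts with a single 8GB module, which is how the plan reads; the node names are made up.

```python
# Installed modules (in GB) per node; names are illustrative only.
nodes = {
    "compute1": [8], "compute2": [8],
    "storage1": [8], "storage2": [8], "storage3": [8],
}


def totals():
    """Total installed RAM per node, in GB."""
    return {name: sum(modules) for name, modules in nodes.items()}


# Order 1: two 16GB modules and one 8GB module.
recovered = nodes["compute1"].pop()
nodes["compute1"] += [16, 16]
nodes["storage1"].append(8)          # the brand new 8GB module
nodes["storage2"].append(recovered)  # the one pulled from compute1
after_order_1 = totals()

# Order 2: two more 16GB modules.
recovered = nodes["compute2"].pop()
nodes["compute2"] += [16, 16]
nodes["storage3"].append(recovered)
after_order_2 = totals()

# Orders 3 and 4: two 16GB modules into each compute node.
nodes["compute1"] += [16, 16]
nodes["compute2"] += [16, 16]
final = totals()

print(after_order_1)
print(after_order_2)
print(final)
```

That comes to eight 16GB modules and one 8GB module in total.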

As for virtual machine management… I’m pretty much over OpenNebula. Dealing with libvirt directly is no fun, but at least once configured, it works! OpenNebula has a habit of not differentiating between a VM being powered off (as in, me logging into the guest and issuing a shutdown), and a VM being forcefully turned off by the host’s power getting yanked!

With the former, there should be some event fired off by libvirt to tell OpenNebula that the VM has indeed turned itself off. With the latter, it should observe that one moment the VM is there, and next it isn’t… the inference being that it should still be there, and that perhaps that VM should be re-started.
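The distinction boils down to a small decision table. The sketch below is purely illustrative: it is not OpenNebula's or libvirt's API, and the names are made up. It only captures the logic of telling a guest-initiated shutdown apart from a VM that vanished.

```python
from enum import Enum


class Action(Enum):
    NOTHING = "nothing"
    MARK_POWERED_OFF = "mark powered off"
    RESTART = "restart"


def reconcile(was_running, is_running, saw_shutdown_event):
    """Decide what to do about a VM based on its last and current state."""
    if not was_running or is_running:
        return Action.NOTHING           # nothing has changed for the worse
    if saw_shutdown_event:
        return Action.MARK_POWERED_OFF  # the guest shut itself down cleanly
    return Action.RESTART               # it vanished without saying goodbye
```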

This could be a libvirt limitation too. I’ll have to research that. If it is, then the answer is clear: we ditch libvirt and DIY. I’ll have to research how I can establish a quorum and schedule where VMs get put, but it should be doable without the hassle that OpenNebula has been so far, and without going to the utter tedium that is OpenStack.

2017/11/05 by Redhatter (VK4MSL) Computing Open Source OpenNebula OpenStack Projects Public Syndication Solar-powered Cloud Computing 0
