computing

Ta ta, eBay

Entered into an eBay contact form.

Hi, Just a short note.

I am closing my account: the form that asks why didn’t really capture the true reason why I’m closing.

It’s not quite “identity theft”, but it is security-related.

I haven’t been using my eBay account, so I thought I’d set the password to something nice and *strong*. On the password change form, I noticed a 20-character maximum limit.

This was red flag no. 1.

Then I pasted a randomised password from a generator. The site complained I had forbidden characters.

This was red flag no. 2.

By placing limits on the size of password and its content, it is clear to me that eBay is *not* serious about making its systems truly secure, and that breaches like the one experienced recently will be a recurring event.

By hiding behind “proprietary encryption” it isn’t even serious about reassuring the public: good crypto doesn’t need secret algorithms to work well.

As there’s now very little I buy off eBay, I feel the time has come to say goodbye. If you ever do get your act together, I might consider returning, but until then, farewell.

WTF are you doing, Microsoft?

I just checked my email, and see this:

Return-Path: < …>
X-Original-To: …
Delivered-To: …
Received: by atomos.longlandclan.yi.org (Postfix, from userid 0)
	id 67204200E27C; Sun, 13 Apr 2014 23:05:55 +1000 (EST)
Subject: [Fail2Ban] SSH: banned 138.91.144.167 from atomos
Date: Sun, 13 Apr 2014 13:05:55 +0000
From: Fail2Ban < …>
To: …
Message-Id: <20140413130556.67204200E27C@atomos.longlandclan.yi.org>

Hi,

The IP 138.91.144.167 has just been banned by Fail2Ban after
5 attempts against SSH.


Here is more information about 138.91.144.167:


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#


#
# The following results may also be obtained via:
# http://whois.arin.net/rest/nets;q=138.91.144.167?showDetails=true&showARIN=false&ext=netref2
#

NetRange:       138.91.0.0 - 138.91.255.255
CIDR:           138.91.0.0/16
OriginAS:       
NetName:        MICROSOFT
NetHandle:      NET-138-91-0-0-1
Parent:         NET-138-0-0-0-0
NetType:        Direct Assignment
RegDate:        2011-06-22
Updated:        2013-08-20
Ref:            http://whois.arin.net/rest/net/NET-138-91-0-0-1


OrgName:        Microsoft Corp
OrgId:          MSFT-Z
Address:        One Microsoft Way
City:           Redmond
StateProv:      WA
PostalCode:     98052
Country:        US
RegDate:        2011-06-22
Updated:        2013-10-03
Comment:        To report suspected security issues specific to 
Comment:        traffic emanating from Microsoft online services, 
Comment:        including the distribution of malicious content 
Comment:        or other illicit or illegal material through a 
Comment:        Microsoft online service, please submit reports 
Comment:        to:
Comment:        * https://cert.microsoft.com.  
Comment:        
Comment:        For SPAM and other abuse issues, such as Microsoft 
Comment:        Accounts, please contact:
Comment:        * abuse@microsoft.com.  
Comment:        
Comment:        To report security vulnerabilities in Microsoft 
Comment:        products and services, please contact:
Comment:        * secure@microsoft.com.  
Comment:        
Comment:        For legal and law enforcement-related requests, 
Comment:        please contact:
Comment:        * msndcc@microsoft.com
Comment:        
Comment:        For routing, peering or DNS issues, please 
Comment:        contact:
Comment:        * IOC@microsoft.com
Ref:            http://whois.arin.net/rest/org/MSFT-Z

OrgTechHandle: MRPD-ARIN
OrgTechName:   Microsoft Routing, Peering, and DNS
OrgTechPhone:  +1-425-882-8080 
OrgTechEmail:  IOC@microsoft.com
OrgTechRef:    http://whois.arin.net/rest/poc/MRPD-ARIN

OrgAbuseHandle: MAC74-ARIN
OrgAbuseName:   Microsoft Abuse Contact
OrgAbusePhone:  +1-425-882-8080 
OrgAbuseEmail:  abuse@microsoft.com
OrgAbuseRef:    http://whois.arin.net/rest/poc/MAC74-ARIN


#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/whois_tou.html
#

Regards,

Fail2Ban
atomos ~ # grep 138.91.144.167 /var/log/auth.log ; zgrep 138.91.144.167 /var/log/auth.log-20140*.gz
Apr 13 23:05:40 atomos sshd[3143]: Did not receive identification string from 138.91.144.167
Apr 13 23:05:40 atomos sshd[3144]: SSH: Server;Ltype: Version;Remote: 138.91.144.167-1025;Protocol: 2.0;Client: JSCH-0.1.51
Apr 13 23:05:41 atomos sshd[3144]: SSH: Server;Ltype: Kex;Remote: 138.91.144.167-1025;Enc: aes128-ctr;MAC: hmac-md5;Comp: none [preauth]
Apr 13 23:05:41 atomos sshd[3144]: SSH: Server;Ltype: Authname;Remote: 138.91.144.167-1025;Name: support [preauth]
Apr 13 23:05:48 atomos sshd[3144]: Invalid user support from 138.91.144.167
Apr 13 23:05:48 atomos sshd[3144]: Postponed keyboard-interactive for invalid user support from 138.91.144.167 port 1025 ssh2 [preauth]
Apr 13 23:05:49 atomos sshd[3203]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=138.91.144.167 
Apr 13 23:05:51 atomos sshd[3144]: error: PAM: Authentication failure for illegal user support from 138.91.144.167
Apr 13 23:05:51 atomos sshd[3144]: Failed keyboard-interactive/pam for invalid user support from 138.91.144.16  port 1025 ssh2
Apr 13 23:05:51 atomos sshd[3144]: Postponed keyboard-interactive for invalid user support from 138.91.144.167 port 1025 ssh2 [preauth]
Apr 13 23:05:51 atomos sshd[3236]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=138.91.144.167 
Apr 13 23:05:54 atomos sshd[3144]: error: PAM: Authentication failure for illegal user support from 138.91.144.167
Apr 13 23:05:54 atomos sshd[3144]: Failed keyboard-interactive/pam for invalid user support from 138.91.144.16  port 1025 ssh2
Apr 13 23:05:54 atomos sshd[3144]: Received disconnect from 138.91.144.167: 3: com.jcraft.jsch.JSchException: Auth cancel [preauth]

Seriously, some dodgy ISP in Russia or Asia having a crack, I’ll ignore it. But a big company like you? I expect better behaviour.

User interfaces

Well, it seems user interfaces are going around in circles again. Just recently, I’ve been meeting up with the Windows 8.1 style UI. One of my colleagues at work uses this as his main OS, we also have a VM image of Windows Server 2012 R2 (a 180-day evaluation).  One thing I will say, it’s a marginal improvement on Windows 8.  But is it all that new?

win2012r2

Now, the start screen is one of the most divisive aspects of the Windows 8 UI.  Some love it, some hate it.  Me?  Well the first incarnation of it was an utter pigsty, with no categorisation or organisation.  That has improved somewhat now that you can organise tiles into groups.

But hang on, where have I seen that before?

winnt31-1

That looks… familiar… Let me scroll over to the right a bit…

winnt31-2Ohh, how rude of me!  Allow me to introduce you to an old acquaintance…

winnt31-3This, is Windows NT 3.1.  What one might call the first real ancestor of Windows 8.1.  Everything that came before was built on DOS, Windows 8.1 basically calls itself Windows NT 6.3 beneath the UI.  This is the great-great-great-great-great-great-great-great-grandparent of today’s Windows desktop OS.  Yeah, it’s been that many releases: 8.0, 7, Vista, XP, 2000, NT 4, NT 3.51 and NT 3.5 all sit in between.

Now the question comes up, how is it different?  Sure there are similarities, and I will admit I exaggerated these to make a point.  The Program Manager normally looks more like this:

winnt31-4The first thing you’ll notice that the Program manager (the closest you’ll get to the “start screen”) doesn’t normally occupy the full screen, although you can if you ask it to.  In fact, it will share the screen with other things you might have open:

winnt31-5This is good for both novice and power user alike.  The power user is likely wanting to launch some other application and will be thinking about the job at hand.  The novice might have instructions open that they’re trying to follow.  In neither case, do you want to interrupt them with an in-your-face full screen launcher; a desktop computer is not a smartphone.

The shortcoming of the original Program Manager interface in terms of usability was that you were limited to two levels of hierarchy, with the top-level containing only program groups, and the lower level containing only program icons.  The other shortcoming was in switching applications, you either had to know the ALT+TAB shortcut or minimise/restore full-screen applications so you could see the other applications or their icons.  There was no status area either.

Windows 95 improved things in that regard, the start menu could show arbitrary levels and the taskbar provided both a status area and a screen region dedicated to window selection.  As the installer put it, changing applications was “as easy as changing channels on TV”.  (Or words to that effect.  I’ve ran the Windows 95 installer enough times to memorise two OEM keys off-by-heart but not well enough to remember every message word-for-word.)  Windows NT 4.0 inherited this same interface.

This remained largely unchanged until Windows XP, which did re-arrange some aspects of the Start menu.  Windows 2000/ME made some retrograde changes with respect to network browsing (that is, “My Network Places” vs “Network Neighborhood”[sic]) but the general desktop layout was the same.  Windows Vista was the last version to offer the old “classic” menu with it disappearing altogether in Windows 7.  The Vista/7 start menu, rather than opening out across the desktop as you navigated, confined itself to a small corner of the screen.  Windows 8 is the other extreme.

Windows 8, the start screen takes over.  There’s no restore button on this “window”, it’s all or nothing.  Now, before people point me out to be some kind of Microsoft-hater (okay, I do endulge in a bit of Microsoft-bashing, but they ask for it sometimes!) I’d like to point out there are some good points.

The Start screen isn’t a bad initial start point when you first log in, or if you’ve just closed the last application and wish to go someplace else.  It’s a good dashboard, with the ability to display status information.  It’s also good for touch use as the UI elements are well suited to manipulation by even the most chubbiest of digits.

What it is not, is a good launcher whilst you’re in the middle of things.  A more traditional Start menu with an item to open the Start Screen would be better here.  It also does a very poor job of organising applications: something Windows has never been good at.

So I guess you’ve all worked it out by now that I’m no fan of the Windows UIs in general, and this is especially true of Windows 8.1/2012 R2.  Not many UIs have a lower approval rating in my opinion.  The question now of course, is “how would I do things different”?  All very well to have an opinion about what I don’t like, and so far the IT industry has pushed back on Microsoft and said: “We don’t like this”.  The question is, how do they improve.

Contrary to what some might think, my answer is not to roll the UI back to what came in Windows 7.  The following are some mock-ups of some ideas I have decided to share.

Meet a desktop that for now we’ll call the “Skin The Cat” desktop.  The idea is a desktop that provides more than one way to perform typical actions to suit different usage scenarios.  A desktop that follows the mantra… “there’s more than one way to skin a cat”.  (And who hasn’t wanted to go skin one at some point!)  No code has been written, the screenshots here are entirely synthetic, using The Gimp to draw what doesn’t exist.

mockup-desktopSo here, we see the STC desktop, with two text editors going and a web browser (it’s a quiet day for me).  I’ve shown this on a 640×480 screen just for the sake of reducing the amount of pixel art, but this would represent a typical wide-screen form-factor display.

You’ll notice a few things:

  • Rather than having the “task bar” down the bottom, I’ve put it up the left side.
  • The launch bar (as I’ll call it) has two columns, one shows current applications, the other shows that application’s windows.
  • Over on the right, we have our status area, here depicting volume control, WiFi, battery and mail icons, and a clock down the bottom.
  • Bottom left is a virtual desktop pager.  (MacOS X users would call these spaces.)

Why would I break Microsoft tradition and use this sort of layout?  My usual preference is to have the launch bar up the top of the screen rather than down the side, as that’s where your application menus are.  However, monitors are, for better or worse, getting wider rather than taller.  So while there’s plenty of space width-wise, stacking bars horizontally leads to one being forced to peer at their work through the proverbial letter-box slot.  This leaves the full height for the application.

I group the windows by the application that created them.  All windows have a border and title bar, with buttons for maximizing, iconifying (minimising, for you Windows folk) and closing, as well as a window options menu button, which also displays the application’s icon.

So how does this fit the “skin the cat” mantra?  Well the proposal is to support multiple input methods, and to be able to switch between them on the fly.  Thus, icons should be big enough you can get your thumb onto them with reasonable accuracy, most common window operations should be accompanied by keyboard actions, and the controls for a window should be reasonably apparent without needing special guidance.  More importantly, the window management should stay out of the way until the user explicitly requests its attention by means of:

  • clicking on any title bar buttons or panel icons with the mouse
  • tapping on the title bar of a window or on panel icons with one’s finger
  • pressing an assigned key on the keyboard

Tapping on the title bar (which is big enough to get one’s finger on) would enable other gestures momentarily to allow easier manipulation with one’s fingers.  The title bar and borders would be enlarged, and the window manager would respond to flick (close), pinch (restore/iconify or maximise) and slide (move) gestures.

Tapping the assigned keyboard key or keystroke (probably the logo/”Windows”/command key, inference being you wish to do give the window manager a command) would bring up a pop-up menu with common window operations, as well as an option for the launch menu.

Now this is good and well, but how about the launcher?  That was one of my gripes with Windows for many years after all… Well let’s have a look at the launch menu.

mockup-launcherSo here we’ve got a user that uses a few applications a lot.  The launcher provides three views, and the icons are suggestive about what will happen.  What we see is the un-grouped view, which is analogous to the tiles in Windows 8, with the exception that these are just program launchers, no state is indicated here.

The other two views are the category view, and the full-screen view.  These are identical except that the latter occupies the full screen like the present start screen in Windows 8 does.  Optionally the category view (when not full screen) could grow horizontally as the tree is traversed.  Here, we see the full-screen version.  Over on the right is the same list of frequent applications shown earlier.  As you navigate, this would display the program items belonging to that category.

mockup-launcher-fullscrn

Here, we choose the Internet category.  There aren’t any program icons here, but the group does have a few sub-groups.  The “Internet” category name is displayed vertically to indicate where we are.mockup-launcher-fullscrn-inetLooking under Web Browsers, we see this group has no sub-groups, but it does have two program icons belonging to it.  The sub-group, “Web Browsers” is displayed vertically.  Tapping/clicking “Internet” would bring us back up to that level.mockup-launcher-fullscrn-inet-web

In doing this, we avoid a wall of icons, as is so common these days on Android and iOS.  The Back and Home links go up one level, and up to the top respectively.

The launcher would remember what state it was last in for next time you call it up.  So if you might organise specialised groups for given tasks, and have in them the applications you need for that task.  It’d remember you opening that group last time so you’d be able to recall applications as needed.

Window management is another key feature that needs to be addressed.  The traditional GUI desktop has been a cascading one, where dialogue boxes and windows are draw overlapping one another.  Windows 8 was a throw-back to Windows 2.1 in some ways in that the “Modern” (god I hate that name) interface is fundamentally a tiling one.

There are times when this mode of display is the most appropriate, and indeed, I use tiling a lot.  I did try Awesome, an automatic tiling window manager for a while, but found the forced style didn’t suit me.  A user should be able to define divisions on a given desktop and have windows tiled within them.  The model would be similar to how spreadsheet cells are resized and optionally merged.  A user might initiate tiled mode by defining the initial number of rows and columns (which can be added or subtracted from later).  They then can “snap” individual windows to groups of cells, resizing the divisions as required.  Resizing and moving a window would then move in units of one “cell”.

At the request of the user, individual windows can then be “floated” from the cellular layout allowing them to be cascaded.  Multiple windows may also occupy a cell group, with the title bar becoming “tabbed’ (much like in FluxBox) to allow selection of the windows within that cell group.

I haven’t got any code for the above, this is just some ideas I’ve been thinking about for a while, particularly this afternoon.  If I get motivated, we may see a “skin the cat desktop” project come into existence.  For now though, I’ll continue to do battle with what I use now.  If someone (commercial or open-source motivated) wants to try and tackle the above, you’re welcome to.  However, the fact the ideas are documented here means there is prior art (in fact, there is prior art in some of the above), and I’d appreciate a little attribution.

New keys

Hi all,

Well, this week has given the security world quite a shake-up as a result of the OpenSSL Heartbleed bug.  I’ve got no reason to suspect my key has actually been compromised, but two things made me decide to revoke my old key and issue a new one:

  • It was quite old, dating from 2011.  I think I’ll be replacing them every 2~3 years from now on.  (new one after two years, old one revoked a year later)
  • I’ve got no way to really know if my system was penetrated or not, better to be safe than sorry.

The following is my new key (minus the UIDs, I don’t want to encourage the spam):

pub   4096R/10BDE3B7 2014-04-11 [expires: 2017-04-10]
Key fingerprint = 9804 EB67 F914 61DE 967D  A1B0 4DFA 1914 10BD E3B7

Right now I’m going through and changing some of my passwords, just to be on the safe side. Thankfully I use a few different passwords (I have “classes” of passwords which I use for different tasks) so those getting revealed is less of a problem than it may have been.

The other good news is that my SSH keys haven’t been compromised, I have been gradually replacing my old DSA one with new RSA ones, and hadn’t copied the private key to my server yet (although the public one was in authorized_keys).  That, and my CA key used for this site, was kept on another computer, so it should be okay.  I’ve generated new SSL keys just in case though.

The new GPG key is signed with my old one, you can trust that signature if you wish.  Don’t trust anything signed by it after this date however.

Now, to re-train my muscle memory with these new passwords.  (sigh)

Ceph and FlashCache in OpenNebula

Well, lately I’ve been doing some development work with OpenNebula.

We’ve recently deployed a 3-node Ceph cluster which we intend to use as our back-end storage for numerous things: among them being VM storage.  Initially I thought the throughput would be “good enough”, 3 hosts each with gigabit links supplying VM hosts with gigabit backhaul links.

It’d be comparable to typical HDDs, or so I thought.  What I didn’t count on in particular was the random-read latency introduced by round-tripping over the network and overheads.  When I tried Ceph with just libvirt, things weren’t too bad, I was close to saturating my 1Gbps link.  Put two VMs on and again, things hummed along.  Not blistering fast mind you but reasonable.

I got OpenNebula talking to it easy enough.  We’re running the stable version: 4.4.  There are a few things I learned about the way OpenNebula uses Ceph:

  • OpenNebula uses v1-format RBDs (the Ceph default actually)
  • Since v1 RBDs don’t support COW clones, instance images are copied.
  • Copying a 160GB image in triplicate over gigabit Ethernet takes a while, and brought our little cluster to a crawl.

Naturally, we’re looking into beefing up the network links and CPUs on the storage nodes, but I’ve also been looking at ways to reduce the load on the back-end cluster.  One is through caching.  There are a couple of projects out there which allow you to combine two types of storage, using a smaller, faster block device to act as a cache for a larger, slower device.  Two which immediately come to mind: FlashCache and bcache.

bcache is on the TODO list, it has a few more knobs and dials to be able to play with, and shares a single cache device with multiple back-end devices, so might yet be worth investing time in.

Sébastian Han posted a guide on doing RBD caching using FlashCache, and so my work has largely been based on this initial work.  I’ve been hacking up a OpenNebula datastore management and transfer management driver which harnesses FlashCache and the newer v2 RBD format to produce a flexible storage subsystem for OpenNebula.

The basic concept is simple enough:

  • Logical Volume Manager, is used to allocate slices of a SSD to use as cache for back-end RBDs.
  • For non-persistent images, a new copy-on-write clone of the base image is created
  • A flashcache composite device is produced using the LVM volume as cache and the RBD as the backend
  • KVM/QEMU/Xen uses this composite device like a regular disk

The initial attempt worked well for Linux VMs, read performance initially would be between 20MB/sec and 120MB/sec depending on network/storage cluster load.  Subsequent reads would then exceed 240MB/sec.  Write performance was limited to what the cluster could do, unless you used writeback mode at which point speed picked up dramatically.

Windows proved to be a puzzle, it seems some Windows images have an odd way of accessing the disk, and this impacts performance badly.  In many cases, the images were of a sparse nature, with most of the content being in the first 8GB.  So I made sure to allocate 8GB chunks of my SSD, and performed what I call pre-caching: seeding the contents of the SSD with the initial 8GB (or however big the SSD partition is) of the image.

That picks up the initial boot performance by a big margin, at the cost of the image taking a little longer to deploy in the PROLOG stage.

For those who are interested, some early code is available via git.

bcache might be worth a look-in as it has read-ahead caching.  I haven’t done so yet.  I’d like to split the caching subsystem out and have cache drivers much like we have for datastore managers and transfer managers alike.  The same concept would work for iSCSI/CLVM storage or Gluster storage as it does for Ceph.

ceph and stgt

Hi all,

This is more a note to myself on how to configure stgt to talk to a Ceph rbd. Everyone seems to recommend patching tgt-admin: this is simply not necessary. The challenge is the lax way that tgt-admin parses the configuration file.

My scenario: VMWare ESXi virtual machine host, needing to use storage on Ceph.
I have 3 storage nodes running ceph-mon and ceph-osd daemons. They also have a version of tgtd that supports Ceph. (See the ceph-extras repository.)

The /etc/tgt/conf.d/${CLIENT}.conf configuration file. (I’m putting all the targets for ${CLIENT} here.)

# Target naming: iqn.yyyy-mm.backwards.domain.your:client.target
# where yyyy-mm: year and month of target creation
# backwards.domain.your: Your domain name; written backwards.
# client.target: A name for the target, since it's for one client here I name it
# as the client's host name then give the rest some descriptive title.
<target iqn.2014-02.domain.my:my-client.my-target-name>
    driver iscsi
    bs-type rbd
    backing-store pool-name/rbd-name
    initiator-address ip.of.my.client
</target>

For better or worse, I run the tgt daemon on the Ceph nodes themselves. Multipath I’m not sure about at this point, I’ve set up the targets on all of my Ceph nodes so I can connect to any, but I have not tested this yet.

To enable that target:

# tgt-admin -v -e

Then to verify:

# tgt-admin -s

You should see your LUNs listed.

apt repository hell whilst installing mariadb-galera-server on Ubuntu

Hi all,

Not often I have a whinge about something, but this problem has been bugging me of late more than somewhat.  I’m in the process of setting up an OpenStack cluster at work.  Now, as the underlying OS we’ve chosen Ubuntu Linux which is fine.  Ubuntu is a quite stable, reliable and well supported platform.

One of my pet peeves though, is when some package manager decides to get lazy.  Now, those of us who have been around the Linux scene have probably discovered RPM dependency hell… and the smug Debian users who tell us that Debian doesn’t do this.

Ho ho, errm… no, when APT wants to go into dummy mode, it does so with style:

Nov 12 05:32:27 in-target: Setting up python3-update-manager (1:0.186.2) ...
Nov 12 05:32:27 in-target: Setting up python3-distupgrade (1:0.192.13) ...
Nov 12 05:32:27 in-target: Setting up ubuntu-release-upgrader-core 
(1:0.192.13) ...
Nov 12 05:32:27 in-target: Setting up update-manager-core (1:0.186.2) ...
Nov 12 05:32:27 in-target: Processing triggers for libc-bin ...
Nov 12 05:32:27 in-target: ldconfig deferred processing now taking place
Nov 12 05:32:27 in-target: Processing triggers for initramfs-tools ...
Nov 12 05:32:27 in-target: Processing triggers for ca-certificates ...
Nov 12 05:32:27 in-target: Updating certificates in /etc/ssl/certs... 
Nov 12 05:32:29 in-target: 158 added, 0 removed; done.
Nov 12 05:32:29 in-target: Running hooks in /etc/ca-certificates/update.d....
Nov 12 05:32:29 in-target: done.
Nov 12 05:32:29 in-target: Processing triggers for sgml-base ...
Nov 12 05:32:29 pkgsel: installing additional packages
Nov 12 05:32:29 in-target: Reading package lists...
Nov 12 05:32:29 in-target: 
Nov 12 05:32:29 in-target: Building dependency tree...
Nov 12 05:32:30 in-target: 
Nov 12 05:32:30 in-target: Reading state information...
Nov 12 05:32:30 in-target: 
Nov 12 05:32:30 in-target: openssh-server is already the newest version.
Nov 12 05:32:30 in-target: Some packages could not be installed. This may 
mean that you have
Nov 12 05:32:30 in-target: requested an impossible situation or if you are 
using the unstable
Nov 12 05:32:30 in-target: distribution that some required packages have not 
yet been created
Nov 12 05:32:30 in-target: or been moved out of Incoming.
Nov 12 05:32:30 in-target: The following information may help to resolve the 
situation:
Nov 12 05:32:30 in-target: 
Nov 12 05:32:30 in-target: The following packages have unmet dependencies:
Nov 12 05:32:30 in-target:  mariadb-galera-server : Depends: 
mariadb-galera-server-5.5 (= 5.5.33a+maria-1~raring) but it is not going to 
be installed

Mmmm, great, not going to be installed. May I ask why not? No, I’ll just drop to a shell and do it myself then.

Nov 12 05:32:30 in-target: E: Unable to correct problems, you have held 
broken packages.

Now this is probably one of my most hated things about computing, is when a software package accuses YOU of doing something that you haven’t. Excuse me… I have held broken packages? I simply performed a fresh install then told you to do an install!

So let’s have a closer look.

Nov 12 05:32:30 main-menu[20801]: WARNING **: Configuring 'pkgsel' failed 
with error code 100
Nov 12 05:32:30 main-menu[20801]: WARNING **: Menu item 'pkgsel' failed.
Nov 12 05:37:38 main-menu[20801]: INFO: Modifying debconf priority limit from 
'high' to 'medium'
Nov 12 05:37:38 debconf: Setting debconf/priority to medium
Nov 12 05:37:38 main-menu[20801]: DEBUG: resolver (ext2-modules): package 
doesn't exist (ignored)
Nov 12 05:37:40 main-menu[20801]: INFO: Menu item 'di-utils-shell' selected
~ # chroot /target
chroot: can't execute '/bin/network-console': No such file or directory
~ # chroot /target bin/bash

We give it a shot ourselves to see the error more clearly.

root@test-mgmt0:/# apt-get install mariadb-galera-server
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 mariadb-galera-server : Depends: mariadb-galera-server-5.5 (= 
5.5.33a+maria-1~raring) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Fine, so we’ll try installing that instead then.

root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 mariadb-galera-server-5.5 : Depends: mariadb-client-5.5 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
                             Depends: libmariadbclient18 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
                             PreDepends: mariadb-common but it is not going 
to be installed
E: Unable to correct problems, you have held broken packages.

Okay, closer, so we need to install those too. But hang on, isn’t that apt‘s responsibility to know this stuff? (which it clearly does).

Also note we don’t get told why it isn’t going to be installed. It refuses to install the packages, “just because”. No reason given.

We try adding in the deps to our list.

root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5 
mariadb-client-5.5
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 mariadb-client-5.5 : Depends: libdbd-mysql-perl (>= 1.2202) but it is not 
going to be installed
                      Depends: mariadb-common but it is not going to be 
installed
                      Depends: libmariadbclient18 (>= 5.5.33a+maria-1~raring) 
but it is not going to be installed
                      Depends: mariadb-client-core-5.5 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
 mariadb-galera-server-5.5 : Depends: libmariadbclient18 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
                             PreDepends: mariadb-common but it is not going 
to be installed
E: Unable to correct problems, you have held broken packages.

Okay, some more deps, we’ll add those…

root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5 
mariadb-client-5.5 libmariadbclient18
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libmariadbclient18 : Depends: mariadb-common but it is not going to be 
installed
                      Depends: libmysqlclient18 (= 5.5.33a+maria-1~raring) 
but it is not going to be installed
 mariadb-client-5.5 : Depends: libdbd-mysql-perl (>= 1.2202) but it is not 
going to be installed
                      Depends: mariadb-common but it is not going to be 
installed
                      Depends: mariadb-client-core-5.5 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
 mariadb-galera-server-5.5 : PreDepends: mariadb-common but it is not going 
to be installed
E: Unable to correct problems, you have held broken packages.

Wash-rinse-repeat!

root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5 
mariadb-client-5.5 libmariadbclient18 mariadb-common
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libmariadbclient18 : Depends: libmysqlclient18 (= 5.5.33a+maria-1~raring) 
but it is not going to be installed
 mariadb-client-5.5 : Depends: libdbd-mysql-perl (>= 1.2202) but it is not 
going to be installed
 mariadb-common : Depends: mysql-common but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5 
mariadb-client-5.5 libmariadbclient18 mariadb-common libdbd-mysql-perl 
mysql-common
Reading package lists... Done
Building dependency tree       
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libmariadbclient18 : Depends: libmysqlclient18 (= 5.5.33a+maria-1~raring) 
but 5.5.34-0ubuntu0.13.04.1 is to be installed
 mariadb-client-5.5 : Depends: mariadb-client-core-5.5 (>= 
5.5.33a+maria-1~raring) but it is not going to be installed
 mysql-common : Breaks: mysql-client-5.1
                Breaks: mysql-server-core-5.1
E: Unable to correct problems, you have held broken packages.

Aha, so there’s a newer version in the Ubuntu repository that’s overriding ours. Brilliant. Ohh, and there’s a mysql-client binary too, but it won’t tell me what version it’s trying for.

Looking in the repository myself I spot a package named mysql-common_5.5.33a+maria-1~raring_all.deb. That is likely our culprit, so I try version 5.5.33a+maria-1~raring.

root@test-mgmt0:/# apt-get install mariadb-galera-server-5.5 
mariadb-client-5.5 libmariadbclient18 mariadb-common libdbd-mysql-perl 
mysql-common=5.5.33a+maria-1~raring libmysqlclient18=5.5.33a+maria-1~raring 
mariadb-client-core-5.5
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  galera libaio1 libdbi-perl libhtml-template-perl libnet-daemon-perl 
libplrpc-perl

Bingo!

So, for those wanting to pre-seed MariaDB Cluster 5.5, I used the following in my preseed file:

# MariaDB 5.5 repository list - created 2013-11-12 05:20 UTC
# http://mariadb.org/mariadb/repositories/
d-i apt-setup/local3/repository string \
        deb http://mirror.aarnet.edu.au/pub/MariaDB/repo/5.5/ubuntu raring main
d-i apt-setup/local3/comment string \
        "MariaDB repository"
d-i pkgsel/include string mariadb-galera-server-5.5 \
        mariadb-client-5.5 libmariadbclient18 mariadb-common \
        libdbd-mysql-perl mysql-common=5.5.33a+maria-1~raring \
        libmysqlclient18=5.5.33a+maria-1~raring mariadb-client-core-5.5 \
        galera

# For unattended installation, we set the password here
mysql-server mysql-server/root_password select DatabaseRootPassword
mysql-server mysql-server/root_password_again select DatabaseRootPassword

So yeah, next time someone mentions this:

Gentoo: Increasing blood pressure since 1999.

it doesn’t just apply to Gentoo!

GL4Ever FlyTouch III Kernel Sources

Well, some might remember my time with a cheap and nasty Android tablet (some might call these “landfill Android”).  The device packaging did not once even acknowledge the fact that there was GPL’ed software onboard, let alone how one obtains the source.

I discovered it was based around the Vimicro VC0882 SoC.  Turns out, that’s the same as the ViewSonic ViewPad 10e, who do release their kernel sources on their knowledge base.

Thank-you ViewSonic, you have just helped me greatly!  Maybe I should track down one of your tablets and buy one in appreciation.

OpenStack: My foray into cloud computing

I’ve been working with VRT Systems for a few years now. Originally brought in as a software engineer, my role shifted to include network administration duties.

This of course does not phase me, I’ve done network administration work before for charities. There are some small differences, for example, back then it was a single do-everything box running Gentoo hosting a Samba-based NT domain for about 5 Windows XP workstations, now it’s about 20 Windows 7 workstations, a Samba-based NT domain backed by LDAP, and a number of servers.

Part of this has been to move our aging infrastructure to a more modern “private cloud” infrastructure. In the following series, I plan to detail my notes on what I’ve learned through this process, so that others may benefit from my insight.  At this stage, I don’t have all the answers, and there are some things  I may have wrong below.

Planning

The first stage with any such network development (this goes for “cloud”-like and traditional structures) is to consider how we want the network to operate, how it is going to be managed, and what skills we need.

Both my manager and I are Unix-oriented people, in my case I’ll be honest — I have a definite bias towards open source, and I’ll try to assess a solution on technical merit rather than via glossy brochures.

After looking at some commercial solutions, my manager more or less came to the conclusion that a lot of these highly expensive servers are not so magical, they are fundamentally just standard desktops in a small form factor. While we could buy a whole heap of 1U high rack servers, we might be better served by using more standard hardware.

The plan is to build a cluster of standard boxes, in as small form factor as practical, which would be managed at a higher level for load balancing and redundancy.

Hardware: first attempt

One key factor we wanted to reduce was the power consumption. Our existing rack of hardware chews about 1.5kW of power. Since we want to run a lot of virtual machines, we want to make them as efficient as possible. We wanted a small building block that would handle a small handful of VMs, and storing data across multiple nodes for redundancy.

After some research, we wound up with our first attempt at a compute node:

Motherboard: Intel DQ77KB Mini ITX
CPU: Intel Core i3-3220T 2.8GHz Dual-Core
RAM: 8GB SODIMM
Storage: Intel 520S 240GB SSD
Networking: Onboard dual gigabit for cluster, PCIe Realtek RTL8168 adaptor for client-facing network

The plan, is that we’d have many of these, they would pool their storage in a redundant fashion.  The two on-board NICs would be bonded together using LACP and would form a back-end storage network for the nodes to share data.  The one PCIe card would be the “public” face of the cluster and would connect it to the outside world using VLANs.

For the OS, we threw on Ubuntu 12.04 LTS AMD64, and we ran the KVM hypervisor. We then tried throwing this on one of our power meters to see how much power the thing drew. At first my manager asked if the thing was even turned on … it was idling at 10W.

Loaded it on with a few virtual machines, eventually I had 6 VMs going on the thing, ranging from Linux, Windows 2000, Windows XP and a Windows 2008R2 P2V image for one of our customer projects.

The CPU load sat at about 6.0, and the power consumption did not budge above 30W. Our existing boxes drew 300W, so theoretically we could run 10 of these for just one of our old servers.

Management software

Running QEMU VMs from bash scripts is all very well, but in this case we need to be able to give non-technical users use of a subset of the cluster for projects.  I hardly expect them to write bash scripts to fire up KVM over SSH.

We considered a few options: Ganeti, OpenNebula and OpenStack.

Ganeti looked good but the lack of a template system and media library let it down for us, and OpenNebula proved a bit fiddly as well.  OpenStack is a big behemoth and will take quite a bit of research however.

Storage

One factor that stood out like a sore thumb: our initial infrastructure was going to just have all compute nodes, with shared storage between them.  There were a couple of options for doing this such as having the nodes in pairs with DR:BD, using Ceph or Sheepdog, etc… but by far, the most common approach was to have a storage backend on a SAN.

SANs get very expensive very quickly.  Nice hardware, but overkill and over budget.  We figured we should plan for that eventuality, should the need arise, but it’d be a later addition.  We don’t need blistering speed, if we can sustain 160Mbps throughput, that’d probably be fine for most things.

Reading the literature, Ceph looked by far and above the best choice, but it had a catch — you can’t run Ceph server daemons, and Ceph in-kernel clients, on the same host.  Doing so you run the risk of a deadlock, in much the same manner as NFS does when you mount from localhost.

OpenStack actually has 3 types of storage:

  • Ephemeral storage
  • Block storage
  • Image storage

Ephemeral storage is specific to a given virtual machine.  It often lives on the compute node with the VM, or on a back-end storage system, and stores data temporarily for the life of a virtual machine instance.  When a VM instance is created, new copies of ephemeral block devices are created from images stored in image storage.  Once the virtual machine is terminated, these ephemeral block devices are deleted.

Block storage is the persistent storage for a given VM.  Say you were running a mail server … your OS and configuration might exist on a ephemeral device, but your mail would sit on a block device.

Image storage are simply raw images of block devices.  Image storage cannot be mounted as a block device directly, but rather, the storage area is used as a repository which is read from when creating the other two types of storage.

Ephemeral storage in OpenStack is managed by the compute node itself, often using LVM on a local block device.  There is no redundancy as it’s considered to be temporary data only.

For block storage, OpenStack provides a service called cinder.  This, at its heart, seems to use LVM as well, and exports the block devices over iSCSI.

For image storage, OpenStack has a redundant storage system called swift.  The basis for this seems to be rsync, with a service called swift-proxy providing a REST-interface over http.  swift-proxy is very network intensive, and benefits from hardware such as high-speed networking (e.g. 10Gbps Ethernet).

Hardware: second attempt

Having researched how storage works in OpenStack somewhat, it became clear that one single building block would not do.  There would in fact be two other types of node: storage nodes, and management nodes.

The storage nodes would contain largish spinning disks, with software maintaining copies and load balancing between all nodes.

The management nodes would contain the high-speed networking, and would provide services such as Ceph monitors (if we use Ceph), swift-proxy and other core functions.  RabbitMQ and the core database would run here for example.

Without the need for big storage, the compute nodes could be downsized in disk, and expanded in RAM.  So we now had a network that looked like this:

Node Type Compute Management Storage
Motherboard: Intel DQ77KB Mini ITX Intel DQ77MH Micro ATX
CPU: Intel Core i3-3220T 2.8GHz Dual-Core
RAM: 2*8GB SODIMM 2*4GB DIMM
Storage: Intel 520S 60GB SSD Intel 520S 60GB SSD for OS, 2*Seagate ST3000VX000-1CU1 3TB HDDs for data
Networking: Onboard dual gigabit for cluster, PCIe Realtek RTL8168 adaptor for client-facing network Onboard dual gigabit for management, PCIe 10GbE for cluster communications Onboard dual gigabit for cluster, PCIe Realtek RTL8168 adaptor for management

The management and storage nodes are slightly tweaked versions of what we use for compute nodes. The motherboard is basically the same chipset, but capable of taking larger PCIe cards and using a standard ATX power supply.

Since we’re not storing much on the compute nodes, we’ve gone for 60GB SSDs rather than 240GB SSDs to cut the cost down a little. We might have to look at 120GB SSDs in newer nodes, or maybe look at other options, as Intel seem to have discontinued the 60GB 520S … bless them! The Intel 520S SSDs were chosen due to the 5-year warranty offered.

The management and storage nodes, rather than going into small Mini-ITX media-centre style cases, are put in larger 2U rackmount cases. These cases have room for 4 HDDs, in theory.

Deployment

For testing purposes, we got two of each node. This allows us to try out things like testing what would happen if a node went belly up by yanking its power, and to test load balancing when things are working properly.

We haven’t bought the 10GbE cards at this stage, as we’re not sure exactly which ones to get (we have a Cisco SG500X switch to plug them into) and they’re expensive.

The final cluster will have at least 3 storage nodes, 3 management nodes and maybe as many as 16 compute nodes. I say at least 3 storage nodes — in buying the test hardware, I accidentally ordered 7 cases, and so we might decide to build an extra storage node.

Each of those gives us 6TB of storage, and the production plan is to load balance with a replica on at least 3 nodes… so we can survive any two going belly up. The disks also push close to 800Mbps throughput, so with 3 nodes serving up data, that should be enough to saturate the dual-gigabit link on the compute node. 4 nodes would give us 8TB of effective storage.

With so many nodes though, one problem remains, deploying the configuration and managing it all. We’re using Ubuntu as our base platform, and so it makes sense to tap into their technologies for deployment.

We’ll be looking to use Ubuntu Cloud and Juju to manage the deployment.

Ubuntu Cloud itself is a packaged version of OpenStack.  The components of OpenStack are deployed with Juju.  Juju itself can deploy services either to “public clouds” like Amazon AWS, or to one’s own private cluster using Ubuntu MAAS (Metal As A Service).

Metal As a Service itself basically is a deployment system which installs and configures Ubuntu on network-booting clients for automatic installation and configuration.

The underlying technology is based on a few components: dnsmasq DHCP/DNS server, tftp-hpa TFTP server, and the configuration gets served up to the installer via a web service API.  There’s a web interface for managing it all.  Once installed, you then deploy services using Juju (the word juju apparently translates to “magic”).

Further research

So having researched what hardware will likely be needed, I need to research a few things.

Firstly, the storage mechanism, we can either go with the pure OpenStack approach with cinder managing LVM based storage and exporting over iSCSI, or we get cinder to manage a Ceph back-end storage cluster.  This decision has not yet been made.  My two biggest concerns with cinder are:

  • Does cinder manage multiple replicas of block storage?
  • Does cinder try to load-balance between replicas?

With image storage, if we use Ceph, we have two choices.  We can either:

  • Install Swift on the storage nodes, partition the drives and use some of the storage for Swift, and the rest for Ceph… with Swift-proxy on the management nodes.
  • Install Rados Gateway on the management nodes in place of Swift

But which is the better approach?  My understanding is that Ceph doesn’t fully integrate into the OpenStack identity service (called keystone).  I need to find out if this matters much, or whether splitting storage between Swift and Ceph might be better.

Metal As a Service seems great in concept.  I’ve been researching OpenStack and Ceph for a few months now (with numerous interruptions), and I’m starting to get a picture as to how it all fits together.  Now the next step is to understand MAAS and Juju.  I don’t mind magic in entertainment, but I do not like it in my systems.  So my first step will be to get to understand MAAS and Juju on a low level.

Crucially, I want to figure out how one customises the image provided by MAAS… in particular, making sure it deploys to the 60GB SSD on each node, and not just the first block device it sees.

The storage nodes have their two 6Gbps SATA ports connected to the 3TB HDDs for performance, making them visible as /dev/sda and /dev/sdb — MAAS needs to understand that the disk it needs to deploy to is called /dev/sdc in this case.  I’d also perfer it to use XFS rather than EXT4, and a user called something other than “ubuntu”.  These are things I’d like to work out how to configure.

As for Juju, I need to work out exactly what it does when it “bootstraps” itself.  When I tried it last, it randomly picked a compute node.  I’d be happier if it deployed itself to the management node I ran it from.  I also need to figure out how it picks out nodes and deploys the application.  My quick testing with it had me asking it to deploy all the OpenStack components, only to have it sit there doing nothing… so clearly I missed something in the docs.  How is it supposed to work?  I’ll need to find out.  It certainly isn’t this simple.