The Guru College
Cider Update
I have racked the cider from the primary to my secondary fermenter, and everything went well. I took a little cider out to sample, and while it’s a little harsher than the first batch due to its relative youth, it’s a bit sweeter, and already slightly carbonated. I’m going to have to be careful about how and when I bottle – I think I’m going to let it sit under its airlock in the secondary for at least three weeks before bottling.
I did have one major screw-up, though. After I’d racked the cider, I went to clean the Better Bottle I used as a primary vessel – and I forgot the water in the pot was very hot. Not boiling, but hot. I poured about a gallon of it into the bottle, and it instantly started to deform. From what I can find, I haven’t destroyed a $30 Better Bottle, but it’s going to look ugly for the rest of its days. I’m just going to chalk it up as another lesson learned in my journey.
Relax! You Are Behind Firewalls
I’ve noticed something interesting and stupid about coffee shop firewalls. They are configured to disallow a lot of handy IPv4 traffic, like SSH, while still allowing through web traffic, DNS, Kerberos, OpenAFS, etc. However, they don’t appear to have any blocks in place for IPv6 SSH traffic. How do I know?
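The test is easy to repeat: try to reach port 22 on a dual-stacked machine you control over both protocols and see which one gets through. Here’s a rough Perl sketch of that check – the hostname is a placeholder, and it assumes the IO::Socket::INET6 module from CPAN for the IPv6 side:

```perl
#!/usr/bin/perl
# Rough check: can we reach a host's SSH port (22) over IPv4 and over IPv6?
# The hostname is a placeholder -- point it at a dual-stacked box you control.
use strict;
use warnings;
use Socket qw(AF_INET6);
use IO::Socket::INET;
use IO::Socket::INET6;    # CPAN module, used for the IPv6 connection

my $host = 'shell.example.com';

my $v4 = IO::Socket::INET->new(
    PeerAddr => $host,
    PeerPort => 22,
    Proto    => 'tcp',
    Timeout  => 5,
);
print 'IPv4 ssh: ', $v4 ? "reachable\n" : "blocked or filtered\n";

my $v6 = IO::Socket::INET6->new(
    PeerAddr => $host,
    PeerPort => 22,
    Proto    => 'tcp',
    Domain   => AF_INET6,
    Timeout  => 5,
);
print 'IPv6 ssh: ', $v6 ? "reachable\n" : "blocked or filtered\n";
```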
Handy trick, until they start deploying IPv6-aware firewall devices (or blocking IPv6 totally, which would be a bad, bad thing).
Cider, Round Two
Looks like I’m well past cider-making season, and yet, I put my next batch in to brew on the 3rd of January. It’s a lot easier the second time around, as you know more or less what you are doing, and aren’t stopping to ask Google questions every time you walk back into the room. When do I sanitize? When do I clean? Really, taking the airlock out means cleaning it again? Is Star-San a no-rinse solution? How do I increase alcohol content?
The airlock, at work
So the second time is easier, but in a way, a lot harder: for the first batch, I had gone to Asheville, NC, during apple season, and gotten two gallons of pure, unfiltered, unpasteurized apple cider. $10 a gallon. This time, pure cider is effectively unobtainable. So, I went to the local grocery store and picked up two gallons of “All natural” cider – the only one I could find that listed only apple juice as an ingredient. Yes, it’s pasteurized, but it doesn’t have sulfites added for stabilization, it doesn’t have color added, etc. It’s just juice.
So, it’s in the primary. Two weeks or so to go until it’s ready to rack over. I think I’ll try making beer next.
More Thoughts On OpenAFS, Part II
There actually is a significant problem running OpenAFS at home that I am well aware of, [and didn’t mention in my last post on the topic][1]. I bring it up now, as the casual reader of this blog may not be aware of this technical wrinkle: OpenAFS requires a Kerberos realm be set up for use. As I can’t very well use my employer’s Kerberos infrastructure and still have a self-sufficient OpenAFS deployment at home, I’d also have to set up Kerberos myself (and, by extension, keep up my own DNS and NTP services). It may well be better to go ahead and roll out an OpenLDAP environment too, so that in the future we can have the same accounts available for multiple logins across the house.
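For a sense of scale, the client side of a home realm is just a small krb5.conf – something roughly like the sketch below, with every name made up for illustration:

```
[libdefaults]
    default_realm = HOME.EXAMPLE.ORG

[realms]
    HOME.EXAMPLE.ORG = {
        kdc          = kdc.home.example.org
        admin_server = kdc.home.example.org
    }

[domain_realm]
    .home.example.org = HOME.EXAMPLE.ORG
    home.example.org  = HOME.EXAMPLE.ORG
```

The config isn’t the burden; the burden is running the KDC and keeping DNS and NTP healthy, since Kerberos starts refusing tickets the moment clocks drift too far apart or host lookups break.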
But I get ahead of myself. The problem to solve is storage, however that happens. If other problems come up, and they look interesting enough, and I have the time and the resources to tackle them, great. Otherwise, I’m just here to solve storage (and this does limit the appeal of OpenAFS).
[1]: http://www.gurucollege.net/blog/technology/more-thoughts-on-openafs/
More Thoughts On OpenAFS
<a href="http://www.gurucollege.net/blog/technology/replicated-shared-nothing-file-systems">In my last post about file systems</a>, I talked about options that were replicated, shared-nothing, and distributed. One of the options that came up was OpenAFS, and while it doesn't yet support replicated read/write volumes, it does have a lot going for it, including a healthy perl module that allows a lot of operations to be done from scripts, and the fact that I administer OpenAFS at work, so I have a passing familiarity with it.</p>
OpenAFS also allows seamless mounting of logical volumes anywhere inside the filesystem root, and allows for the live migration of volumes between storage nodes with no user downtime – users are still able to write to live volumes while they are being moved. This in turn creates an opportunity for some neat tricks: if one of the nodes is set up with smaller but fast SSDs, frequently accessed volumes could be migrated to the faster node seamlessly, and less frequently used volumes could be moved off to slower, cheaper drives.
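Here’s a rough sketch of what that tiering trick could look like from a script, driving the stock vos command rather than the Perl module mentioned above – every server, partition, and volume name below is made up:

```perl
#!/usr/bin/perl
# Rough sketch: shuffle a frequently used volume onto an SSD-backed file
# server and push a cold one back to bulk disk, using plain vos commands.
# Server, partition, and volume names are made up for illustration.
use strict;
use warnings;

my $ssd_server  = 'afs-ssd.example.org';
my $bulk_server = 'afs-bulk.example.org';
my $partition   = 'a';    # i.e. /vicepa on both servers

sub move_volume {
    my ( $volume, $from, $to ) = @_;
    # vos move is transparent to clients; the volume stays writable while
    # it is being copied to the new server.
    my @cmd = (
        'vos', 'move',
        '-id',            $volume,
        '-fromserver',    $from,
        '-frompartition', $partition,
        '-toserver',      $to,
        '-topartition',   $partition,
    );
    system(@cmd) == 0 or warn "vos move of $volume failed: $?\n";
}

# Volumes that get hit every day go to the fast node...
move_volume( 'user.photos', $bulk_server, $ssd_server );

# ...and rarely touched volumes go back to cheap spinning disk.
move_volume( 'archive.2009', $ssd_server, $bulk_server );
```

Deciding which volumes count as “hot” could lean on the daily access counts that vos examine reports, but that’s a refinement for another day.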
OpenAFS still doesn’t give me one of the features I really want, however: real-time, distributed protection of data between nodes. It seems essential at this point that I have the ability to shut down any node at any time and not suffer any outages. This is a tall order, I know, but this is the modern world of technology, damnit.
Replicated, shared-nothing file systems
I’m starting to get very tired of Sun/Oracle and their long-term stance on Solaris and ZFS. While I understand they own the technology (which they invented and implemented), I bought into OpenSolaris with the idea that it was… well, Open. It’s feeling less and less open now, especially with the shutdown, for all intents and purposes, of the community, and the rebranding of “OpenSolaris” as “Solaris 11 Express”.
Thankfully my file server is rocking along, and I’m backing up everything with CrashPlan, just in case, but it’s time to start looking at a new storage infrastructure. Learning from the lesson of my current setup – a single, expensive, monolithic storage server – I think it’s time to expand horizontally with low-cost GigE storage nodes, and build a distributed environment from that. The requirements are:
- It has to be truly open
- All data is replicated in real time between nodes
- Upgrades are seamless
- Good diagnostics and self-healing
I’m honestly not sure this is possible. I’d also like to move in the direction of Shared Nothing – so I don’t have to invest in expensive quorum devices, or fenced/multi-homed storage. I’d like to be able to build nodes on the integrated Intel Atom boards, so the motherboard, CPU, power supply and RAM would cost about $160/node, leaving most of the rest of the budget for hard drive purchases. I’d also like the ability to boot from the network via PXE, or failing that, from a USB drive or CF adapter. I do have 8 drives to re-use from my current OpenSolaris servers, but they don’t come out to play until I have at least 3.5 TB of usable disk space where I can move everything and test for a little while.
There is a final requirement – the storage must be natively accessible from Mac OS X Lion. My definition of native in this case is a filesystem that Aperture thinks is local enough to load an Aperture Library from. (It’s very picky, and refuses to use NFS, SMB, or even Apple’s own AFP.) So, this could be iSCSI, but it could also be MacFUSE, depending on how well that works, or another technology that has tapped into the VFS layer of Mac OS X.
I’m currently looking into GlusterFS, Ceph and the filesystem side of Hadoop. They seem the best suited in terms of technical architecture, but I’m not sure they are going to work for my needs in practice. The other contender is OpenAFS – I use it at work, and I know it will work – but without replicated read/write volumes, it’s a non-starter for me.
VMWare Perl SDK and Runtime Issues
If you are using the VMWare Perl SDK, you need to avoid calling `find_entity_views` unless you also call it with a property filter, such as `properties => ['vm', 'summary']`. This keeps the API from gathering all the available information about each node it traverses. There is a lot of data, and some of it (like performance counters) is generated each time it is asked for, which makes runtime operations very slow. My script to parse VMWare info went from an 8-minute run to less than 20 seconds after properly filtering down the property list.
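For the curious, the shape of the fix is something like this – a minimal sketch built on the standard VMware::VIRuntime setup, with the property list cut down to just what gets printed (the properties here are illustrative, not the ones from my actual script):

```perl
#!/usr/bin/perl
# Minimal sketch: list VMs and their power state without pulling the entire
# property set (performance counters included) for every object traversed.
use strict;
use warnings;
use VMware::VIRuntime;

Opts::parse();
Opts::validate();
Util::connect();

# The 'properties' argument is the important part: without it,
# find_entity_views retrieves every property of every matching view.
my $vms = Vim::find_entity_views(
    view_type  => 'VirtualMachine',
    properties => [ 'name', 'runtime.powerState' ],
);

for my $vm ( @{$vms} ) {
    # powerState usually comes back as an enum object; its string lives in ->val
    my $state = $vm->get_property('runtime.powerState');
    printf "%-40s %s\n", $vm->name, ref($state) ? $state->val : $state;
}

Util::disconnect();
```

Run it with the usual connection options (`--server`, `--username`, `--password`), and the difference in wall-clock time against an unfiltered call is hard to miss.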