- chmod 777
- chmod 4755 $file
- setenforce 0
- echo "" | passwd --stdin root
- service iptables stop
- echo 'reboot' > /etc/cron.daily/fix-hanging-db.sh
- curl http://randomwebsite/foo.sh | bash
The last one bugs the crap out of me when good software developers assume this is a valid way to install software (outside of your personal machine).
I struggled with this for a few days before figuring it out, so I’ll post it here in hopes it saves someone a few minutes. When you install puppet and start the puppetmaster (webrick or rack-enabled), it generates an SSL cert for that machine and also generates a CA that you will use to sign all of your clients.
Recent versions of puppet do not add subjectAltNames to the server certificate when it’s generated by the puppetmaster process. This means that if you connect to puppet using any name other than your master’s hostname, you will get a lovely cert mismatch. I posted a question on serverfault about this (here). It looks like the common practice for EC2 in particular is to use a uuid as the certname for each puppet client. This avoids name collisions and problems with hostnames changing every time the instance is rebooted. It’s a little harder to keep track of since uuids aren’t very easy to remember, so caveat emptor.
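As a rough illustration of the uuid-as-certname idea, here is a minimal Python sketch. The function names are made up for this example, and putting the value under `[main]` in puppet.conf is an assumption about how you would wire it in; check it against your own puppet version before relying on it.

```python
import uuid


def new_certname() -> str:
    """Return a random (version 4) UUID as a lowercase string.

    Hypothetical helper: the idea is to generate this once per
    instance and keep it, so the certname survives hostname changes.
    """
    return str(uuid.uuid4())


def puppet_conf_snippet(certname: str) -> str:
    """Render a puppet.conf fragment that pins the certname.

    Assumption: the setting lives under [main]; adjust the section
    to match your own layout.
    """
    return f"[main]\ncertname = {certname}\n"


if __name__ == "__main__":
    print(puppet_conf_snippet(new_certname()))
```

Since nobody is going to remember these, it is probably worth recording the uuid next to the instance’s role somewhere you can search (a wiki page, an instance tag, whatever you already use).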
One of the more hotly debated topics among sysadmins is what to name servers. Some people use this as an outlet for their creativity or pop culture references. Servers named after Lord of the Rings characters, superheroes, and Greek mythology abound. There’s a strong push from those of us who have moved past the ‘clever’ phase of our careers to name machines in a logical, consistent manner. web0X, db0Y, rackXpduY all get bandied around and are debated with often the same fervor as Vim vs Emacs (vim for the record). The truth, sadly, is that all the good names are taken ™©®. The grizzled veterans who’ve done time on a VAX will exclaim, “This naming scheme is crap, let’s just use IPs, they are immutable.” Well, for one, they are wrong: IPs are not immutable. Take a look at EC2, have fun with that. Second, humans are bad at remembering numbers. 10 digits is the longest number most people can retain (why phone numbers are that length), and usually not for very long.
Names also provide a very important psychological edge for our poor meat brains. Names allow us to recall information in a similar way that a key allows you to recall information from a database. A message from your alert system saying ‘Alert gandalf.example.com is DOWN!’ would (in theory) trigger something in your memory. Gandalf is a wizard, that’s the master DNS server! This key isn’t as good as a more meaningful name, but it’s a key nonetheless. I prefer names which are functional and overload information into the rest of the domain. proxy01.atl.example.com tells me very quickly this is a proxy server, it’s one of a multinode cluster, likely load balanced, and is located in Atlanta. All of this allows me to assess the situation at hand faster. All of the pertinent details should be written down in a wiki, or some other document source, but the naming gives me a fast way to access that without having to go look it up. 126.96.36.199 is DOWN only tells me something is broken, not how important it is, how much impact it has, or anything else about it. Maybe that’s a dev box or 1 node in a 40 node cluster, but I don’t know that (unless I just memorize it, which stresses the meat brain) until I look it up.
Consistency is the key. I don’t in general like ‘clever’ names, not because they are unprofessional or silly, but because they only mean something to the person who came up with them. I know why I named the database ‘pearl’ (bonus points to anyone that guesses), but my other team members might not, and someone coming along behind me likely wouldn’t either. I’m a huge fan of code names and clever names for software / service names / etc, just not machine names. Here are some of the conventions I use.
Multinode clusters are numbered with 2 digits, starting at 01.
10 servers in a web cluster, web01 – web10. Using 2 digit precision gives you 99 machines before you end up changing field sizes.
Short hostnames should state the most common functional purpose.
Sometimes it’s ok to call it a server and put more information into the sub domain. ‘Web’ in general sucks: it’s too generic and means very little. What does it ‘do’?
If you think there’ll be more than one, name it 01.
Don’t use a sequential numbering system for unrelated things.
If you have two webservers that serve different content/services/etc don’t name them web01 / web02. This creates a logical grouping of those two machines which are not actually tied together from a service standpoint. I’ve heard of shared filesystems being named fs01, fs02, fs03, fs04, etc. They aren’t related other than that they are all shared filesystems, so why are you grouping them into something that looks like a cluster? People assume that 1 is related to 2 to 3 to 4. Put some thought into it and give it a name based on what it does or what’s important about it.
Use A / B notation for duality relationships.
I name my netapp filers: filer01a / filer01b. They are both addressable services but provide failover for each other. There will never be a ‘c’ since netapp doesn’t support wheel based failover. They are a matched set, so they are named as such. A vs B gives less cardinality than 1 vs 2 and that’s a good thing.
Use subdomains in a consistent manner to produce a lightweight hierarchy of information.
proxy01.www.internal.nyc.example.com lets me denote physical location, security context (internal), content type (www), and functional purpose (proxy) all in one name. Granted this assumes a high degree of machine / service separation and may not work for everything, but you can use that name to store quickly accessible information.
Order is important, remember that.
In English we read left to right. Information is ordered in that direction as well. Put the thing you care about most (or quickest) to the left, and let less immediate information flow to the right. database01.hr.atl tells me it’s a database (important!), it’s part of a cluster (less important than being a db but still relevant), HR database (eeek will I get paid?!), and finally location, which may not matter (atl is a backup site, I can deal with that later). Order frames your response into the correct context. database.atl.hr.clusternode1 tells me this machine is a database (important), in Atlanta (wait that’s the dr site I might not care right away), it’s HR (wait we don’t have a primary b/c it died last week), and that it’s a clusternode. Is this better or worse? Depends on the context, order is important.
The crux of the whole point is that names are useful things. Humans name things not because they want to be clever but because it’s an effective way to partition information about something without having to memorize it all. It comes down to the difference between knowing something and memorizing it. You design a convention and stick to that convention until it doesn’t work, then you redefine that convention. The convention saves you time, but only if everyone ‘gets’ the convention or it can be easily explained. If your convention is a complicated scheme involving lollipop guild chairmen, you are requiring the audience to have immediate intrinsic knowledge of late 1930’s Judy Garland films, which is the same as asking them to look it up.
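To make the conventions above a bit more concrete, here is a small Python sketch that generates two-digit cluster names and pulls the pieces back out of a name like proxy01.atl.example.com. Everything in it (`HostInfo`, `cluster_names`, `parse_hostname`, the regex, the default `example.com` domain) is made up for illustration; it encodes my conventions as I read them, not any standard.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class HostInfo(NamedTuple):
    role: str               # functional purpose, e.g. "proxy"
    index: Optional[int]    # cluster node number, e.g. 1 for "01"
    pair: Optional[str]     # "a" or "b" for matched failover pairs
    labels: Tuple[str, ...]  # subdomain labels, left to right


# role, then optionally a 2 digit index, then optionally an a/b suffix.
# The a/b suffix is only accepted after an index (filer01a, not "weba").
_SHORT = re.compile(r"^(?P<role>[a-z]+)(?:(?P<index>\d{2})(?P<pair>[ab])?)?$")


def cluster_names(role: str, count: int) -> List[str]:
    """Return role01..roleNN using 2 digit, 1-based numbering."""
    if not 1 <= count <= 99:
        raise ValueError("2 digit numbering only covers 01 through 99")
    return [f"{role}{i:02d}" for i in range(1, count + 1)]


def parse_hostname(fqdn: str, domain: str = "example.com") -> HostInfo:
    """Split a hostname into role, index, pair letter and subdomain labels."""
    name = fqdn.lower().rstrip(".")
    suffix = "." + domain
    if name.endswith(suffix):
        name = name[: -len(suffix)]
    short, *labels = name.split(".")
    match = _SHORT.match(short)
    if match is None:
        raise ValueError(f"{short!r} does not follow the convention")
    index = int(match["index"]) if match["index"] else None
    return HostInfo(match["role"], index, match["pair"], tuple(labels))


if __name__ == "__main__":
    print(cluster_names("web", 10))
    print(parse_hostname("proxy01.www.internal.nyc.example.com"))
    print(parse_hostname("filer01a.example.com"))
```

The labels come back in the same left to right order as the name, so whatever you decided was most important is still first. A name that doesn’t fit (say a three digit index) raises an error rather than guessing, which is roughly what you want from a convention.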
As a long time perl guy I was attracted by ruby. It’s very perl-like, and method chaining is extremely useful and intuitive. I like ruby, but the state of ruby applications in a production environment is horrible. There are plenty of really good tools out there for ruby developers. Vagrant, sahara, bundler, capistrano, etc, the list goes on. RVM and rbenv are two really good alternatives for maintaining your development environment in a sane manner. But we are in the stone ages when it comes time to go to production. Distro support for ruby is shaky at best. Most places are still running centos/rhel5, which leaves them with ruby 1.8.5 or, if lucky, ruby 1.8.7. If you’ve upgraded to a rhel6-ish you’re fortunate enough to get, wait… ruby 1.8.7. I’m not as familiar with debian but I’m fairly certain it’s 1.8. Ubuntu has an available 1.9.1 package but that’s officially a beta version, plus most indications are that it’s extremely buggy. As of writing ruby is on 1.9.3. When it comes to rubygems the situation is even worse. Most ‘best practices’ recommend managing everything with gems. This introduces a world of pain, especially when you start installing things that are based on ruby but provide shell level commands (rackup, unicorn, etc). Now you’ve got two package managers trying to determine the state of a system, but one of them only knows about one part of it. It’s a mess.
I’ve been using vi/vim for nearly my entire professional life, and most of my computer life as well. I gave emacs an honest try for a couple of days a few years ago but just couldn’t grok the shortcuts and make it feel natural. Recently I overhauled my setup on my laptop and specifically tuned it to what I generally spend a lot of development time on… puppet.
I’m on the hunt for a good password manager for the iPhone. But there’s a slight catch. I’m looking for something that works with fedora. I’d like to be able to sync it locally as well. There seem to be a couple of things that will sync “to the cloud” but that seems to be a horrible idea for passwords.
Anyone have suggestions?
We are currently going through an ITIL implementation. It’s had its ups and downs and philosophically I don’t really believe in it (certainly not in our implementation), but it’s had a few successes and a few failures. Without droning on too much about it, to make any ‘production’ change you have to file an RFC that gets reviewed by a management team. There is a relatively recent DNS attack that involves using root zone recursion to DoS a target server. We’re vulnerable to being used in this manner. It really doesn’t affect us much, since our servers handle the requests fine, but we’re assisting in a DDoS and that’s not good. For us the fix is pretty straightforward: because of some historical decisions we have to allow recursion for certain IPs, so I need to segment those off into a tighter view and eliminate recursion everywhere else. This is a pretty straightforward change and one that I would do without a second thought (after testing). Due to our current climate of process I have to file an RFC, which is fine, I’m not real happy about it but I’ll live.
However my RFC was denied not because of any technical reason, not because of any concern over the technology, the implementation, or the timing. It was denied because I didn’t put the correct information into the details page and because my dates were wrong. I’m all for doing process right (when it makes sense), but does it make sense to derail a security fix for 4 days because the form was incorrect? Especially when there exists a forum in which you can be asked to clarify anything regarding your RFC.
Now when security takes a backseat to process, your organization has truly begun the descent into failure. This may indeed be the straw…
I’m trying to make a concerted effort to first of all blog a little more, and secondly blog more about sysadmin type stuff. Hopefully that’ll give me a little bit more direction.
We’ve been a big NAS shop for a number of years, actually well before I came on board. We are starting to use SAN more and more nowadays. We have a much more stable SAN fabric (the network side of fiber channel storage for those of you keeping score at home). So I spent several days before the break fighting with various SAN issues. Most of them came down to my lack of experience with our particular SAN implementation as well as the host level tools. The pain of SAN comes largely from the host end. Your SAN device (even in our case with NetApp) is probably pretty good at doing its end and is well documented. But on the linux side SAN is very vendor specific, which always leads to problems. For example if you are using an EMC you have to get supported HBAs, then in some cases run a custom kernel to support that HBA, and then you probably end up needing vendor specific tools for handling things. In my setup I don’t need a custom kernel, but we do have to support a small vendor package of tools. NetApp is actually pretty good when it comes to linux support, they package RPMs in most cases and stay current with versions as far as support.
I recently got back from the LISA (Large Installation System Administration) conference in San Diego. Overall I really enjoy this conference. My employer generally doesn’t spend very much on conferences, at least not for people in my position, so it’s nice that I get to go to this one. There are very few sysadmin specific conferences out there. Velocity seems to have some potential despite it being very Web (2.0) centric. I haven’t been to Velocity so I really can’t comment.