- chmod 777
- chmod 4755 $file
- setenforce 0
- echo '' | passwd --stdin root
- service iptables stop
- echo 'reboot' > /etc/cron.daily/fix-hanging-db.sh
- curl http://randomwebsite/foo.sh | bash
The last one bugs the crap out of me when good software developers assume this is a valid way to install software (outside of your personal machine).
I struggled with this for a few days before figuring it out, so I’ll post it here in hopes it saves someone a few minutes. When you install Puppet and start the puppetmaster (WEBrick or Rack-enabled), it generates an SSL cert for that machine and also generates a CA that you will use to sign all of your clients.
Recent versions of Puppet do not add subjectAltNames to the server certificate when it’s generated by the puppetmaster process. This means that if you connect to Puppet using any name other than the master’s hostname, you will get a lovely cert mismatch. I posted a question on Server Fault about this. It looks like the common practice for EC2 in particular is to use a UUID as the certname for each Puppet client. This avoids name collisions and problems with hostnames changing every time the instance is rebooted. UUIDs aren’t very easy to remember, so it’s a little harder to keep track of clients; caveat emptor.
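As a rough illustration of the UUID-as-certname approach, here is a minimal Python sketch. It assumes you generate the name once (say, at first boot) and persist it so the client keeps the same identity across reboots. The function name and the state file location are hypothetical; the resulting value is what you would put in the `certname` setting of the client's puppet.conf.

```python
import uuid
from pathlib import Path


def get_or_create_certname(state_file: Path) -> str:
    """Return a stable UUID certname, generating it on the first call.

    state_file is a hypothetical location where the name is persisted
    so that it survives reboots and hostname changes.
    """
    if state_file.exists():
        # Reuse the existing identity instead of minting a new one.
        return state_file.read_text().strip()

    # str(uuid4()) is lowercase hex with dashes, which suits a certname.
    certname = str(uuid.uuid4())
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(certname + "\n")
    return certname
```

Calling it twice with the same path returns the same name, which is the whole point: the identity belongs to the instance, not to whatever hostname it happens to have today.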
One of the more hotly debated topics among sysadmins is what to name servers. Some people use this as an outlet for their creativity or pop culture references. Servers named after Lord of the Rings characters, superheroes, and Greek mythology abound. There’s a strong push from those of us who have moved past the ‘clever’ phase of our careers to name machines in a logical, consistent manner. web0X, db0Y, rackXpduY all get bandied around and are debated with often the same fervor as Vim vs Emacs (Vim, for the record). The truth, sadly, is that all the good names are taken ™©®. The grizzled veterans who’ve done time on a VAX will exclaim, “This naming scheme is crap, let’s just use IPs, they are immutable.” Well, for one, they are wrong: IPs are not immutable. Take a look at EC2 and have fun with that. Second, humans are bad at remembering numbers; 10 digits is about the longest number most people can retain (which is why phone numbers are that length), and usually not for very long.
Names also provide a very important psychological edge for our poor meat brains. Names allow us to recall information in much the same way that a key allows you to recall information from a database. A message from your alert system saying ‘Alert gandalf.example.com is DOWN!’ would (in theory) trigger something in your memory. Gandalf is a wizard, that’s the master DNS server! This key isn’t as good as a more meaningful name, but it’s a key nonetheless. I prefer names which are functional and overload information into the rest of the domain. proxy01.atl.example.com tells me very quickly that this is a proxy server, that it’s one node of a multinode cluster (likely load balanced), and that it is located in Atlanta. All of this allows me to assess the situation at hand faster. All of the pertinent details should be written down in a wiki or some other document source, but the naming gives me a fast way to access them without having to go look them up. ‘22.214.171.124 is DOWN’ only tells me something is broken, not how important it is, how much impact it has, or anything else about it. Maybe that’s a dev box or 1 node in a 40 node cluster, but I don’t know that (unless I just memorize it, which stresses the meat brain) until I look it up.
Consistency is the key. In general I don’t like ‘clever’ names, not because they are unprofessional or silly, but because they only mean something to the person who came up with them. I know why I named the database ‘pearl’ (bonus points to anyone that guesses), but my other team members might not, and someone coming along behind me likely wouldn’t either. I’m a huge fan of code names and clever names for software, service names, etc., just not machine names. Here are some of the conventions I use.
Multinode clusters are numbered with two digits, starting at 01.
10 servers in a web cluster, web01 – web10. Using 2 digit precision gives you 99 machines before you end up changing field sizes.
The short hostname is the machine’s most common functional purpose.
Sometimes it’s ok to call it a server and put more information into the subdomain. ‘Web’ in general sucks: it’s too generic and means very little. What does it ‘do’?
If you think there’ll be more than one, name it 01.
Don’t use a sequential numbering system for unrelated things.
If you have two webservers that serve different content/services/etc, don’t name them web01 / web02. This creates a logical grouping of two machines which are not actually tied together from a service standpoint. I’ve heard of shared filesystems being named fs01, fs02, fs03, fs04, etc. They aren’t related other than that they are all shared filesystems, so why are you grouping them into something that looks like a cluster? People assume that 1 is related to 2 to 3 to 4. Put some thought into it and give it a name based on what it does or what’s important about it.
Use A / B notation for duality relationships.
I name my NetApp filers filer01a / filer01b. They are both addressable services but provide failover for each other. There will never be a ‘c’ since NetApp doesn’t support wheel-based failover. They are a matched set, so they are named as such. A vs B gives less cardinality than 1 vs 2, and that’s a good thing.
Use subdomains in a consistent manner to produce a lightweight hierarchy of information.
proxy01.www.internal.nyc.example.com lets me denote physical location, security context (internal), content type (www), and functional purpose (proxy) all in one name. Granted this assumes a high degree of machine / service separation and may not work for everything, but you can use that name to store quickly accessible information.
Order is important, remember that.
In English we read left to right. Information is ordered in that direction as well. Put the thing you care about most (or need quickest) to the left, and let less immediate information flow to the right. database01.hr.atl tells me it’s a database (important!), it’s part of a cluster (less important than being a db but still relevant), it’s the HR database (eeek, will I get paid?!), and finally the location, which may not matter (atl is a backup site, I can deal with that later). Order frames your response into the correct context. database.atl.hr.clusternode1 tells me this machine is a database (important), in Atlanta (wait, that’s the DR site, I might not care right away), it’s HR (wait, we don’t have a primary b/c it died last week), and that it’s a cluster node. Is this better or worse? It depends on the context; order is important.
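The conventions above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not a tool anyone should treat as canonical: the `HostName` type and the `cluster_names` and `parse` helpers are hypothetical names, and the pattern only understands the role + two-digit number + optional a/b shape described here.

```python
import re
from typing import NamedTuple, Optional, Tuple


class HostName(NamedTuple):
    role: str                # functional purpose, e.g. "proxy"
    index: Optional[int]     # cluster node number, None if unnumbered
    pair: Optional[str]      # "a" or "b" for matched pairs, else None
    context: Tuple[str, ...] # remaining labels, most immediate first


# role, then optionally two digits, then optionally an a/b pair marker.
# The pair marker is only allowed after a number, so "db" stays a role.
_SHORT = re.compile(r"^([a-z]+)(?:(\d{2})([ab])?)?$")


def cluster_names(role: str, count: int, domain: str) -> list:
    """Generate two-digit node names starting at 01, e.g. web01..web10."""
    if not 1 <= count <= 99:
        raise ValueError("two digits only cover 01 through 99")
    return [f"{role}{n:02d}.{domain}" for n in range(1, count + 1)]


def parse(fqdn: str) -> HostName:
    """Split a name into the fields the convention encodes, left to right."""
    short, *rest = fqdn.lower().split(".")
    match = _SHORT.match(short)
    if match is None:
        raise ValueError(f"{short!r} does not follow the convention")
    role, number, pair = match.groups()
    return HostName(role, int(number) if number else None, pair, tuple(rest))


if __name__ == "__main__":
    print(cluster_names("web", 3, "atl.example.com"))
    # ['web01.atl.example.com', 'web02.atl.example.com', 'web03.atl.example.com']
    print(parse("filer01b.example.com"))
    # HostName(role='filer', index=1, pair='b', context=('example', 'com'))
```

The parser deliberately keeps the subdomain labels in their original order rather than assigning them meanings, since which label is location and which is security context is something each site decides for itself.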
The crux of the whole point is that names are useful things. Humans name things not because they want to be clever but because it’s an effective way to partition information about something without having to memorize it all. It comes down to the difference between knowing something and memorizing it. You design a convention and stick to that convention until it doesn’t work, then you redefine that convention. The convention saves you time, but only if everyone ‘gets’ the convention or it can be easily explained. If your convention is a complicated scheme involving Lollipop Guild chairmen, you are requiring the audience to have immediate intrinsic knowledge of late-1930s Judy Garland films, which is the same as asking them to look it up.
I’m trying to make a concerted effort to first of all blog a little more, and secondly blog more about sysadmin type stuff. Hopefully that’ll give me a little bit more direction.
We’ve been a big NAS shop for a number of years, since well before I came on board. We are starting to use SAN more and more nowadays, now that we have a much more stable SAN fabric (the network side of Fibre Channel storage, for those of you keeping score at home). So I spent several days before the break fighting with various SAN issues. Most of them came down to my lack of experience with our particular SAN implementation as well as the host-level tools. The pain of SAN comes largely from the host end. Your SAN device (even in our case, with NetApp) is probably pretty good at doing its end and is well documented. But on the Linux side, SAN is very vendor specific, which always leads to problems. For example, if you are using an EMC you have to get supported HBAs, then in some cases run a custom kernel to support that HBA, and then you probably end up needing vendor-specific tools for handling things. In my setup I don’t need a custom kernel, but we do have to support a small vendor package of tools. NetApp is actually pretty good when it comes to Linux support: they package RPMs in most cases and stay current with the versions they support.
I recently got back from the LISA (Large Installation System Administration) conference in San Diego. Overall I really enjoy this conference. My employer generally doesn’t spend very much on conferences, at least not for people in my position, so it’s nice that I get to go to this one. There are very few sysadmin-specific conferences out there. Velocity seems to have some potential despite being very Web (2.0) centric, but I haven’t been to Velocity so I really can’t comment.
Whenever I discuss configuration management with anyone who is new to the concept, and even with some people who have been doing it for a while, there’s one concept that comes up that I have to argue about incessantly: concatenation. Basically, people want one stub of a file to be global, another stub to affect only a particular subset of machines, another stub to affect a different subset, and finally a stub that’s host specific.