Archive for the ‘management’ Category

How To Hire A Sysadmin, Part II

October 27, 2009 padraic2112 3 comments

As promised, albeit a bit later than originally scheduled, the second part of “How To Hire A Sysadmin”, with the Second Question to ask a potential hire:

“You are sitting at your desk one day when the Chief Operations Officer shows up out of nowhere and says, ‘I believe that my assistant Frank has been communicating company secrets to competitors.  We let Frank go this morning.  I want a copy of all Frank’s files and his email put into the Operations share on the file server so that the Ops group can go through it to see what the damage is.’  What do you say?”

I’ll give you a hint: the best answer is a very quick and unequivocal “I’m sorry, Dave, I can’t do that.”  The second (almost as acceptable) reply is, “I’m afraid you’ll have to have that okayed by the office of legal counsel.”  [edited to add]  This actually can go either way in terms of what’s the “best” answer.  I’m getting ahead of myself.

I’ll give you another hint: hiring someone who answers the question correctly may turn out to be a pain in your butt later.  It’s for your own good.

In spite of the fact that (at least in the State of California) your employees’ email accounts are regarded as property of the company, the fact still remains that you (in this case The Company or The Organization or whatever) have legal responsibilities when it comes to employee data.  From Harvard Law (just one example):

The general idea has been that the employer owns the equipment, and can therefore set the terms of its use. Even under current law, which has been deferential to employer monitoring, this does not mean that employers are free to monitor or not monitor at will. It is not clear, for example, whether employers who fail to notify their employees that they monitor their mouseclicks will avoid liability for invasion of privacy. Moreover, even if employers issue a general notice to employees that they may be monitored, an employee might argue that more specific notice is required.

Even if your workplace has very well defined rules regarding employees’ use of electronic equipment, be aware that you can face all kinds of serious repercussions for mishandling data.  If Frank’s mother recently emailed Frank that she had Hepatitis C, and that information winds up getting out, you can be in for a world of painola.

Handling data discovery, even internal to the company, is not something that a systems administrator should *ever* regard as his (or her) blanket responsibility.  They shouldn’t be monitoring general computer usage either, but that’s a subject for another post.  Oh, wait, I already wrote that one.

In the United States, there are three classical professions according to Wikipedia: doctor, lawyer, and priest.  It’s generally accepted now that this list is somewhat longer; the Department of Labor has a whole list of “professional and related occupations”.

Generally, to be officially regarded as a “profession” (as opposed to an “occupation”) you need to meet a few requirements, one of which is some sort of professional body.  Borrowing straight from the Wikipedia page:

Professions are typically regulated by statute, with the responsibilities of enforcement delegated to respective professional bodies, whose function is to define, promote, oversee, support and regulate the affairs of its members. These bodies are responsible for the licensure of professionals, and may additionally set examinations of competence and enforce adherence to an ethical code of practice. However, they all require that the individual hold at least a first professional degree before licensure. There may be several such bodies for one profession in a single country, an example being the ten accountancy bodies (ACCA, ICAEW, ICAI, ICAS, CIMA, CIPFA, AAPA, CIMA, IFA, CPA) of the United Kingdom, all of which have been given a Royal Charter although not necessarily considered to hold equivalent-level qualifications.

Typically, individuals are required by law to be qualified by a local professional body before they are permitted to practice in that profession. However, in some countries, individuals may not be required by law to be qualified by such a professional body in order to practice, as is the case for accountancy in the United Kingdom (except for auditing and insolvency work which legally require qualification by a professional body). In such cases, qualification by the professional bodies is effectively still considered a prerequisite to practice as most employers and clients stipulate that the individual hold such qualifications before hiring their services.

There is no nationally-recognized “systems administrator” professional body (there ought to be), and yet we have administrative or root access to the file servers upon which sit not just your corporate data, financials, customer lists, etc., but also *your* email and *your* documents (that’s one major reason why there ought to be).  If you are a doctor, a lawyer, an engineer, or an accountant you may actually *lose* the ability to do your job in perpetuity if you fail to exercise your professional responsibilities in the eyes of your professional body; you can lose your license to practice law, or medicine, or submit plans to City Hall, etc.  This is a double-edged sword: if you mess up, you can literally be ejected from the profession.  However, with it comes a protection that currently systems administrators don’t have: if The Boss tells you to do something that violates your professional ethics, you can tell The Boss, “I’m sorry Dave, I can’t do that, because I could lose my license.”  Right now, sysadmins have the responsibility to protect the data, but we don’t have much in the way of business clout.  You want to hire someone who will actively refuse to let you shoot yourself in the foot, even without that backing.

Trust me.

Categories: management, tech

How To Hire A Sysadmin, Part I

October 19, 2009 padraic2112 5 comments

There are lots of lists out there of “interview questions” to ask IT people when you are interviewing them for a new position.  Many of those lists are pretty worthless in practice: they ask the sorts of questions to which you can find the answers with 60 seconds and a web browser, but they don’t ask the sort of questions that actually tell you anything about the candidate’s capability to understand complex system design.

I really don’t need to know if you’ve memorized the IPv4 header (this is the networking equivalent of memorizing Pi to 40 digits).  I don’t really need to know if you know the difference between the HKEY_CURRENT_CONFIG and HKEY_LOCAL_MACHINE registry hives on a Windows machine, or what the difference is between GRUB and LILO, or what your opinion is of the advantage of the FreeBSD ports collection vs. Linux’s RPMs.  I *really* don’t need to see your Perl coding skills, because if you’re a really good Perl coder you should be writing code, not administering systems.  Not to mention the fact that if you’re administering a lot of systems with home-grown Perl code I probably don’t want to hire you because after 6 months the only person who will have a freaking clue about how the cluster works is the guy who wrote all the tools from scratch in Perl.

What I need to know is if you understand, at a meta-level, what a sysadmin is supposed to do.  You can learn syntax over time (or ask the magic Internet machine).  Learning how to juggle interdependencies is something else.  In fact, quite often those people who are really skilled at syntax (read: recent certification acquisitions) can sound like they really know what they’re doing, without knowing anything about what they *ought* to be doing.

So, I have only two questions for a sysadmin candidate.  Here’s the first one:

“You have a cluster of 300 machines, running 40 different services, on three discrete networks, with two OS-level dependencies. Assuming you’ve built this cluster yourself from scratch with no legacy dependencies, describe this cluster. Feel free to ask as many questions as you like for clarification. Go.”

This is meta-level information mining.  A good sysadmin will spend more time asking questions about what the cluster is supposed to be doing, what sort of services are running, what the uptime requirements are, who the users are, and what the business continuity requirements look like than they will talking about their design ideas.  A good sysadmin will have a thousand questions.  Note, you have to be able to provide at least theoretical answers to these questions in order to interview a candidate this way.  Second note, if you can’t interview someone this way, you probably should not be involved in the decision making process for new IT hires.

A *really* good sysadmin will ask questions about the physical facility, budgeting, and office politics, not just technology.  They’re going to want to know if they’re going to be able to fix things based upon technological merit, or if there’s a labyrinthine approval process that goes through someone who has no technical expertise but absolute veto power over technology decisions… but if you get someone like this in an interview, be forewarned that you’re either hiring someone who will replace your IT manager within 6 months, or someone who will need some other sort of upward mobility within 18 months or they’re going to get bored and go elsewhere.  The price of hiring really great people is that you need to give them really high level work.

We’ll talk about question #2 in the next post.

Categories: management, tech

Niel Nickolaisen is My Hero

February 4, 2009 padraic2112 Leave a comment

I have a pile of “CIO Decisions” on my desk.  They’re good bathroom reading for IT people.  The content is not what I would call deep and rigorous, but even when it is just complete fluff it is interesting to see what sort of complete fluff other IT people are currently thinking about.

The September 2007 issue had a final article entitled “How I Fixed My Telco Billing Problems”.  Full article is available online here (you need to register to read it).  From the article:

What am I mad about? Invoices from telecommunications providers (voice, data, cellular). It seems to be standard industry practice for their invoices to be wrong. Not marginally wrong, but very wrong.

No kidding!  When I ran the phone switch and took care of the telco billing at Idealab not a month went by that I didn’t find something wrong, somewhere.  For example, a vendor who shall remain nameless simply forgot that they had provided us with a DS3.  Of course, it was several orders of magnitude more likely that they’d charge us for something (or charge us at some rate) that was incorrect.  Slogging through those bills was a huge time sink.  Niel’s solution?

I am changing the relationship (and my contracts) with my telco providers. A few weeks ago I started the process with one of my major providers. Instead of having someone on my staff scrutinize their bills to ensure they are accurate, I told this provider that it is their responsibility to send me an accurate invoice. If this requires them to hire additional staff to replicate what my staff is doing (or even to hire one of the cottage industry vendors), so be it. Rather than scrutinize their bills for accuracy, I will do some sampling. The first time we find an error, I require the provider to give me a 10% credit, recalculate the invoice, and try again. If the rebill still contains errors, the provider gives me another 10% credit and tries again. Perhaps this will give the providers a financial incentive to get things right the first time.

Brilliant.  Niel, you deserve a bonus.  If I was stuck running telco again, renegotiating those contracts would be job #1.

Categories: management, networking, tech

Downtime: Amazon S3

ReadWriteWeb reports an outage over at Amazon S3:

Today’s big news is that Amazon’s S3 online storage service has experienced significant downtime. Allen Stern, who hosts his blog’s images on S3, reported that the downtime lasted over 6 hours. Startups that use S3 for their storage, such as SmugMug, have also reported problems. Back in February this same thing happened. At the time RWW feature writer Alex Iskold defended Amazon, in a must-read analysis entitled Reaching for the Sky Through The Compute Clouds. But it does make us ask questions such as: why can’t we get 99% uptime? Or: isn’t this what an SLA is for?

A six hour outage does not represent a violation of 99% uptime.  If you’re looking for 99% uptime, you’re looking at 87 hours 36 minutes of downtime every year.  Six hours of downtime in a year is between three and four nines, folks.  If this is the second 6 hour outage of S3, get ready.  You’re 12 hours down, 75 1/2 hours to go in 2008.  Heck, you should be happy, you’re way ahead of the game!
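
For anyone who wants to check the downtime arithmetic, here is a minimal Python sketch.  It assumes a 365-day year (8,760 hours) and measures uptime over the whole year; the function names are illustrative only and have nothing to do with how Amazon’s SLA actually defines or measures availability.

```python
# Assumption: a non-leap year of 365 days, i.e. 8760 hours.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(uptime_percent, hours=HOURS_PER_YEAR):
    """Hours of downtime per year permitted at a given uptime percentage."""
    return hours * (100.0 - uptime_percent) / 100.0


def achieved_uptime_percent(downtime_hours, hours=HOURS_PER_YEAR):
    """Uptime percentage over a year, given total hours of downtime."""
    return 100.0 * (hours - downtime_hours) / hours


if __name__ == "__main__":
    # Downtime budget for two, three, four, and five nines.
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = allowed_downtime_hours(target)
        print(f"{target}% uptime allows {budget:.2f} hours down per year")

    # What a single six-hour outage works out to over the year.
    print(f"6 hours down = {achieved_uptime_percent(6):.3f}% uptime")

    # What is left of a 99% budget after two six-hour outages.
    remaining = allowed_downtime_hours(99.0) - 12
    print(f"{remaining:.1f} hours of the 99% budget remain")
```

Running it prints the yearly downtime allowance for each level of nines, the uptime a single six-hour outage works out to, and what remains of a 99% budget after two such outages.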

And, as I’ve pointed out before, you’re not getting enterprise service because you’re not paying for it.

Categories: management, news, tech, web sites

Off the Nightstand: Managing Humans

July 9, 2008 padraic2112 2 comments

Full title, “Managing Humans: Biting and Humorous Tales of a Software Engineering Manager” by Michael Lopp, writer of Rands in Repose.

I appear to have been bitten by a bug (a viral bug, not an insect) and have spent the last four hours in bed on vacation, during which I pounded through this quite handily.  Lopp is hilarious and engaging and spins some interesting yarns that are applicable to anyone who manages people, or anyone who has a manager, regardless of industry.

The book is mostly (or perhaps all, I didn’t check rigorously) comprised of existing blog posts, so if you’ve been a follower of Rands in Repose for a while, you’ll only be interested if you’re like me and appreciate books as works in and of themselves as entire entities… something David Weinberger, author of one of my other nightstand occupants “Everything is Miscellaneous: The Power of the New Digital Disorder” would undoubtedly find quaint.

If you are an IT worker or a manager of any stripe, Managing Humans is required reading.  I choose a quote from page 111 as my favorite section of the book:

Fact is, your world is changing faster than you’ll ever be able to keep up with, and you can view that fact from two different perspectives:

  • I believe I can control my world, and through an aggressive campaign of task management, personal goals, and a can do attitude, I will succeed in doing the impossible.  Go me!

Or…

  • I know there is no controlling the world, but I will fluidly surf the entropy by constantly changing myself.

Surfing entropy takes confidence.  This isn’t Tony Robbins confidence; this is a personal confidence you earn by constantly adapting yourself to the impossible.

Good stuff, and interesting insight from someone who has written interesting and involved dissertations on pens, notebooks, and coffee mugs.  Hang ten, everybody.

Categories: books, management, software, tech

Sounds like Bob Might Have Called This One

Hewlett Packard is buying EDS.  From NPR’s News In Brief:

Hewlett-Packard, EDS Agree To Merge

The nation’s largest personal computer-maker, Hewlett-Packard Co., has agreed to purchase Electronic Data Systems Corp for $12.6 billion, the empire founded by H. Ross Perot, as part of an effort to position the company to take on rival IBM.

The companies said the deal values EDS at $25.00 per share, a 33 percent premium to its closing price on Friday, before reports of merger talks sent the shares soaring on Monday.

EDS would bring its expertise in running computer systems and providing other high-tech help to Palo Alto-based HP. That field is currently dominated by IBM Corp., which generated $54 billion in revenue from technology services last year.

HP indicated it will make significant layoffs as it eliminates overlapping jobs and other expenses.

Now, HP hasn’t really been involved in high-tech outsourcing management.  Why would they pick up EDS, at a 33% premium?  The only thing I can think of is that the folks over at HP on the board think that IBM’s weak enough in this arena (see many of Cringely’s recent articles about IBM) that merging up with EDS gives them a chance to jump into that market with a big gun… and that the near-sourcing market is going to pick up.  That and they realize that the hardware market is high cost low margin, so their core business model needs a little tweaking.

Categories: management, news, tech

There is no such thing as a technological solution

I’ve been working in the IT industry in one way or another since I graduated college in 1993. That’s 15 years now… wow, seems like it hasn’t been that long.

I’ve been involved with many different IT projects in many different organizations, and I’ve seen or heard or been exposed to a thousand more. I’ve seen successes and I’ve seen failures. Overall, more failures than successes. This shouldn’t be a surprise to anybody, the industry storybook is rife with tales of colossal failures… maybe 5 failures for every success.

Here’s why IT projects fail. I’m going to tell you all, so that you’ll know (if you’re a sysadmin or a programmer or whatever) how to avoid them, or you’ll know (as a non-IT person), how to recognize when your IT department is starting something that is very very likely to cost a bucket of money and return very little, except to give you fodder to rake them over the coals when you’re at the water cooler with someone else from Accounting.

There is no such thing as a technological solution. There is no problem that you can solve with technology. Stop thinking that you can, because when your thinking starts at that point, you’ve already started building a foundation without checking to see whether or not the ground can support any weight.

When you’re an IT worker, people bring you problems all the time. Sometimes, they’re not really “problems” at all -> there’s a bug in some software, or something is mis-configured, or some other thing that may take you minutes or hours to fix. This is really the equivalent of putting a band-aid on a wound. The real goal is to prevent infection until the wound heals. Eventually, the software will be replaced with a new version, or the main router will come back online, or whatever… and the work that you’re doing now will be essentially wasted time. Important time, granted… customer-service enabling time -> you’re saving them time at the expense of your own.

With these sorts of problems, you’re a mechanic. You’re a plumber. You’re finding out what doesn’t work in a technological system and patching it or working around it. This is the grunt work, the scut work, the stuff that keeps us employed on a daily basis. You’re not providing a solution to a problem. You’re hacking. This isn’t a bad thing, it needs to be done. But this is firefighting. Optimally, you want to do as little of this as possible, because you’re at heart very lazy, and you know your customers want everything to “just work”.

Real problems start deeper. “I need a way to let people see my time schedule” is a problem which requires a solution. “My administrative assistant can’t sync my Treo to the corporate Exchange server” isn’t a problem that requires a solution -> it’s a bug that needs a hack. When people bring you bugs, hack. When people bring you problems, you need to build a solution.

This always, always, always needs to start with information gathering. Period. Always. If you’ve worked in four organizations before, and you’ve run Exchange, and someone comes to you with “I need a way to let people see my time schedule”, odds are very very good you’re going to blurt out, “Well, I could set up an Exchange server…”

Don’t. Cease. Back up. You’re doing it wrong. Period.

You’ve made the first mistake, you started building a house… and you don’t even know that what the customer wants is a house.

Sometimes, someone comes to you with, “I want you to set up an Exchange server…” and you’re going to blurt out, “Okay, I’ve done that before, it’s pretty easy…”

Don’t. Cease. Back up. You’re doing it wrong. Period.

You’ve made the second mistake, you started building a Victorian because someone told you they think they need a house. The customer doesn’t know what they need. They know what they *want*. It’s your job to figure out if what they *want* is actually what they *need*. Moreover, it’s your job to know if what they need is possible. Sometimes, it’s not.

If you tell them that it *is* possible because your boss is scary and shouts and says, “Don’t tell me what’s impossible,” when you argue with him, I’m begging you… get into another line of work. Eventually you’re going to get fired, or you’re going to get fed up and quit, and the next poor bastard who comes in is going to spend months of aggravation trying to fix the piece of junk you built because you didn’t have the gumption to tell someone that they ought not to build a skyscraper on top of a bog.

The only thing you can do with technology is operationalize a solution. Information Technology work is *enabling* work. We take solutions and we build stuff to make them happen… but the solution has to already be known to some degree. You have to design a process before you start building an object. If you don’t, you’re going to build a really pretty object that nobody uses. You need to know what it is, not necessarily in minute detail… but you’d damn well better have a good idea that it’s supposed to be a house, if it’s supposed to be a house. Whether or not it’s a Victorian or a ranch or a McMansion is important, but it’s not as important as starting off in a residential zoning area.

You need to keep your eye, always, on the solution… and NOT on the technology. If the technology doesn’t fit the solution perfectly… well, that’s not always bad, and that’s not entirely unexpected. You can’t redefine success by changing the game to “I successfully deployed this technology” because deploying the technology isn’t what the customer wants, they want the problem solved. Define what subset of the problem the technology is fixing, and make sure your customer is satisfied with that subset before you build the thing.

And if they want you to build a Victorian and you’re in a commercial zone, suck it up and tell them “No.”

20% Time: The Blog

March 13, 2008 padraic2112 Leave a comment

This week on Scott Berkun’s pmclinic mailing list we’ve been discussing pet projects, developers, management, and Google’s well-publicized and usually misunderstood “20% time” theory: people spend 80% of their time doing “real work”, and 20% of their time working on “pet projects”.  In practice, there’s a lot more to it than that one sentence (and it’s not new, 3M did this before Google).

Here’s a company that’s actually not only adopting the idea, they’re blogging about it.

I think there are lots of ways to motivate people, and this is an interesting but not universally applicable idea… but I have to admit, Atlassian deserves some major credit for not only engaging in a thorough look at the idea (at a pretty large cost to themselves, they estimate $1,000,000 is the price tag), but putting the consequences, results, and trials and tribulations out there for anybody to read.  Open Source Management, how cool is that?

Categories: management, tech, work

Hanlon’s Razor

“Never attribute to malice that which can be adequately explained by stupidity.” (Wikipedia article)

Pete Yost of the Associated Press reports (via law.com) on the now-infamous loss of a large volume of email from the White House email archive. Although many believe the loss of this data to be suspiciously convenient for the Bush Administration, the description provided in this article leads me to believe that this is all too likely explained by simple idiocy. Some money quotes from the article:

“I would call this negligence,” said Mark Epstein, director of technical services for Cataphora Inc., a California company that specializes in retrieval and analysis of electronic information.

The White House e-mail troubles began in 2002 with a decision to upgrade electronic message capabilities and move from Lotus Notes to Microsoft Exchange.

“This is the first time I’ve personally run across this kind of process for archiving; the White House relied on human beings to do specific manual processes on a regular basis and I would not recommend it,” said William Tolson, who has consulted on e-mail problems for hundreds of companies and state and local governments.

Computer experts point out that the switch to a new system that the White House botched is successfully accomplished every year for e-mail systems that serve much larger numbers of users than the 2,000 at the White House.

Having done a few transitions from system A to system B, and seen several others in practice done by various people in various sets of circumstances, I can all too well imagine how this project managed to befoul itself. Moving from one monolithic system (Lotus Notes) to another (Exchange) requires a well thought out transition plan, which requires either (a) someone well versed in both monolithic systems, or (b) someone who will manage two people who are well versed in each monolithic system and keep outside agendas from placing pressure on either of the two experts. Finding someone well versed in both Notes and Exchange is pretty hard; finding someone who can keep two experts focused on the transition while dealing with outside pressures is just as difficult.

Monolithic systems transitions all too often are less concerned with preserving data than they are with the functionality desired “post-transition”. This is understandable: the whole point of moving from one system to another is to get where you’re going, and get away from where you’ve been. Managing this process intelligently requires you to step back from this pressure and figure out not only where you want to be, but how you don’t lose the good parts of whatever you’re discarding (e.g., the old data).

It’s a pretty rare transition where you can just discard the whole kit ’n’ caboodle of the old system.

Categories: management, tech

New Labor-Saving Devices, the Uber-Nerd Edition

January 23, 2008 padraic2112 1 comment

I have no idea how much this costs. I have no idea how robust the back end is. I have a million questions about the product already. But, 3tera, you have my attention.

This interface is TOTALLY COOL.

From Larry’s blog:

These services allow one to develop applications without hardware or datacenter cost. They bill for resources used — CPU time, storage, bandwidth. That means there is essentially no cost while an application is being developed and debugged since there is no traffic. When the application goes live, the capacity, and hence cost, grow and shrink dynamically depending upon utilization.

It’s a good time to be a startup, that’s for certain. The barrier to creating a proof of concept application is plunging downwards at an amazing rate.

Categories: management, tech, web sites