Friday, December 26, 2008

Priorities

A few years ago, when I stayed in hotels, in-room Internet access was just becoming standard, and there was a per-day fee to use it. On the other hand, the hotels also served a decent complimentary continental breakfast to all guests.

When I recently stayed at a few hotels, the Internet access was completely free, but the free breakfasts were gone.

My, how priorities have shifted.

Saturday, November 22, 2008

DoD Computer Security Decides to Pull Pants Up

As I'm sure everyone has heard, the Department of Defense has decided to ban all removable media from their computer systems, mostly due to viruses running rampant throughout the (presumably) otherwise secure networks.

Now, people have pointed out that this is a pretty drastic step. After all, there are other ways of handling things that could have theoretically prevented this particular problem without inconveniencing users quite so much. Up to date virus scanners, security policies disabling autorun, restricted privileges on user accounts - all of these things would have helped reduce the ability of such a virus to spread. They should all be considered pretty basic measures in any reasonably high security environment, and it's quite possible that they were at least partially in place.

But there's an elephant in the room that I haven't seen anyone else mention, and that I would like to point out. Microsoft declared its Trustworthy Computing security initiative in 2002. In the six years since, we've had two major service packs and a whole new OS.

So will someone please, please, please tell me why, in this day and age where security breaches make the news weekly, the default behavior for Windows is still to take any newly inserted media and automatically try as hard as possible to run whatever it happens to find on it? It was simply annoying on Windows 95, but it's downright dangerous now.

Come on, Microsoft. I would expect that any operating system that calls itself "Professional" would show a little more restraint than a two year old trying to eat a piece of gum it just peeled off a New York sidewalk. Time for Windows to grow up a little and break this dirty habit.
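
For what it's worth, admins who don't want to wait for Redmond to see the light can shut autorun off themselves today. Here's a rough sketch in Python (using the standard winreg module, run as an administrator; Group Policy is the saner way to push this to a whole fleet) of flipping the well-known NoDriveTypeAutoRun policy value:

    import winreg

    # 0xFF disables autorun for all drive types; the key path below is the
    # standard machine-wide Explorer policy location.
    key = winreg.CreateKey(
        winreg.HKEY_LOCAL_MACHINE,
        r"SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\Explorer",
    )
    winreg.SetValueEx(key, "NoDriveTypeAutoRun", 0, winreg.REG_DWORD, 0xFF)
    winreg.CloseKey(key)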

Saturday, November 15, 2008

Money Makes the World Go Round

It's no big secret that the financial world is going through what can be kindly described as a catastrophic disaster. Stock markets, profit margins, layoffs - all of the meters are currently pointing somewhere between bad and worse.

Likewise, there are plenty of people out there expounding on how we got into this situation, mostly pointing at the various shell games that Wall Street has been playing with mortgages. I don't really have anything to add on the twenty plus year saga of how we've made a bubble big enough to take out neighboring markets when it popped.

Instead, I just have a very simple observation to make. One that the entire financial industry has not simply forgotten, but must continually and actively ignore in order to continue to exist.

To put it bluntly: money has no intrinsic value.

Now, before you just laugh at me, think for a moment about this idea of value, or utility. While the utility of something can vary widely from person to person, and place to place, some things are more universal. For example, no matter who you are, food has some value. Everybody eats. The value of food can be influenced by the skill with which it is prepared, or the ratio between its supply and demand, but its base value is directly created by its intrinsic properties. A pound of rice is always a pound of rice, and can always be made into a meal.

So the question, then, is where does the value from money come from? Or to put it a little more viscerally, why is having a pocket full of cash better than nothing but an empty wallet?

The answer, obviously, is because you can buy stuff with it. But what if you took that option away? What good would that money do you in everybody's famous hypothetical scenario, stranded on a desert island? Quite simply, none! A hundred bucks worth of military rations would be a thousand times more valuable than a hundred dollar bill. Money's value springs purely from our collective agreement to pretend it has value. When you take away the ability to convert money into something else with immediate value, you remove the indirect value of money, revealing its utter lack of intrinsic value.

If you're still not convinced, then ponder this riddle. If the carefully crafted metallic sculpture that we call a "coin" and mass produce at US mints has value, then why doesn't an exact replica that came from someone's basement also have the same value?

What we call the financial trading world, though, is built upon a willful ignorance of this fact. The industry is built upon layer after layer of abstraction, and at each one, the intrinsic value that is abstracted into money is further diluted.

Consider day trading. Throughout the day, any given stock will have some degree of fluctuation. Even if it ends the day at exactly the same price it started at, there will be points where the price is up, even if only a few cents, and other points where it is down. With modern computers, it is possible for even a casual investor at home to have automatic orders rapidly buy and sell the same stock over and over again. Buy the stock at $1.00, sell it at $1.05. Wait for the stock to fall back to $1.00, and do it again. In the days of conducting business over the phone, the cost of the phone calls alone could easily have swamped any profits made. In the days of computers, though, anyone can cheaply run that cycle a thousand times a day, with tremendous cumulative effects.
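
To make the arithmetic concrete, here's a toy sketch (the numbers are completely made up) of what that churn adds up to over a single day:

    # Toy numbers, purely for illustration - no real market data here.
    buy_price = 1.00      # dollars per share
    sell_price = 1.05
    shares = 1000
    round_trips = 50      # times the same stock gets flipped in one day

    gross = (sell_price - buy_price) * shares * round_trips
    print("Gross 'gain' from trading on noise: $%.2f" % gross)  # $2500.00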

But where did this value behind the money come from? No work was done. No commodity was created. No service was performed, or even promised. Nothing was proffered for this creation or transfer of wealth, not even a kind word. The whole stock market system was intended to be, like currency, an abstract representation of underlying value. A share of stock in a company is a voucher for a fraction of the total intrinsic value of that company.

With the introduction of computers and near instantaneous trading, though, the rules changed. The speed upped the pressure behind this loophole, and money suddenly started gushing through with disregard for the rules. Why bother with all of the tedious research, hoping that the stock will go up a substantial amount, when you can make money off of random noise? As long as the stock doesn't completely tank, you're fine!

Day trading was by no means the first means of exploiting a loophole. But in the last few decades, as regulations have simultaneously become more byzantine and less restrictive, the opportunities for making money by creatively shuffling money around have become more potentially lucrative and tempting. Why go through all the effort of actually creating intrinsic value, when you can not only carefully stack up your bills to make one plus one equal three, but do it a thousand times over?

Friday, July 25, 2008

Yahoo! Music Store

Okay, so the Yahoo! music store is the latest one to shut down, taking with it the DRM authorization servers required to use the "purchased" music. (Since I've never touched Yahoo! music, I have no idea if the DRM servers are required to play music, every 90 days, when you want to move computers, or what. The relevant bit is that you'll run into a problem sooner or later with the servers gone.)

This has been covered before, so let's just quickly recap:

  1. You can't buy DRM encumbered media, only lease with an option to get screwed.
  2. The option to get screwed is exercised at the discretion of whoever owns the DRM infrastructure.
  3. Do you think that the company actually wants to keep a whole collection of servers up and running for the last three people using purchases from a music store that was discontinued 4 years ago in favor of a new, more profitable one?
  4. Revoking DRM is a brutally effective method of forcing consumers to leave an old platform, in hopes they'll all sign up for its successor. The fact that customers were happy with the old platform isn't perceived as a downside; it's the reason why the company is doing it in the first place.
  5. Strong DRM means that companies can use technical means to enforce policies, regardless of their legality. Existing code doesn't automatically update to reflect new court rulings, and your only appeal is with the company's helpdesk.

All those of you who have been writing about the dangers of DRM may now proceed to jump up and down while shouting "I told you so!"

Saturday, June 21, 2008

Lies, Damned Lies, and Marketing: A Plea to Netflix

I work in IT. Not surprisingly, this means I get to do a fair amount of support for broken computers. In my case, it's mostly for a handful of servers from a particular well-known vendor. Since we pay a premium for the top-level support, I tend to be pretty satisfied when calling in for failed components. The calls basically tend to consist of "What's broken?", "Let's run a quick diagnostic to make sure", "Do you want a technician or just parts?", and "Do you want it there tomorrow, or this afternoon?".

Then one day, I had to make a call in for a desktop. Same vendor, still had one of the higher level support contracts, and still quite obviously a hardware failure.

Unfortunately, this meant that instead of getting routed to a bunch of IT-savvy techs determined to keep my downtime to a minimum, I got to deal with the general home user support group.

Now, I do enough end user support to be able to sympathize with quite a bit of what these guys go through. I really don't mind them asking me really basic questions like "Is the computer on fire?"; I've had users who would neglect to mention this when asking why we turned off their Internet. I completely understand them strongly wanting to get an error code back before they'd start shipping replacement parts; I wouldn't be surprised if they've had users who didn't understand that you need to put a blank CD in before they can make a mix CD of their pirated MP3s. I won't pretend to like these things, but I understand they're necessary and don't hold it against the poor people at the other end of the line.

No, what bugs the hell out of me is when they keep claiming they're "sorry". Yes, that's right, every time I talk to a new person, and every time I mention something that's a problem, they rattle off, all in one quick, unconvincing, insincere, scripted breath, "Oh-I'm-terribly-sorry-sir-I-feel-really-bad-about-that-I-hope-that-we-can-fix-the-problem-and-I'm-sorry-for-the-inconvenience".

Oh, really? You feel personally bad about every annoying user with a broken coffee cup holder who can't tell you if it's plugged in because the power's out? Bull. After the fourth or fifth time, I'm actually far more annoyed than if you just said "Okay" and punted me off to the next tech in line, because it's quite obvious that you're lying to me. I'm paying the extra support money for tech support on the product. If I wanted someone to talk to and empathize with me, I'd go find a qualified therapist and talk about my childhood, thank you very much.

Where was I going with this? Oh yes, Netflix.

As I'm sure that anyone who has a Netflix account, reads techie news sites, has an Internet connection, or uses electricity has heard by now, Netflix is removing the profiles feature, which lets you split up a single account into separate queues and preferences. This lets multiple people share a single account, rather than each buying their own - perfect for households with more than one person.

Now, the canceling of this feature is bad enough. My wife and I use this, and let me tell you, it's a lot easier than trying to come up with ratings that accommodate chick flicks, romantic comedies, sci-fi, and anime. I mean, seriously, how many people really like all of those categories?

As if that weren't bad enough, though, Netflix had to take it one more step. They decided to just give all their profile users a father-knows-best pat on the head, and tell 'em "It's for your own good." Like the tech who personally feels the pain of each and every one of the thousand customers per day, Netflix has spun a falsehood that is insultingly transparent:

As a Netflix product manager I'm tasked with the wonderful job of helping members find movies they'll love. But today my job is more challenging as we've decided to terminate the profiles feature on September 1. Please know that the motivation is solely driven by keeping our service as simple and as easy to use as possible. Too many members found the feature difficult to understand and cumbersome, having to consistently log in and out of the website.

Let me get this straight. You have a feature that, while perhaps not wildly popular, is strongly loved by those who do use it. "Some" people allegedly find it "confusing" (we'll assume for the moment that Netflix has legitimate data to back this claim up), so rather than, oh, I don't know, fixing the problem, you just decide to nuke it completely. How does that "help" users?

Now, where Dad could give 5 stars to Goldfinger, Mom could give 5 stars to Pretty Woman, and Junior could give 5 stars to Shrek, Netflix will be trying to analyze a single person that would give 5 stars to all three movies. I can only imagine the bizarre recommendations for such a split-personality victim! How does that "help" users?

Before, each member of the household would have their own queue, and would get their own next movie for each one sent back. Now they'll have to carefully shuffle the queue each time one goes back to make sure that the right next movie goes out, or else Junior sending back the cartoon he just watched will land Julia Roberts's latest movie in the mailbox. How does that "help" users?

If you're going to pull out some backend code that implements this feature, fine - but I doubt there's a software engineer on the planet who thinks it's a good idea to pull a feature away before you have something more compelling to convince your customers to give you money.

If maintaining the feature is taking up too much time, or is getting you stuck in some expensive patent war, then tell us you can't afford the feature and we'll probably understand and get over it.

But please, please, please - don't just rip the feature out of our hands and tell us it's for our own good. It's a blatant lie of the worst kind - a marketing lie - and once your customers think that you're lying to them, they're quite liable to take their money off to one of your competitors in a hurry.

You can trust me on that.

Wednesday, June 4, 2008

What Time Is It Anyway?

I can't believe that I feel I need to write this post. Really. But I do, so here goes.

Start by asking yourself a question: what time is it? A simple enough question, with a simple enough answer.

Now pretend for a moment that you had 100 people scattered around the globe on a conference call. Now ask them all, at precisely the same moment, what time it is.

(Hands down, all you physics majors out there. We're ignoring relativity, since this is all make believe anyway, so everyone agrees it happens at the same instant in time.)

Now all of a sudden the answer to your question isn't quite so straightforward anymore, is it? You have to worry about dealing with multiple timezones, the international date line, and daylight savings time. The only way to deal with this is to use dates that explicitly include the timezone. Trying to deal with dates and times missing timezones is like trying to use latitudes without longitudes, or an email address without a domain. As soon as the scope expands beyond a very tiny size, it breaks down quickly.
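
To make the ambiguity concrete, here's a quick sketch in Python of the difference between a timestamp that carries its timezone and one that doesn't:

    from datetime import datetime, timezone

    naive = datetime(2008, 6, 4, 9, 0)                        # 9:00... somewhere
    aware = datetime(2008, 6, 4, 9, 0, tzinfo=timezone.utc)   # 9:00 UTC, unambiguous

    print(naive.isoformat())   # 2008-06-04T09:00:00          (which 9:00?)
    print(aware.isoformat())   # 2008-06-04T09:00:00+00:00    (everyone agrees)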

Now, the fact that just about any standard formatted timestamp includes this information seems like it would make this pretty obvious. Email, HTTP, filesystems - they all either include the timezone in their timestamps, or are universally defined as relative to a fixed timezone that you can easily base off of.

So will someone tell me why, in this day and age, the Derby database chose to define timestamp columns that are missing the timezone? You wouldn't forget to make numerical types with floating point support, would you? Or strings that didn't support storing lower case? Or... well, you get the general idea.

So come on, guys. It's a big world. Databases are all about sharing information, and these days even a modest open source project can easily be sharing between half a dozen timezones across three continents. At this point, you're just making yourselves look silly.

Friday, May 16, 2008

Successful Failures and Faulty Successes

Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin. — John von Neumann

So unless your job has nothing to do with IT, or you've been living under a rock somewhere out of Blackberry range, you've no doubt heard about the utterly terrifying Debian OpenSSL vulnerability, which turned all those vast 4096-bit private keys into effectively 15-bit keys. (For those of you who don't speak crypto, this means that the bad guys can guess your private key in about 32 thousand guesses - pretty trivial for any modern computer.)

The problem, ironically enough, came about when Debian developers attempted to fix some compiler warnings about a function in OpenSSL that was using uninitialized memory. While reading uninitialized memory is normally a horrible idea, OpenSSL was using it as an additional source of entropy, to make the private keys it generated more random. Unfortunately, the actual result of the fix was to remove nearly all entropy, leaving only the 15 bits that come from the PID.
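
To put "15 bits" in perspective, here's a rough sketch of the attack (the key regeneration is pseudocode in the comments, since the real thing just means re-running the broken generator with each possible PID):

    # Linux PIDs topped out at 32768, so the "random" seed had at most 2**15
    # possible values.
    POSSIBLE_SEEDS = 2 ** 15
    print("Keys an attacker has to try:", POSSIBLE_SEEDS)   # 32768

    # for pid in range(1, POSSIBLE_SEEDS + 1):
    #     candidate = regenerate_key(pid)          # hypothetical helper
    #     if candidate.public_key() == victim_public_key:
    #         print("Private key recovered; it was seeded by PID", pid)
    #         break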

I think it's pretty safe to say that this was a catastrophic failure. While Debian has done an admirable job of releasing a fix to the tool that generates the weak certificates (though you still have to go back and replace already existing ones), the question remains - how in the world did OpenSSL exist in this blatant failure mode, completely undetected, for two years?

This problem is a beautiful illustration of part of why cryptography, and security in general, is so devilishly difficult to get right. In normal software testing, successes are successes, and they're good, and failures are failures, and they're bad. Simple enough, right?

When you're testing security, though, things are different. In security, you also have to make sure to test for what I like to call successful failures and faulty successes.

Take a firewall system, such as iptables or pfw, for example. Without one, if a client attempts to make a TCP connection, it expects to succeed. If the connection succeeds, the test succeeds; if it fails, the test fails. Once the firewall is in place and configured to block that connection, though, that connection damn well better fail! That counts as a successful failure - a case where you succeeded in selectively making something like a TCP connection fail where it would be undesirable. Likewise, when a file is encrypted, unauthorized attempts to read it (or at least, to extract meaningful data from it) by anyone without the appropriate key are expected to fail.

Likewise, you also have to test for the inverse case. Back to our firewall example: let's say that the admin carelessly mistyped the netmask, leaving our service unprotected from ranges that we don't want to have access. Connection attempts will all of a sudden start succeeding where we don't want them to. We now have a faulty success. Even worse, we won't notice unless we happen to test from the tiny sliver of IP addresses that were erroneously granted access. Back in the crypto realm, this is what happened to OpenSSL: it failed to prevent a success, where the success was an attacker recovering a private key that should have been impossible to guess.
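
In testing terms, that means a decent security test suite has to assert failures as well as successes. A minimal sketch (in Python, with made-up documentation-range addresses) of what that looks like for the firewall case:

    import socket

    def can_connect(host, port, timeout=3):
        """Return True if a TCP connection to host:port succeeds."""
        try:
            socket.create_connection((host, port), timeout=timeout).close()
            return True
        except OSError:
            return False

    # Run from a host INSIDE the allowed range: access should still work.
    assert can_connect("192.0.2.10", 22), "legitimate access is broken"

    # Run from a host OUTSIDE the allowed range: the very same call had
    # better fail - that's the successful failure we need to check for.
    assert not can_connect("192.0.2.10", 22), "firewall is letting outsiders in"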

These additional twists on defining success and failures help to make testing security software and configurations devilishly difficult. The OpenSSL bug didn't cause any visible changes in the test results. Everything still compiled; the output was still valid; data was encrypted and decrypted properly; no regression tests failed.

Hopefully someone someday will figure out a more reliable way to test this kind of code than the current method of having people who've forgotten more about math than most of us ever even heard of stare at it until drops of blood appear on their foreheads. Until then, we'll just have to be ready to roll out patches, scramble passwords, and revoke certificates when the next inevitable vulnerability or system compromise happens.

Saturday, April 26, 2008

If It Weren't For My Horse...

Lewis Black has this great bit he does, where he describes being out and about one day, and hearing a stray phrase from a nearby woman grab his attention like flypaper grabs a mosquito:

If it weren’t for my horse, I wouldn’t have spent that year in college.

Well, today, while doing a little research on network monitoring packages to see what's new out there, I ended up taking a peek at how Big Brother is doing these days. On their page, in the License section, lay another equally enigmatic phrase:

Big Brother is distributed under our Better than Free license. Clause 2 from that license determines whether you need to buy a Commercial license.

I'm sorry - what? It's better than free, but I have to worry about buying (as in, non-free) a license under some conditions? I really have no idea how to reconcile those two sentences without the application of drugs. I recommend you don't try too hard, either, or you risk at minimum a migraine.

Let's just look at the first half. Better than free? So unless they're paying me to use their software, then not only is it free, but they throw in something else, too, like cake. Now, it is true that you can download and start using the software at no charge, but there are some strings attached. To be fair, let's compare their version of free to that famous poster child of free software, the GPL.

  • Duration: Better than Free gives you 30 days, and then you have to buy a commercial license. The GPL is perpetual, until the terms are violated.
  • Source code: Better than Free gives it to you only if they feel like it. Under the GPL, if you obtain a copy of a GPL binary, you are entitled to the source code that generated it.
  • Derivative works: Better than Free has the standard commercial no-modifications, no-reverse-engineering, "hands off!" clause. Ensuring that each user has full rights to create, modify, and distribute derivative works is the entire purpose of the GPL.
  • Termination: Better than Free is terminated on breach, or also with a 30 day notification on the web site. The GPL terminates only on breach of terms.

Uh-huh. So you've only got 30 days, you're not guaranteed the right to delve into and modify the software, you can't give it or any changes you make to anyone else, and they can change the terms on you whenever they like. And no cake.

If that's "better", then I'll stick with plain old GPL and boring old "just free".

Sunday, March 30, 2008

Microsoft and "Vista Ready"

There have already been oodles of articles out there talking about the "Vista Ready" vs. "Vista Capable" fiasco. Simply put, there's evidence (including internal emails) that Microsoft lowered the standards required to meet Vista Capable. This resulted in machines with the Capable sticker barely being able to run Vista at all, let alone certain advanced features (most noticeably the Aero interface), and in much user annoyance and confusion.

This gives me a good opportunity to point out a common misconception about Microsoft. One that most people have, and which leads to a great deal of confusion about why Microsoft does what it does.

The misconception, simply stated, is that people think Microsoft makes software. It doesn't.

Now, now, I know what you're all thinking. What about Windows? and Office? and SQL Server, and MS Money, and all of the other Microsoft products lining the shelves at Best Buy? Okay, so Microsoft also makes advertisements. So is it an ad company? How about a payroll company, since it pays its employees?

My point is, those boxes of bits are, when you really get right down to it, in the same category as ads and pay stubs - nothing more than a means to an end. And that end is, of course, money. (The green paper stuff, that is, not the aforementioned program.) In other words, Microsoft doesn't make software; it makes money through its expertise at making software.

So why should you, as a random consumer of Microsoft products, care? Because each and every decision it makes will have an implied footnote, a hidden subtext that reads like a banner out of Office Space: "Is this good for the company?" Each potential action will be weighed based on how much money it makes, or loses.

Sure, there will be plenty of consideration about what's good for customers, but let's face it - if Microsoft went out of its way to screw over consumers, it would have a difficult time convincing those same customers to give it money. Beyond that, Microsoft is a big company with a lot of people in it, and no doubt quite a few of them really do try to do right by their customers.

But in the end, like every other publicly traded corporation, Microsoft has to answer to its shareholders. And each and every decision is evaluated, not on how popular it is, or on technical merit, or whether it follows standards, or even ethics - but on what it does for the bottom line.

Friday, March 14, 2008

Energy

Looking back at the small bits of noise I've added to that gossip factory we call the Internet, I can see that I've only really bothered to talk about tech stuff, and somewhat esoteric bits at that. So, I've decided to change bandwagons mid-stream (to quote Eric Raymond, I like my metaphors shaken, not stirred) and talk a little bit about environmental issues.

More specifically, I'd like to talk about what is, at least in the long run, the single most important issue: energy.

Everything we do, from research, to cooking, to transportation, to taking a walk around the room, requires energy. As our societies become larger and more technologically sophisticated, we require our energy be delivered both in greater total quantities and in higher density packaging. A millennium ago, livestock and crops were enough; a century ago, modest amounts of petroleum products and electricity sufficed; now it's all we can do to keep energy production matching pace with ever increasing demand.

One of the hot topics that obviously flows from this discussion is where we should be squeezing all of that energy from. So what choices do we have?

  1. Oil
  2. Coal
  3. Wind
  4. Water (hydroelectric dams)
  5. Nuclear
  6. Hydrogen
  7. Solar

Now all of these various sources have advantages and disadvantages. Water and wind are clean and renewable, but hydrocarbons have far higher energy densities. But what happens when we step back and take a longer term view? And when I say "longer", I'm not talking about the "not one, but two quarters ahead!" view that seems to dominate most companies and public debates, but a true, seven generations out view.

Let's start with hydrogen. Hydrogen does occur naturally, but not in huge quantities. The big challenge with hydrogen isn't using it, it's creating it. Hydrogen is not an energy source, it's an energy transport. It's one solution to a huge piece of the problem, but it's not a source. Since we're only talking about energy sources, let's vote it off the island and see who's left.

  1. Oil
  2. Coal
  3. Wind
  4. Water (hydroelectric dams)
  5. Nuclear
  6. Solar

Let's go pick on nuclear power next. Now, if handled right, nuclear could potentially offer up quite a bit of power for quite some time. A string of properly set up breeder reactors can pass material down from one to the next, extracting additional energy at each stage. But even assuming that somehow, someone could muster up the political and financial capital to make it happen, there is still only a finite amount of glowing rocks lying around to throw in the reactors. Once those pockets are used up, it's done, so let's throw it off our list too.

  1. Oil
  2. Coal
  3. Wind
  4. Water (hydroelectric dams)
  5. Solar

Now let's go all "green" here and go after the Big Bad Carbon Producers: oil and coal. When you get right down to it, they're nothing more than dinosaur and plant extract. And where did the stored energy that we pour into our gas tanks every day come from? Why, the sun, of course, as any middle school level science textbook could show you with one of those neat little diagrams with arrows pointing in circles, and a picture of the sun off to one side pumping energy into the picture of plants. So since they're really just pockets of condensed sunshine (energy-wise), let's consolidate the list further by taking those fossil fuels off.

  1. Wind
  2. Water (hydroelectric dams)
  3. Solar

That list is getting mighty short, isn't it? But hey! At least what's left are all clean, renewable resources, right? But wait a minute. Wind and water are great, but what makes them move? What drives them? Or, as an actor would say, what's their motivation? Well, for wind, it's heating and cooling caused by - c'mon, guess - that's right! It's the sun again.

Water? We get energy out of water falling downhill, but something has to push that water uphill in the first place to store up that potential energy. More specifically, on the scale we're talking about, something has to evaporate it so it can condense into rain that lands at a higher altitude than it evaporated from. Which implies heating, which... yes, yes, it's the sun again. So, now our continually shortened list.

  1. Solar

And there you have it. All of the various energy sources we argue over, discriminate between, and tweak to squeeze more out of, are either solar, concentrations of stored solar, or finite resources doomed to run out, short of mining other planets in the solar system.

Now, I'm quite aware that solar power, as it currently exists, has problems. Its output is heavily influenced by weather conditions, efficiencies are still relatively low (especially when compared against the energy in a gallon of gasoline), and it's only in the last decade or so that a solar cell could be expected to produce more energy over its entire lifetime than it took to manufacture it. Its output is also limited to electricity or simple raw heat, and again, we don't have any kind of batteries that can compare to the energy transportation and storage of petroleum products.

(Some companies are looking at ways of making hydrogen more easily used by binding it up with other elements to make it more stable, such as carbon. Which gets you volatile hydrocarbons - aka, petroleum products!)

So in the end, we really don't have much of a choice. We can take advantage of little caches of energy, stored in plutonium or crude oil. We can pick the path we take to get to solar energy, whether it's through an intermediate, such as manufacturing hydrocarbons, or direct, such as boiling water or solar cells. But in the end, the sun is really the only source of energy that's going to hang around long enough for us to pretend that it's going to last forever.

(At this point, the pedants out there will point out that eventually, the sun will let us down by expanding out and destroying the earth, rather than providing us with a gentle stream of life giving radiation. I, for one, fervently hope that the human race is around long enough to have to worry about this.)

Sunday, March 9, 2008

(At Least) One Of These Things Is Not Like The Others...

I just recently read Freakonomics, an interesting book on economics, with ideas put forth by a guy who economists claim is more of a sociologist, while sociologists claim he's an economist. The book throws out, and makes a stab at answering, bizarre questions about seemingly unrelated topics, like "how are teachers like sumo wrestlers?"

So, in the same spirit, I'll start off this post with the same kind of question: What do Freakonomics, Netflix, and my last hospital visit all have in common?

Now, unless you've been stalking me, I really wouldn't expect you to guess how my last hospital visit comes into play, so I'll give you a hint. My hospital is well into the process of converting from thick, massive folders of paper records, over to digital records with a PC in every exam room. While the nurse was going through medical records and scheduling procedures, she apologized for taking so long, and complained that the software layout made no sense for her field, and obviously wasn't designed by someone who knew it.

Figured it out yet? One more hint - the challenge Netflix is currently running to find a better movie recommendation algorithm looks like it just might be won, not by some MIT team of CS majors, but by a psychologist!

Every field, be it computer science, psychology, or medicine, has boundaries. Various ideas and concepts get sorted into the right field based on those lines. An algorithm for traversing a graph? CS. A study of the effects of a new drug? Medicine, further narrowed down by specialty.

There's a problem with those lines, though. No one told the problems we're trying to solve about them!

In the first two examples, exceptional results were found by doing work that happily straddled those lines. In the Netflix example, without psychology, he likely never would have had the insight required, and without CS, he never would have been able to actually implement it.

Likewise, in the hospital example, the fact that the software engineers who created the software weren't intimately familiar with the actual job led to a system that didn't match the workflow. Instead of being helped, the nurses end up stumbling around looking for options and fighting the system.

We've all heard the joke about a bunch of blind men who stumble across an elephant, and try to figure out what it is by feeling it: "It's a snake!" "No, it's a tree trunk!" "No, it's a wall!" Well, guess what? We're all a little guilty of it now and then. It's only human to try to look for solutions within the one or two fields we're comfortable in.

So what should we do about it? Recognize that the sum of human knowledge may be sorted by the Dewey Decimal System, but it is not defined by it. Read outside of your field, and see what kind of tricks those guys who went to college in a different building may have up their sleeve. Working in a college myself, I can tell you that it's not too uncommon for someone who's sacrificed any pretense of breadth for incredible depth in one field to struggle with a problem solved decades ago in an apparently unrelated field.

And in the end, ask yourself which one you'd rather be - the guy winning a Netflix prize by fusing together two superficially unrelated fields, or the software engineer who gets yelled at because he used the wrong kind of chicken guts when divining a nurse's workflow?

Friday, February 29, 2008

Windows World (Slowly) Learning From Unix History

The excellent Coding Horror blog has a short article up about one way of categorizing software: UsWare vs. ThemWare. The idea is simple enough. ThemWare is software that's only used by "Them" - ie, none of the users are also developers of it. UsWare is software that is used by the developers as well as others.

Jeff comes to the conclusion - which I happen to agree with 100% - that creating software as UsWare will, all other things being equal, lead to vastly higher quality than software created as ThemWare. To help this process along, he encourages his software developer readers to work to gain the user perspective, to eat their own dogfood. This is certainly a good idea.

But as I thought about it a bit, I realized that this is only half the picture. The focus here is to give the programmers more of a user perspective so they meet user needs better. But what if things could go the other way? What if users could get more of a programmer perspective, so they actually could communicate their needs effectively? And maybe, in the case of users who have some programming experience, be allowed to help out and contribute bits of code that demonstrate what they want with far more precision than any prose description. Either way, the end result is to break down barriers, and blur the line between developer and user.

Oh, wait. There's a name for that already - open source.

That phenomenon called open source software hasn't really caught on too strongly in the Windows world, in no small part because Microsoft does everything in its power to keep all of its source code under heavy lock and key. With how much Microsoft depends on license keys to enforce paying for software, there really isn't much of an alternative for them. Even more important, I believe, is the fact that Microsoft began from square zero by selling software to non-programmers. The people using those original DOS systems didn't want computers for their own sake, they just wanted them to run their business.

In the Unix world, however, things began completely differently. While Microsoft was busy trying to sell computers to people who didn't want to know anything more about them than they had to, Unix was a programmer's playground. Researchers used Unix, often had to create their own applications, and were able to do so, since compilers were commonplace on Unix systems. Unix was an environment created by programmers for programmers, and the result is that once you begin to feel a little comfortable as a Unix user, the bar to becoming a Unix programmer is fairly low.

As the Free Software and OSS movements have propelled Linux systems into the role of successor to the Unix heritage, this trend has only become more pronounced. These days, a typical Linux system will have two or three programmer-friendly editors, an IDE, compilers for C, C++, and possibly Fortran, Lisp (if you count Emacs), Java (Sun's, or an open source alternative), and a handful of powerful scripting languages such as Perl, Python, and Ruby.

And that's just the typical stuff! For the Linux user truly interested in becoming a programmer, there are debuggers, Ada, Smalltalk, Rexx, Haskell, and countless other languages and development aids just a Freshmeat search away. With all those tools just waiting to be picked up, each and every open source user is a potential contributor, of anything from a bug fix, to feature enhancement, to documentation, all the way up to becoming a full fledged maintainer.

Jeff is absolutely right that programmers who learn what it's like to be users will end up producing higher quality software. But as long as you freeze out your users from becoming contributors, you're throwing away valuable resources that you often couldn't buy if you wanted to. And that's why Linux will always have an edge over Windows, no matter how many animations they add to Aero.

Friday, February 15, 2008

Plan For Failure

Vista "enhancements" include removing the ability to do repair installs. Screw Windows up a little too badly, and your only option is to reformat and reinstall.

RIM has an undisclosed problem with servers off in Canada, and suddenly every Blackberry everywhere goes offline.

Congress starts ramping up surveillance and blanket data retention, but never seems to worry about the fact that those same tools are equally useful for criminals.

What do these three disparate events all have in common? Simple. All of the design was built around what happens when things go right, not wrong. All three cases display a horrific lack of preemptive failure analysis.

Failure analysis is something that is taught to the more established professions, such as mechanical or civil engineering. In these professions, where a screw-up frequently can mean people die, worrying about what happens when - not if - something breaks is beaten into students until they think about it the way a deep sea diver thinks about his air supply.

When a civil engineer designs a bridge, he can easily end up putting in thousands of pieces. Some pieces, when they fail, are rather unimportant. If the dedication plaque rusts or falls off, a donor may be upset, but the operation of the bridge isn't compromised. On the other hand, if a rivet or weld holding a support in place cracks, then the engineer who signed off on the design is going to be very interested in what will happen. Will the bridge hold for a year? Six months? A day?

Every part has an MTBF (a mean time between failures). Knowing when a part is likely to fail is important, but knowing what will happen when it does fail is every bit as important. Oftentimes, an early analysis can find hidden critical dependencies that can be fixed or mitigated with simple design changes.

Take Vista's removal of repair installs. Strictly speaking, removing this feature didn't add any failure modes. Unlike a new driver or filesystem, it didn't add any new ways for an existing Windows system to break. What it does is ensure that once a failure beyond a certain threshold does happen, the impact goes from being recoverable to being a death sentence for that copy of Windows. Without adding any new failure modes, the number of critical failures just went up.

Now, if you ask the people who put these systems together, I highly doubt that any of them intended for these systems to fail. This seems obvious... but it's also the problem.

Every system out there will have a failure sooner or later. Let's be fair to Microsoft and give them credit for a plus side here. All Blackberries have their data go through RIM servers, despite having a perfectly good data connection from the cell provider. This adds a wonderful single point of failure. By contrast, Microsoft-based smart phones don't need any such assistance. They're perfectly capable of talking on their own, without an extra translator.

Microsoft could take their entire infrastructure offline, and the phones wouldn't care. By keeping their own servers out of the data path, they've reduced the number of failure modes of Windows Mobile phones out in the wild.

If we programmers and IT guys want to be taken seriously, we absolutely have to start planning for failure. Throwing redundant servers at problems reduces the likelihood of failure, but doesn't reduce it to zero. RAID protects you against a single hard drive failure, but not multiples.

We have to start asking ourselves, with each and every component we build or install, what will happen when this system breaks? That's how you notice things like a pair of high end servers both plugged into the same $4.95 ValuePak power strip. That's how you put in exception handlers that, when that exception that can't possibly happen happens, at least ensure the program goes down gracefully instead of exploding with a corrupted database.
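
For the exception handler case, I'm picturing something along these lines (a bare-bones sketch; the rollback hook is hypothetical and stands in for whatever your own storage layer needs):

    import logging
    import sys

    def main():
        ...  # the actual work goes here

    if __name__ == "__main__":
        try:
            main()
        except Exception:
            # The "can't happen" case just happened. Log everything we know,
            # undo any half-finished work, and exit cleanly instead of letting
            # a partially written transaction corrupt the database.
            logging.exception("Unrecoverable error, shutting down")
            # db.rollback()  # hypothetical cleanup hook
            sys.exit(1)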

That's how we can start building systems where a single, simple stupid failure doesn't turn into a headline generating, career limiting fiasco. Then maybe those civil and ME guys will stop snickering whenever one of us calls himself a software "engineer".

Thursday, February 7, 2008

WTF is Google Thinking?

Google. The projects they do, the reactions they provoke, even the cooking in the cafeteria - whatever they do almost always ends up being big. Unfortunately, with their latest "It seemed like a good idea at the time!", they're most likely about to piss off even more IT staff than when Google Desktop started copying files onto Google servers indiscriminately.

The description from the press release sounds innocuous enough:

Google (NASDAQ: GOOG) today announced Google Apps Team Edition as the simplest and fastest way for groups of employees and students to collaborate within an organization using Google Apps.

But then they go on:

Once users verify their business or school email address, they can instantly share documents and calendars securely without burdening IT for support.

Ars Technica had it right when they described this as Google trying to "sneak Team Edition suite past IT help desk". To those IT help desks Google is referring to, this is roughly like working to bring new and exciting drugs to market without burdening the FDA, or opening a new restaurant without burdening those poor health inspectors.

The problem is, Google is offering to host some set of end user data, but those end users quite simply lack the ability to evaluate whether or not Google is a suitable custodian of that data. Random end users shouldn't be expected to make those kinds of evaluations on their own. After all, why should an accountant worry about going over the technical details of colocation and outsourcing, such as key escrow management, encryption, and so on, when you already have an IT department to worry about them?

In any decent sized company, this is how things are supposed to work. The business side of the house sets the priorities, then passes the goals and requirements off to the IT side of the house, which picks the best solution based on suitability and technical merit. Management sets the why and what; IT decides the how.

Google, on the other hand, appears to be trying to take that away. Now, I'll be the first to say that expanding the online Google tool suite is great. And adding in collaboration features is a pretty obvious next step.

But damnit all, Google has a responsibility to make sure this loaded gun is at least pointed in the right direction! If you want to sell liquor, fine - but that doesn't mean you should open up shop across the street from a high school. In the last story I heard of where users decided to go off and create a working solution on their own, the end result included an SSL-free commerce web site, with credit card numbers tossed around in plain text email to be typed in. Collaboration definitely sounds like a powerful tool in the right hands, but IT still has to have a prominent role in picking which tool to use and how to use it.

Now I'm sure that the good folks at Google never intended to have sensitive data, like business plans or credit card numbers, passed around. The problem is, to an ordinary user, only moderately technically literate, the only difference between storing that top secret business plan on a secured server and Google docs is which bookmark they click on.

In a managed corporate IT environment, the IT and business sides of the house have a close working relationship. The IT side understands enough of the business side to create a working system. At Boeing, the IT staff understand that plans for new airplanes are highly sensitive, and so they can set up servers and encryption to protect those plans, and train users in how to use them to protect the data. With Google, however, you get what they offer, and that's it. If Google Apps doesn't meet your needs, you either end up with a hole that Google Apps can't fill, or even worse, with data left inadequately protected.

So the next time that someone who has no chance of understanding the implications of the fine print in the acceptable use policy goes off and leaks the company crown jewels by clicking the wrong checkbox in a Google app, will Google accept any of the blame? Or even more importantly, any of the responsibility of cleaning up the resulting mess? Tracing the extent of data leaks? Buying credit protection for identity theft victims?

Somehow I suspect that Google won't mind burdening the IT help desk with that half of the job.

Sunday, January 13, 2008

CS Majors Need Not Apply

Usually, I like Coding Horror. I just read a post, though, where he argues that CS majors should be taught more software engineering. He quotes CS students who have never been formally exposed in their entire undergraduate program to things that the professional field lives and dies by, such as deployment management and revision control. But then, he goes one step too far, right off the cliff:

If we aren't teaching fundamental software engineering skills like deployment and source control in college today, we're teaching computer science the wrong way.

Sorry Jeff, but I'm going to have to call you on this one. If we're teaching fundamental software engineering skills like deployment and source control, we're not teaching computer science at all - we're teaching software engineering!

Think about cars, and the people who design them. Typically, they went through a Mechanical Engineering degree. While this means that they did get a solid grounding in some underlying physics, such as heat transfer, stress analysis, and material properties, the focus is on how to apply those areas to the real world. Complicated, precise physics formulas are replaced with approximations and tables designed to quickly and easily give an answer that may be less accurate, but errs on the side of safety.

On the other end are the guys who do the real hard-core science - physics. Mostly done on chalkboards and computers, these guys only delve into the real world to gather data or test out a hypothesis. There are quite a few good physicists out there who could explain to you in great detail how and why a tire has a particular amount of traction on asphalt, but couldn't actually change a tire if their lives depended on it.

The important point here is that even though ME and physics both deal with the properties of physical things, there is still a distinction between the abstract, research-oriented side and the dirty, messy, practical side. This is a distinction which most of the computer "science" majors out there seem to pretend doesn't exist.

Most of true computer science doesn't even have anything to do with computers. Take Big O notation. In computer science, whether algorithms take an hour, a day, or a month, as long as they scale linearly as the size of the input goes up, they're all O(n). Trying to argue to a customer that they should be considered equal in any way, though, is likely to make for a short career as a programmer.
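
If you want to see the gap between the theory and the customer, here's a quick sketch of two functions that are both O(n), and that no user would ever call equal:

    import time

    def count_items_fast(items):
        return sum(1 for _ in items)       # O(n), tiny constant factor

    def count_items_slow(items):
        total = 0
        for _ in items:
            time.sleep(0.001)              # still O(n), but 1 ms per element
            total += 1
        return total

    # On a million-element list, the first finishes in well under a second and
    # the second takes over fifteen minutes. Identical in CS class, worlds
    # apart to the customer.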

The harsh reality is that most companies advertising for computer science majors don't really want computer science majors. Sure, they want someone with a good knowledge of algorithms, but - as Jeff pointed out - the ability to use version control is at least as important. Grungy skills, such as creating crash dumps that allow you to get good diagnostics info about customer problems without having to ship them custom builds, while utterly boring from a pure CS standpoint, are worth their weight in gold outside of academia.

This isn't to say that software engineers shouldn't have a grounding in CS theory. There's going to be a lot of overlap. The difference is one of focus. Once we accept that there are really two majors trying to fit into one curriculum in most schools, we can stop trying to make one size fit all, and stop trying to turn out physics majors that we expect to be able to design a camshaft.

Sunday, January 6, 2008

Typing Puppet Strings Onto Your Servers

Just like a good Perl programmer, a system administrator should strive for a certain degree of laziness.

Now, this is not the kind of laziness that leads one to think "Eh, I'm not going to bother installing that update." No, this is the form of efficient laziness that says "I could download and install that update, but there's got to be a way to get it done automatically without wasting my time." These are the kind of people who have libraries of shell scripts and packed cron jobs.

Now, those libraries of shell scripts are great, but they can be an awful lot of work to write and maintain. Not very lazy at all! So, rather than going that route, I've been working with (and on) a package called Puppet.

Puppet is a client/server package written in Ruby. Essentially, you configure the server with a description of what you want all of your machines to look like. The clients get pointed at the server, pull all of their settings down, and make them happen.

It's got a decent library of native types (such as packages, files, users, etc.) right out of the box. If you need something that's not covered, it's fairly straightforward to write your own custom code (assuming you know Ruby) to extend what kinds of files and settings Puppet is able to directly manage. Thanks to some good helper libraries, I was able to whip up a custom module that allows me to manage entries in /etc/sysctl.conf in only 59 lines of code!
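
If you've never seen Puppet's declarative language, here's roughly what a manifest looks like (a made-up minimal example, not from my actual config): keep a package installed, manage its config file from an ERb template, and restart the service whenever that file changes.

    # Hypothetical manifest sketch - names and paths are illustrative only.
    class ntp {
      package { 'ntp':
        ensure => installed,
      }

      file { '/etc/ntp.conf':
        ensure  => file,
        content => template('ntp/ntp.conf.erb'),
        require => Package['ntp'],
      }

      service { 'ntpd':
        ensure    => running,
        enable    => true,
        subscribe => File['/etc/ntp.conf'],
      }
    }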

Some of the cooler features of Puppet:

  • All communication is XML-RPC based, making it easier to write custom programs that communicate with Puppet
  • Collections of facts about client systems (OS, OS version, etc) are reported back to the server and can be stored in a database
  • Defines and Exec allow you to create complex configurations without writing any Ruby code
  • ERb templating system (the same one of Ruby on Rails fame) allows you to generate complex configuration files with per-host settings

Ask anyone who manages large numbers of systems - hundreds, or thousands - and they'll tell you that the ability to automatically manage systems from provisioning to decommissioning without manual intervention is absolutely essential. Whether it's built in, like GPOs in Windows, or an add-on package like Puppet, trying to manage any more than one or two systems without this kind of help is just making more work for yourself.

And that's not very lazy at all.