When is a dash not a dash?
An interesting and unusually frustrating pitfall that I've stumbled into a few times:
When getting a software product ready for deployment, we often find ourselves writing a Word document with the installation steps. In the case of my current system, installing the server requires several manual steps, among them running a command-line utility to set some Kerberos security parameters.
Naturally, being the friendly and considerate developers we are, we supplied the full command line to be run on the target machine, complete with all options, parameters and variables.
Imagine our surprise, then, to learn that the command simply failed to run, claiming a syntax error in the inputs. Even when we went over to the system to see it for ourselves, we saw that a direct copy-paste from the Word document to the command line failed mysteriously. What was even stranger was that typing out the command, letter by letter, made it succeed.
The answer, as you might guess from the title of my post, is that Word tends to be a bit too smart for its own good. When I copied the command line into the document with an option parameter like this:
setspn.exe -a http/MyServerName.domain.com
it replaced the dash in "-a" automatically with this:
setspn.exe –a http/MyServerName.domain.com
Similar at first glance, but what we have on the first line is a Hyphen-Minus (U+002D) and on the second an En Dash (U+2013). Word's automatic formatting rules change the hyphen-minus to an en dash, and the command-line parser chokes on it and dies. This is almost impossible to spot; command-line fonts don't display the difference to any discernible degree. We only stumbled onto it after we had tried replacing every other element on the command line with no success.
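Since the two characters are nearly indistinguishable on screen, one way to catch the problem is to let a script report every character outside the ASCII range. What follows is only a minimal sketch in Python, not something we used at the time; the function name find_non_ascii is made up for illustration, and the "pasted" string simply reproduces the en dash from the example above.

```python
import unicodedata

def find_non_ascii(command: str):
    """Return (index, character, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(command)
        if ord(ch) > 127
    ]

# The command as typed by hand (Hyphen-Minus, U+002D).
typed = "setspn.exe -a http/MyServerName.domain.com"
# The command as pasted from Word; \u2013 is the En Dash.
pasted = "setspn.exe \u2013a http/MyServerName.domain.com"

for label, command in (("typed", typed), ("pasted", pasted)):
    hits = find_non_ascii(command)
    print(label, hits if hits else "ASCII only")
```

For the pasted command this flags a single character at index 11 and names it as EN DASH, which is exactly the detail the console font hides.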
So remember - by default, Word allows itself some freedom in altering your documents. For letter-perfect preservation, consider disabling AutoText/AutoCorrect in Word, or just store syntax-sensitive data in plain text files.
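If a command has already passed through Word, a fallback is to translate the usual typographic substitutions back to ASCII before running it. This is a rough sketch under assumptions: the mapping below is my own illustrative guess at the characters Word most often swaps in (dashes and "smart" quotes), not an exhaustive or official list, and the names WORD_SUBSTITUTIONS and to_plain_ascii are hypothetical.

```python
# Assumed mapping of typographic characters back to their ASCII equivalents.
# This is an illustrative list, not a complete inventory of Word's replacements.
WORD_SUBSTITUTIONS = str.maketrans({
    "\u2013": "-",    # En Dash -> Hyphen-Minus
    "\u2014": "--",   # Em Dash -> two Hyphen-Minus characters (assumption)
    "\u2018": "'",    # Left single quotation mark
    "\u2019": "'",    # Right single quotation mark
    "\u201C": '"',    # Left double quotation mark
    "\u201D": '"',    # Right double quotation mark
})

def to_plain_ascii(text: str) -> str:
    """Replace the mapped typographic characters; leave everything else untouched."""
    return text.translate(WORD_SUBSTITUTIONS)

pasted = "setspn.exe \u2013a http/MyServerName.domain.com"
print(to_plain_ascii(pasted))
```

A cleanup like this is a safety net at best. Keeping the command in a plain text file in the first place avoids having to guess which characters were changed.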
6 Comments
Mike said
Good point
I have been caught many times with their "smart" quotes in code but never with this
AndrewSeven said
Word seems to frequently suffer from "A little knowledge is a dangerous thing".
When you need fidelity in text content, don't use MS Word.
Guy Murphy said
It's a big problem for anybody working with XML where text might have originated in Word. There's a handful of "clever" things it does which are a nightmare when you put the text into XML.
Avner Kashtan said
Andrew: Fidelity! Exactly the word I was looking for. Thanks.
Mike/Guy: Indeed, XML is a stickler for syntax, and a case of smart-quotes or a character that doesn't suit its encoding can send you going byte-by-byte looking for the culprit.
Minecraft said
It's interesting to see this point of view. I can't say for sure if I agree or not, but it is something I will think about now.