WARNING: This article has been judged boring, and will at some point be rewritten.
How can we tell if two texts are the same?
Compare them letter by letter (space is a letter) and see if they are all the same.
Is it possible to measure the distance between them if they are not?
A mathematician's first answer would be a metric on letter differences.
So "arfle" is 1 away from "arfly".
That's a bit rubbish.
We'd like "The cat sat on the mat" to be similar to "A cat sat on the mat".
Those two phrases are so close in meaning that there are languages where it's hard to express the differences. But the first letters are different ("T" and "A"), and the second ("h" and a space), and the third ("e" and "c"), and the fourth.
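To see just how rubbish, here is a minimal sketch of the letter-by-letter metric. The name letter_distance is made up for illustration, and counting the leftover letters of the longer text as differences is an assumption, since nothing above says what to do when the lengths differ.

```python
from itertools import zip_longest


def letter_distance(a: str, b: str) -> int:
    """Count the positions at which two texts differ, letter by letter.

    A space counts as a letter. Where one text is longer, each leftover
    letter counts as one difference (an assumption made for this sketch).
    """
    return sum(1 for x, y in zip_longest(a, b) if x != y)


print(letter_distance("arfle", "arfly"))  # 1
print(letter_distance("The cat sat on the mat", "A cat sat on the mat"))  # 22
```

The second pair differs at every single position, because dropping two letters at the front shifts everything after it.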
So we might try something like an editing metric: how many operations does it take to turn one sentence into another? So saying "Swap cat for mat" is a distance of one. Insert "green" is a distance of one, etc.
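One way to make that concrete is a word-level edit distance. This is only a sketch: it assumes the allowed operations are inserting, deleting, or replacing a single word, which is one reading of "operations" among several, and the name word_edit_distance is invented for the example.

```python
def word_edit_distance(a: str, b: str) -> int:
    """Smallest number of single-word inserts, deletes, or replacements
    needed to turn sentence a into sentence b."""
    xs, ys = a.split(), b.split()
    # prev[j] holds the distance from the words of a seen so far to ys[:j].
    prev = list(range(len(ys) + 1))
    for i, x in enumerate(xs, start=1):
        curr = [i]  # turning i words into nothing takes i deletions
        for j, y in enumerate(ys, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # replace x with y, or keep it
        prev = curr
    return prev[-1]


print(word_edit_distance("The cat sat on the mat", "A cat sat on the mat"))  # 1
print(word_edit_distance("The cat sat on the mat",
                         "The cat sat on the green mat"))  # 1
```

Under these assumptions the two cat sentences are one step apart, which is much closer to what we wanted.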
Latin would translate them both as something like "Catus mate satavit", which can be interpreted as "The verb sat takes cat as its subject and mat as its indirect object, in time past and the action is completed".
Both English phrases are valid translations of the Latin phrase, and indeed both are as close as you can get to the original.
It is also possible to say in Latin "mate catus satavit". This means almost exactly the same thing, but there's a slight feeling of emphasis on the mat.
We can't do the same thing in English, "The mat sat on the cat" isn't what we meant at all. It would have to be "It was the mat that the cat sat on".
So we've made a small change to the Latin (one step). But in the English we've moved "the" and "mat" and added "It", "was" and "that".
phrase A -> translate -> phrase B -> move 1 -> phrase C -> translate -> phrase D
Alternatively phrase A -> move 2, add 3 -> phrase D.
It looks like whatever editing metrics we choose we're going to have them be unstable under translation.
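The mismatch can be counted with Python's standard difflib module. This is a rough sketch, not a proper metric: words_changed is a made-up helper, case is ignored, and a moved word shows up as one word dropped plus one word added.

```python
from difflib import SequenceMatcher


def words_changed(a: str, b: str) -> tuple[int, int]:
    """Return (words dropped from a, words added in b), ignoring case."""
    xs, ys = a.lower().split(), b.lower().split()
    dropped = added = 0
    matcher = SequenceMatcher(None, xs, ys, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            dropped += i2 - i1  # words of a that are not in a matching run
            added += j2 - j1    # words of b that are not in a matching run
    return dropped, added


print(words_changed("Catus mate satavit", "mate catus satavit"))  # (1, 1)
print(words_changed("The cat sat on the mat",
                    "It was the mat that the cat sat on"))  # (2, 5)
```

The Latin pair comes out as one word moved. The English pair comes out as two words dropped and five added, which is the "move 2, add 3" counted above.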
And yet the four phrases are obviously very close in meaning.
You can imagine a conversation where Mulder says "The cat sat on the sofa", and Scully replies "The cat sat on the mat". And one where she says "It was the mat that the cat sat on". And the information communicated is no different.
Perhaps there's another attack?
We could say "The difference between two texts is defined by the reader". Texts don't have meanings by themselves. It takes a conscious mind to measure the difference between two things.
OK, so now what is a good translation? By good I mean "conveys the same meaning".
Well, you need to take someone who can speak both languages, and ask them if the two things mean the same to them.
I'm a good speaker of English, and a poor speaker of French.
My interpretation of French is different because I am an English speaker.
When I hear the French word "boulingrin", which means lawn, I cannot help but imagine a lawn which is like a bowling green.
When I hear "C'est ma femme, actuellement", I hear "This is my wife, actually", and only later do I hear "This is my current wife", which is what the French phrase really means. "This is my wife, actually" would be "C'est ma femme, en fait".
So you can't ask me whether a French text and an English text mean the same thing. Because a French text doesn't mean the same thing to me as to a Frenchman.
Neither can you ask a good French speaker who is also a poor English speaker, for the same reason.
Neither can you ask a true bilingual, I would imagine. Because in every language there are clusters of concepts which are separate in other languages.
In English "Free Software" means software that is unrestricted, but it also means software that costs nothing.
In French, these two concepts "Software libre" and "Software gratuit" are completely separate and there is no easy way to conflate them.
It is not that the distinction is meaningless in English, which is why English-speaking Free Software loonies say things like "Free as in speech, not as in beer".
And in fact the phrase "Software libre" is taking off in English as a shorthand way of expressing an important difference in meaning that English has no easy way to express.
Notice that the problem is in the English language. English brains can hold the concept of "software libre" even before they hear the phrase.
Similarly in French "J'ai raison" means "I am right". But it also means "I am correct".
In English these concepts are quite distinct. But if you're French, you can't be wrong without your reasoning being incorrect.
Don't get me wrong. There are ways to say in French "I am morally right" and "I am correct". It is just that they're not expressed normally. As in free software being the usual term as opposed to 'liberated software' or 'costless software'. They're artificial-sounding circumlocutions.
This idea of rightness being the same thing as correctness is a very fundamental philosophical difference! France and England have different legal systems and ways of thinking about things, but I think that if the two concepts weren't as firmly separated in the French brain as they are in the English, then the differences would be very much greater. Indeed I suspect that one of the differences would be that France would have ceased to exist. There'd be too many religious wars.
But when you speak in French, you destroy the difference as the information gets passed from mind to mind, and who knows whether it gets reconstructed properly?
What about if the two languages you're translating between are the same language at different times?
When I was a boy, I was worried about the fact that Shakespeare's comedies aren't funny. There are funny bits, to be sure. But the funniest bit I can think of is the porter scene, in Macbeth. Which isn't one of the comedies.
My father told me that in Shakespeare's time, "Comedy" meant something different to what it does now. They would have called our comedies "Farces".
He told me to look, not for humour in Shakespeare, but for poetry and truth.
I was puzzled by this answer, since that seemed to leave Comedy meaning "Any play which is not a tragedy or a farce". Also it meant that Shakespeare then, as he is now, would be a very elevated sort of taste.
Not many people are going to pay to watch a TV program that they have to study in order to understand.
I don't think it's the right answer. Shakespeare's Globe theatre was on the wrong side of the river. Amongst the prostitutes and bear baiters of Elizabethan Southwark. About the nastiest place in England.
And he was a popular entertainer. He made himself rich and famous with his plays. People flocked to them, and contemporary reports have them rolling in the aisles.
Perhaps the comedies were funny. Humour fades so fast. Even the immortal Monty Python seems a little pale now. You'd need to study to understand why the Ministry of Silly Walks is subversive, or why the army's effete drilling is so shocking. Or who the Piranha Brothers are.
I can still remember why these were hideously, screamingly, achingly funny things. I can't imagine that modern kids understand at all, even when it's been explained.
In four hundred years' time, I can well imagine a father telling his son to look for the poetry and truth in Monty Python. Except that there isn't any, so they'll probably be about as well known as Shakespeare's great commercial rivals, X and Y.
It does make me wonder, if Shakespeare's plays are still good enough to be considered the best things ever written in English, how good they actually were when they were new.
But at the time they were just disposable entertainment. It's a happy accident that some friends of Shakespeare liked them enough that after he was dead they gathered up some of the old scripts and published them.
So now it looks like we might need an English to English translation process.
And what about dialects? Where I grew up, people would say "Can tha borrow us twenty pence till tomorrow?".
That's so meaningless in the Standard English of the South that the meaning is immediately obvious!
Some people deal with this by saying that the Northern version is "wrong". These people need to explain what they mean by "right". Notice that the Northern phrase uses "tha", the singular you form that Shakespeare also uses. Is that "right" or "wrong"? Maybe "right" is time dependent, and Shakespeare gets to use thee but moderns don't?
What about "Our father, which art in Heaven, hallowed be thy name"? Surely that's Standard English. It's old, but people say it every day. And even modern religious men call God Thou instead of You in their prayers. Is it right or wrong?
What about "to boldly go?" What about "That is a rule up with which I refuse to put?" (The rule up with which Churchill refused to put was never to end a sentence a preposition with).
What if all the people in the South of England were to die of some guacamole-related disease which didn't make the jump to mushy peas?
What would be right then? After all the reason that Elizabethan English isn't modern English is that the Elizabethans are all dead.
Anyway. It looks like we need a way of translating from English to English.
Almost certainly, we need a way of translating from one person's English to another person's English.
But we're not done yet. As you age, you learn new things.
If I say "The best way to handle an exception is to throw a trampoline", I'm making a point about computer science. If my fifteen year old self had said this, then he'd have meant that when you run into weirdos you should throw sporting equipment.
These two perfectly comprehensible sentiments have very different translations into French, but they probably also have very different translations into the English of almost everyone else in the world. Including me at thirty.
So, to recap.
How can we tell if two texts are the same? How can we tell how different they are?
It's all subjective.
Two texts that are absolutely identical can have different meanings to different people.
Two texts that are different can have the same meaning to one person.
Two texts that are arbitrarily close can have arbitrarily different meanings.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.
The king of France is bald.
The problem with this is that it is all horse-shit.
Of course one thing can be a translation of another thing. Of course you can read a document in French that doesn't tell you anything you don't already know from reading a document in English.
I have read Alexandre Dumas and Fritz Leiber and Douglas Adams and Edgar Allan Poe and Asterix and the Bible in English and in French.
The books are recognisably, obviously the same books. There are little differences that make it a pleasure to compare them. There are mistranslations, there are places where one language can't express the thought in the other language without going all round the houses, and thus introducing new ideas, which weren't in the original.
But it's nothing serious.
Or is it?
Dumas and Leiber and Adams are all better in their own language. The translations are poor shadows. The joy is gone.
Edgar Allan Poe (translated by Baudelaire, no less) is much better in French!
The Bible is flat and dull in French (as the Vulgate is in Latin). The English Bible (Authorised Version) is one of the most beautiful books in the language. I can't imagine a non-religious Frenchman loving the Bible.
On the other hand, there are lots of really tedious English bibles. They often have the word 'modern' in their titles. But the content is always exactly the same.
Accurate translation matters to these people a lot. Accurate translation mattered very much more to the people who translated the King James.
Asterix is hugely funny in English and in French, but the jokes are different!
When Asterix goes to visit his British cousin, he remarks that he likes his cousin's tweeds, and asks if the fabric is expensive.
The cousin replies that his tailor is rich.
This is funny in English because of the understatement. I thought it was hilarious when I was a little boy.
It is the same sort of thing that is going on when someone rants on for twenty minutes about how bad a restaurant is, and someone else says "So you wouldn't recommend it then?".
It is a form of humour that only the British understand. Even the Americans, who speak what is recognisably a subset of English, don't really get it.
In French it is funny because "My tailor is rich" is a famous useless phrase from a post-war English course that every French schoolchild once knew, like "The pen of my aunt is larger than the garden of my uncle" or "My postillion has been struck by lightning" or "My father was a war profiteer and made all his money on the black market" (seriously, my father has a German phrase book with this in).
An absolutely literal translation of the same phrase means two completely different things.
And if you read it now, you'll see both meanings at once. So I've just changed the meaning of a pre-existing text for you.
What on earth is going on?