14

I think the world now programs in English-based programming languages not only because of historical/economic circumstances, but also because English morphology in particular has some properties that suit algorithmic thinking best. But anyway, it would be interesting to hear your opinions on this, especially if you are multilingual yourself.

I've seen some mentions of German-based languages (see Plankalkül, for example: in fact the first ever programming language, which we know very little about thanks to WW2), and also a Russian-based flavor of Algol which existed back in the '80s, at least on paper; I'm not sure whether it ever existed in binary. Both looked a bit sluggish because they had more shortened words and weird abbreviations than full words, unlike the EN-based languages. So if you know of any other national-language-based PLs, even completely archaic and irrelevant today, purely theoretical or whatever, it would be interesting to take a look at them.

And back to the main question: what, if anything, makes Shakespeare's language so good for programming?

(There is actually a list of Non-English-based programming languages on Wikipedia (of course, where else?), but it would be interesting to hear opinions of native speakers of those languages on what a given "national" programming language really feels like.)

mojuba
  • 5,713

8 Answers

18

Disclaimer: My native language is German.

I don't think English is any better than other natural languages as a source of keywords. I do think it's the one all-important language in IT, though not because of its linguistic properties: most tech people speak it to some degree, it's the native tongue of quite a few important people in the field, most tech-related terms are already English, etc.

But since we are talking about programming languages, not about documentation/API/names/etc., I have to object: programming languages are not based on English, or on any other natural language for that matter. Programming languages are formal languages. They do use, to varying degrees, a handful of words from (usually) English. Some even try to mimic its grammar, but utterly fail to read like English regardless. To add insult to injury, they associate only one single meaning (in rare cases a handful of meanings) with each word they borrow. Often, this meaning is very jargon-y, specialized, or based on a questionable analogy. Therefore, knowing the myriad natural-language meanings of a word borrowed by a programming language doesn't really help in understanding the programming concept behind the keyword. Examples off the top of my head: array, type, goto, class, void. (Fun fact that sprang to mind as I re-read the question: all of these, except goto, have German translations which are at most one character longer: Feld, Typ, Klasse, Leere. They all sound weird to me, but that's probably a matter of habit.)
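To make the "keywords are just borrowed tokens" point concrete, here is a rough sketch in Python. The German spellings in `GERMAN_KEYWORDS` and the `translate` helper are made up for illustration; no real language uses them. The idea is only that swapping the keyword spellings token by token leaves the formal language untouched.

```python
import io
import tokenize

# Hypothetical German spellings for a few Python keywords (illustration only).
GERMAN_KEYWORDS = {
    "definiere": "def",
    "wenn": "if",
    "sonst": "else",
    "solange": "while",
    "gib_zurueck": "return",
}


def translate(source: str) -> str:
    """Swap the hypothetical German keywords for Python's own, token by token."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if tok.type == tokenize.NAME and text in GERMAN_KEYWORDS:
            text = GERMAN_KEYWORDS[text]
        # Two-element tuples make untokenize() rebuild the spacing itself.
        tokens.append((tok.type, text))
    return tokenize.untokenize(tokens)


german_source = (
    "definiere betrag(x):\n"
    "    wenn x < 0:\n"
    "        gib_zurueck -x\n"
    "    gib_zurueck x\n"
)

namespace = {}
exec(translate(german_source), namespace)  # runs as ordinary Python
print(namespace["betrag"](-3))
```

Only the spelling of five tokens changed; the grammar, the indentation rules and the meaning of each construct are exactly what they were before.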

8

English is the lingua franca of programming.

From the same article:

It's nothing more than great hackers collectively realizing that sticking to English for technical discussion makes it easier to get stuff done.

Matt Ball
  • 457
5

The English language is favorable because:

  1. Better fits the restrictions of current peripherals.

Ease of typing. You can use a standard keyboard. I know this sounds like "llama dung", but have you tried typing in Chinese? There are thousands of characters, and since Chinese doesn't have an adequate "character" building technique to fit the concept of a keyboard, it wouldn't be easy to learn for a global audience.

  2. Can be morphed into non-words that are equally recognizable. English favors abbreviations due to the lack of accent symbols.

Shortened English words are recognizable symbols. One doesn't have to learn the entire English language to code, so people from outside the language can learn fast.

  3. And consider that common programming languages feature more mathematical symbols and fewer words.

Assembly used small words that had no sentence structure. Then came languages, like COBOL and FORTRAN, which attempted to accommodate English sentence structure as much as possible. Newer languages relied more on universal algebraic symbols, because they had better predictability. (In COBOL: Add X To Y, Subtract Y From X, Compute Y = X + A. Compute makes the previous statements unnecessary and reduces language parsing complexity.) It wouldn't take much more for me to consider languages like C++ to be more symbolic than language-based. There's a little bit of a return to word-based programming with C#, but that's mostly to have baked-in support for popular programming patterns.
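As a rough illustration of the COBOL point, here is a toy Python sketch that rewrites the three statement shapes mentioned above into plain algebraic assignments. The `PATTERNS` table and the `to_symbolic` function are invented for this example; this is nowhere near a real COBOL parser and ignores everything else in the language.

```python
import re

# Toy patterns for three COBOL-style statements (not a real COBOL grammar).
PATTERNS = [
    (re.compile(r"ADD (\w+) TO (\w+)", re.IGNORECASE), r"\2 = \2 + \1"),
    (re.compile(r"SUBTRACT (\w+) FROM (\w+)", re.IGNORECASE), r"\2 = \2 - \1"),
    (re.compile(r"COMPUTE (.+)", re.IGNORECASE), r"\1"),
]


def to_symbolic(statement: str) -> str:
    """Rewrite one word-based statement as an algebraic assignment."""
    for pattern, template in PATTERNS:
        match = pattern.fullmatch(statement.strip())
        if match:
            return match.expand(template)
    raise ValueError(f"unsupported statement: {statement!r}")


for line in ["Add X To Y", "Subtract Y From X", "Compute Y = X + A"]:
    print(f"{line:<20} ->  {to_symbolic(line)}")
```

Each sentence-like form needs its own pattern, while the Compute form passes straight through, which is roughly why the symbolic notation is cheaper to parse.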

Conclusion:

Ultimately, peripherals limit us to a character-based language (like English). Also, western languages have better support for mathematical concepts (like the concept of 0, among others; China borrows numerals in lieu of its own representation of numeric values to better convey numbers, because they are shorter to write on average). Other than numerical values, I would see symbol-based languages (Chinese) as better suited to programming language morphology than English, as most modern languages already use symbols, and the learning effort would be the same for everyone. However, we'd have to impose a C++-like structure; blocks of symbols would not be easy to read for most people in the world.

5

The only reason that English is widely used in computing is that it happens to be a widespread language right now.

If computers had been invented 2000 years ago, they would have used Greek. If they had been invented 200 years ago, they would have used French. If they were invented 200 years from now, they would probably use Chinese...

Guffa
  • 3,010
3

Here are some benefits a hypothetical programming language would have if it just assumed the English - Latin alphabet.

  1. It's a small character set (unlike say Kanji)
  2. Doesn't usually use diacritical marks (unlike French, Spanish, German, etc.)
  3. Every Upper case has a lower case (unlike the German Eszett)
  4. Its collation is straightforward

All of these things are problems that still haven't been properly solved on all devices. For example, song titles with diacritical marks don't show up correctly on a number of music players.
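Two of the list items (the Eszett and collation) are easy to see in a few lines of Python. This is only a small sketch using built-in string methods; the helper name `survives_case_round_trip` and the sample words are made up for the example, and the non-ASCII letters are written as escapes so the exact code points are unambiguous.

```python
def survives_case_round_trip(word: str) -> bool:
    """True if upper-casing and then lower-casing gives back the same word."""
    return word.upper().lower() == word


street = "stra\u00dfe"  # the German word for "street", spelled with an Eszett

# Python upper-cases the single Eszett to the two letters "SS".
shouted = street.upper()                      # "STRASSE"
german_ok = survives_case_round_trip(street)  # False: comes back as "strasse"
plain_ok = survives_case_round_trip("street") # True

# Default sorting compares code points, so capitals sort before lowercase
# letters and the a-umlaut word lands after "zebra".
words = ["zebra", "Zulu", "apple", "\u00e4pfel"]
by_code_point = sorted(words)

print(shouted, german_ok, plain_ok)
print(by_code_point == ["Zulu", "apple", "zebra", "\u00e4pfel"])
```

Getting a dictionary-style order needs locale-aware collation on top of this, which is exactly the extra machinery a plain A-Z alphabet lets you skip.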

Conrad Frix
  • 4,165
2

I'm not so sure that programming languages themselves benefit from being based on English. In explanation:

  • The names of methods, variables, objects, etc. don't matter to the computer
  • The content of a comment doesn't matter to a computer
  • The logic of programming is expressible in most (and I assume ALL) spoken languages.

So, if English is of benefit to programming languages, it would be in helping to cause more people to use the programming language. In that regard, here are a few thoughts:

  • Many people in foreign countries learn English whether or not they want to program
  • People designing a computer language and wanting many people to use it will generally pick the most well-known spoken language to describe it.

Summarizing these thoughts, I don't think English really helps programming languages in any significant way that most other languages couldn't offer, other than that many people speak it.

John Fisher
  • 1,795
1

In my opinion, English simply has a richer technical and mathematical vocabulary than many (but not all) other languages. The languages that lack such vocabulary use English loan words to get the job done. This alone is a compelling reason to orient programming languages toward English.

Regarding the languages that do have a sufficiently rich vocabulary to describe everything we need to describe without resorting constantly to English loan words, the tradition of English as the lingua franca (common tongue) for the sciences is in itself compelling, but our alphabet gives us another little leg up:

  • English can be represented in a smaller character set than, for example, Chinese, Japanese, Cyrillic, or even Romance languages that use accented Latin characters.
  • The English alphabet, largely due to its complete lack of accented characters, is very visually clear. We have enough bugs due to mismatched brackets or missing semicolons; it would be foolish to add problems discerning between 'ē', 'ĕ', and 'ě'.
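To show that the look-alike problem is real and not just hypothetical, here is a small Python sketch. Python 3 accepts non-ASCII identifiers, so the three letters above can be three different variables. The letters are written as escapes here so that their exact code points are unambiguous; the variable names are otherwise arbitrary.

```python
import unicodedata

# The three look-alike letters, as explicit code points.
lookalikes = ["\u0113", "\u0115", "\u011b"]

# Their official Unicode names are the only easy way to tell them apart.
for ch in lookalikes:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Python treats each one as a separate, valid identifier.
namespace = {}
exec("\u0113 = 1\n\u0115 = 2\n\u011b = 3", namespace)
distinct_values = [namespace[ch] for ch in lookalikes]
print(distinct_values)
```

Three variables that differ only by a macron, a breve and a caron would be legal code, and a miserable bug hunt.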
HedgeMage
  • 4,303
1

For some fun reading on the context of language and how we end up grokking things:

The Stuff of Thought by Steven Pinker

Remember, we're talking about the language construct, not how we communicate the information (the two are not one and the same). I've worked with code where all the variable names were in German, and the code was no harder to understand. English doesn't inherently have anything that suits it better for programming. If we go directly off how our language is structured, it is probably worse, not better, and this could honestly be for many reasons:

  1. Lack of structure (we can put subject/predicate/nouns/adjectives wherever we please),
  2. Mutability of our words (we feel like we can abbreviate ANYTHING)
  3. BIG ONE: Someone who knows English fluently won't have any better understanding of a section of code than someone who has no understanding of the English language.

Asking why programming languages use "English" is like asking why the periodic table still uses the letter 'W' to identify tungsten: most people can't tell you why unless they know the history. And if you want the history of programming languages, we should go back to punch cards, byte instructions, and assembly.

Assembly has no major "English" constructs, but it's as close as you can get to machine code without hating yourself. Further, all structural elements of higher-level languages can be, and regularly are, implemented in it by those of us crazy enough to enjoy it. LD, MV, ST, BRA, and the rest of the instruction set look nothing like English, but I can read it perfectly and get the full meaning.

We assign the same meaning of the LD or MV in assembly to higher-level constructs. I don't need to know what a variable's name means, and in many cases won't even if it's in English, because of #2 in my list. The set of identifiers like int, str, enum, and such are a way of telling what you're working with, no more. If instead of int the identifier were seagull, we'd all know what seagull meant in a coding context, not because it's English, but because that's what the identifier covers.
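The seagull thought experiment can actually be run. Here is a tiny Python sketch in which `seagull` is simply bound to the built-in `int`; the name and the `wingspan_total` function are of course invented for the example.

```python
# Bind a nonsense name to the built-in integer type (illustration only).
seagull = int


def wingspan_total(left: seagull, right: seagull) -> seagull:
    """Add two whole numbers; the annotations say seagull, the machine sees int."""
    return left + right


parsed = seagull("42")         # converts text to a whole number, exactly like int("42")
total = wingspan_total(2, 3)   # 5
same_type = seagull is int     # True: one type, two spellings

print(parsed, total, same_type)
```

After reading two lines of this you know what seagull means here, and knowing anything about birds did not help.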

TL;DR: Programming languages, like any language, need training to understand. The reason their commands are in English instead of Spanish or German or Russian is more likely esoteric and historical than due to some necessary property of the English language making it more or less suited for the identifiers in the formal language construct.