
That is a really bad example, because English does have fairly productive pronunciation rules [1], and trying to make 'fish' come out of ghoti requires breaking them. 'gh' only occurs as an /f/ sound when it occurs at the end of a syllable; as an initial consonant cluster, it's invariably /g/. Turning 'ti' to /ʃ/ is a fairly normal palatalization, which requires a subsequent vowel, which is lacking here (consider words like 'ratio', 'gracious', or 'nation'). Even turning the 'o' into /ɪ/ relies on fairly regular vowel destressing, which there's no reason to expect in 'ghoti'--which should be pronounced per English rules, pretty unambiguously, like goatee.

There are some real issues with English spelling, like the inconsistency of pronouncing 'ea' as /i/ or /ɛ/ (consider, uh, read and read). But 'ghoti' isn't one of them, because that's a case where there's not a lot of ambiguity in English pronunciation.

[1] The worst offenders in English pronunciation are when English borrows foreign words both with foreign pronunciations and foreign spellings.



It has become a thing where folks are taught, basically, that English is not a phonetic language. It is truly mind boggling the number of college educated folks I've talked with that start to try and argue that we don't have a phonetic alphabet.

And, like, I get it. We don't have a fully regular one. But this is like the people that think we don't have a single word to describe some things, when they have to basically ignore adjectives and many many synonyms to get to that idea.

Even better when folks complain that we have different ways to refer to people from other nations. Ignoring that a large part of that is that we heavily deferred to how said people wanted to be referred to.


At least one really obvious way to know that English is a phonetic language: fantasy authors create all sorts of made up names in their books. Sure, sometimes there are disagreements over how to pronounce these names, but generally readers come up with quite similar pronunciations.


The confusion may come from the various spelling conventions in the numerous loan words. In many of the counterintuitive cases, you could imagine a more phonetic spelling. The tradition has been to preserve buffet as is, instead of rewriting it as, "buffay".

The distinction is there. English can be used phonetically. We prefer to preserve the heritage of various loan words instead.


Eh. Only sometimes.

Hearing Americans pronounce the French loanword 'niche' as 'nitch' instead of 'neesh' is cringe-inducing.

English pronunciation is just kind of a mess (especially in the US). It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language or where there is no agreed upon 'non-dialect'/standard pronunciation.


Some Americans clearly must do this, but personally, I've never heard this in my life until I saw it on a YouTube video of a British person complaining how Americans pronounce words. Obviously, your experience may vary - it's a big country.


The transatlantic dispute over "aluminum/aluminium" seems minor when you consider how English is used globally. Even within Britain, there are considerable variations.

https://en.wikipedia.org/wiki/Anglosphere#/media/File:Anglos...


The one that gets me, as an American, is nuclear vs nucular. Both have been in use, spoken and written, for decades... academics have adopted the former, even if the latter was more common in most early use. And that's just one pretty recent example.


And as a British speaker of English it amuses me that we say "fillit" steak when Americans (afaik) correctly say "fillay". There are others, but I guess there are more 'correct' British English pronunciations of words with origins like this than there are American English.


>It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language

Which is worse, being unable to correctly pronounce a word (but still being close enough to be understandable) or being completely unable to write a word?

https://globalchinapulse.net/character-amnesia-in-china/


...we all agree that the right pronunciation of "nitch" is "neesh", though, or at least I've never heard a serious argument to the contrary. People just genuinely don't know how to pronounce it because they've only seen it written.

One that still gets me personally is "hyperbole"--I know how it's pronounced but when I read it, I still say "hyper-bowl" in my head more often than not. I don't think I've ever made the mistake while reading out loud to someone yet, but it will likely happen some day and when it does I will feel very stupid.


> I've never heard a serious argument to the contrary.

Well, here you go: https://www.merriam-webster.com/dictionary/niche#did-you-kno...

> I still say "hyper-bowl" in my head more often than not.

Same. This is where diacritics would fix the problem: Hyperbolé. Although hyperbolee would also work, of course.


I'd argue that is mostly because 1) people follow audiobook or TV series pronunciations and 2) most discussions happen online and not in verbal form.

This is definitely a problem when it surfaces. For example the Stormlight Archive [1] series has two voice actors narrating the audiobook, and they don't even agree between them how to pronounce half the made up names.

[1] https://en.wikipedia.org/wiki/The_Stormlight_Archive


As someone who has listened to The Stormlight Archive (and The Wheel of Time with the same two narrators), the differences are absolutely there, but they're relatively small.

Fantasy novels predate the widespread popularity of audiobooks. It used to be quite expensive to distribute a large enough volume of audio. The old "books on tape" cost a lot of money, were frequently abridged, and only existed for the most popular titles.


"It's pronounced Jandalf!"


Reminiscent of a tweet about the death of the inventor of the GIF, who reportedly said it should be pronounced "jif" — the retweeter's comment was, "I guess he's with Jod."

https://twitter.com/andylevy/status/1506748105735159818 (not there anymore; maybe the account holder ditched Twitter)


I don’t think he could be taken seriously with a name like Yandalf.


cue GIF pronunciation war.


It's not pronounced "jraphics interchange format", therefore hard G


And yet it is gif like in gin and giraffe...


And yet there's also girl, gift, gimp, gill, gibbon, and giggle.


But but but the creator himself said it is gif like in gin and giraffe... right?

TIL: gimp is gimp and not gimp? I always pronounced this like gin.


> But but but the creator himself said it is gif like in gin and giraffe... right?

Yeah, that's what the creator said, and that's actually how I pronounce it, too. Just pointing out that "gi-" words can have both hard and soft Gs.

> TIL: gimp is gimp and not gimp? I always pronounced this like gin.

You learn something new every day!


> English is not a phonetic language.

Whoever says that English is a phonetic language does not know what a phonetic language is.

The property that characterizes a phonetic language is that you can properly pronounce a written word that you know nothing about.


English is more phonetic than not. There are a lot of words where it isn't clear what is the correct pronunciation, but if you put a random sequence of letters together there are only a few possible pronunciations, often exactly one.

I wish English was more phonetic. Spelling and pronunciation are a mess. However, the language is mostly phonetic.


There's something you speakers of non-phonetic languages cannot fully grasp, I'm afraid!

We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.

And the converse was true as well! An Italian child is able to hear the surname of a new acquaintance, or the name of the village they are from, and write it down properly. In Italian, the question "How do you spell it?" does not make any sense! Again, this is something you can not do in English. Nor can you do it in French, because in the past centuries they had ink to spare and as such they started writing down useless letters that they do not pronounce.


> We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.

We're still taught very basic phonetic rules in English. Like how vowels have a long sound and a short sound, where "ee" is the long e sound, or "<vowel> <consonant> e" triggers the long sound for that vowel. But you're also taught that many words are exceptions (e.g. bear vs beard). And you learn there are patterns to the exceptions, like how "ea," if it doesn't sound like "ee," will sound like a short e, like in "head" or "breadth," and particularly in cases like "dream - dreamt" or "leap - leapt."
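A rough sketch of how those classroom rules stack up, for anyone who thinks better in code. Everything here is a toy assumption: the function name `guess_vowel`, the labels like "long e", and the tiny exception set (taken from the words mentioned above) are made up for illustration and are nowhere near a real phonics engine.

```python
import re

# Hypothetical toy rules for illustration only; not a real phonics engine.
# Words mentioned above where "ea" takes the short e sound.
SHORT_EA_WORDS = {"head", "breadth", "dreamt", "leapt"}


def guess_vowel(word: str) -> str:
    """Guess the main vowel sound of a one-syllable word from its spelling."""
    word = word.lower()
    if word in SHORT_EA_WORDS:
        return "short e"  # memorized exception pattern
    if "ee" in word or "ea" in word:
        return "long e"  # "ee", and the default reading of "ea"
    magic_e = re.search(r"([aeiou])[^aeiou]e$", word)
    if magic_e:
        return f"long {magic_e.group(1)}"  # <vowel> <consonant> e
    first_vowel = re.search(r"[aeiou]", word)
    if first_vowel:
        return f"short {first_vowel.group(0)}"
    return "unknown"


for w in ["seed", "cake", "cat", "dream", "dreamt", "bear"]:
    print(w, "->", guess_vowel(w))
```

Note that "bear" comes out as "long e", same as "beard", which is wrong. That is exactly the kind of exception you end up memorizing on top of the rules.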

And if you do a lot of reading as a kid, you vaguely recognize in the back of your mind some words that seem to follow a different set of pronunciation rules not taught in school (e.g. rouge, mirage, entourage, entrée, matinée, parfait, buffet, memoir, soirée, patois), which you learn implicitly. I remember this as a kid, only later learning those were French.

And this lets you guess pretty well how you'd pronounce a word. Just with basic rules and a lot of input to learn from, you can guess how to pronounce pretty much anything with good accuracy, because there are rules, and even a logic to the exceptions, but the rules are overlapping, so it's more like a set of rules you choose from.

I'd liken it to machine learning. You can learn the rules without even being taught the rules, like I did in the case of French loan words. And there are probably rules we follow without even realizing it, just instinctively thinking it's the natural way to pronounce the word without knowing why.

I'm not saying it's as good as being as phonetic as Italian, but it's not like we just have to memorize the pronunciation and spelling of every word as though it were a structureless string of letters and a corresponding, unrelated sound.

Sorry for the long comment.


You can frequently do that in English too. Of course there are exceptions, but if anything it's typically because of words/names from other languages.

In my experience learning Spanish, their loan words are Spanish-ized, being made to be pronounced and spelled in a format that makes more sense in Spanish. Whereas in English, the pronunciation and spelling is usually taken more directly from the source, so you get a bunch of instances where a word's spelling doesn't really match its pronunciation.


Yes, but Italy had to centralize its language in order to accomplish this. 1000 Italian dialects were suppressed in a very heavyweight process. (And probably some people didn't like speaking Florentine, which became modern Italian.)

English is complicated because it's decentralized and there is no authority to regularize it. Which is a feature, not a bug.


You are wrong on several levels.

1 - Being fluent in the national language does not prevent people from maintaining their dialects in parallel.

2 - Whether a language is phonetic has no relation to political issues concerning dialects.

3 - Whether a language is phonetic has no relation to whether people like to use it.

4 - English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.


> English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.

That's not really true -- there is and was a great deal of dialect diversity within England itself. It was widespread printing that allowed languages to be standardized at the scale of nation-states in the first place: the divergences that developed after the age of sail were reversing convergence that had only begun a couple of hundred years earlier.

And although versions of English from the south and east of England became the basis for modern standard English, other dialects persisted and sometimes spread around the world, so some of the differences between English dialects globally are due to disparate influences from different dialects originating within the British Isles.


Being fluent in a language makes you less likely to be interested in a second one when everyone speaks the first. Over generations, this kills off the less common languages.


There is still a lot more linguistic diversity in Italy than across the entire English speaking world.

e.g. Northern Italian languages are technically more closely related to Gallo-Romance languages from the other side of the Alps than to standard Italian.


I think you're trying to argue something like: "the set of dialects that make up English has a large(r?) set of allowable IPA orthographic representations than the accepted set of English orthographies" or something to that effect? And that, perhaps, Spanish (French? Ukrainian?) has a much smaller set of alternate IPA orthographies for a given acceptable orthography?

I guess I'm really confused. It's not like English is some Arabic language where the orthography is in a second, nearly unintelligible language? Or Chinese, or Egyptian hieroglyphs...?


> I think you're trying to argue something like:

I'm arguing exactly what I wrote: a phonetic language is one where you can see a written word and pronounce it correctly, without knowing what it means and without having ever heard it before.

Edit - as an example, consider "door" and "pool": the written form is not sufficient to guess the sound to associate with the double o.
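To make that definition concrete, here is a minimal sketch of the check it implies: group words by a spelling unit and see whether each unit maps to exactly one sound. The two-word lexicon, the IPA labels (roughly a southern British accent, which is an assumption), and the names `sounds_by_spelling` and `is_predictable` are all hypothetical.

```python
# Hypothetical mini-lexicon: (word, spelling unit, vowel sound).
# The IPA labels assume one particular accent; other accents may differ.
LEXICON = [
    ("door", "oo", "ɔː"),
    ("pool", "oo", "uː"),
]


def sounds_by_spelling(lexicon):
    """Collect every sound observed for each spelling unit."""
    result = {}
    for _word, grapheme, sound in lexicon:
        result.setdefault(grapheme, set()).add(sound)
    return result


def is_predictable(lexicon):
    """True if every spelling unit maps to exactly one sound."""
    return all(len(sounds) == 1 for sounds in sounds_by_spelling(lexicon).values())


print(sounds_by_spelling(LEXICON))
print(is_predictable(LEXICON))
```

With only these two words the check already fails, because "oo" shows up with two different sounds.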


Which language is phonetic? I think you're beating around the bush here; you claim English isn't phonetic, but which language is?


This is something that should be looked up and not argued about. As far as I can remember, the vast majority of alphabetic languages are phonetic. English, French, and Portuguese are not.

Being able to guess how something is pronounced sometimes is not enough to say that English is phonetically spelled. English often borrows spellings directly from the languages that it is borrowing a word from, those spellings are usually phonetic (based on the source language's rules), and due to the presence of certain peculiar sounds, one can often guess which phonetically-spelled language a word was borrowed from. That's not an English word being spelled phonetically, that's people being forced to become language detectives. You can get lucky and guess the pronunciation of a Chinese character that you've never seen before (based on the radicals), but no one would say that Chinese characters are a phonetic alphabet.

Other than the soundalikes "b" = "v" and in Latin America soft "c" and "z" = "s", when Spanish speakers don't know how to spell a word, it's because they are also saying the word wrong when they speak.


Spanish and Italian are.


There's also Finnish.


Door and pool are pronounced the same where I am, with a drawn out double o sound. When spoken rapidly, the vowel contracts, especially in door.


The door vowel placed between P and L would make the word 'Paul' or 'pall' in most English accents. If I imagine 'door' with the pool vowel, I get something like a Scottish pronunciation of 'dour'.


dew-r pew-l


blood


Counter point, anyone that claims English isn't a phonetic language doesn't know what a logographic writing system is. Or what a gesture language is.


I'm not stating that English is anything like that. Just that it is not phonetic, in the sense that the written form of a word is not sufficient to pronounce it correctly.


That isn't what that means, though. It is not regular, it is phonetic. Indeed, your argument that there is confusion in spelling is because it is phonetic, but not regular. You know the letters in "glasses" correspond to sounding out something. In contrast to something like an emoji, :glasses:, which you don't.


I have to agree with you. With respect to emojis, English is phonetic. But this statement is as stretched as considering a diesel guzzling truck green because the fuel it burns was indeed created using solar energy.


No it isn't. Pedantically, English the language is definitionally phonetic, as it is spoken. Sign language is not phonetic, nor are things like smoke signals/traffic signals/etc.

Just as it would be silly to claim that Japanese is not phonetic. Of course spoken Japanese is phonetic. They even have two fully regular alphabets that can both express the same phonemes, but are used for different reasons. As well, they have a completely logographic set that does not relate to phonemes, even though it is used for most writing.


We're discussing features of written language: "phonetic" (or the etymologically related "phonological") is a way of categorizing writing systems by their relationship to spoken language.

> Of course spoken Japanese is phonetic

"Phonetic" is not a feature of spoken language, but of the relation between other language forms (usually, written, but you could make the same distinction for, say, sign languages) and spoken language.

> They even have two fully regular alphabets

I assume from "two fully regular" you are referring to hiragana and katakana, but those are syllabaries, not alphabets. (Romaji is an alphabetic system, though, but I don't know where you'd find a second one.)


Phonetic is absolutely a feature linked to spoken languages, though? It quite literally is relating to spoken sounds. Sign language, for example, is not phonetic, as many users of it cannot speak or hear.

Fair that I should have said they have two phonetic writing systems, decidedly not alphabets. I'm not sure the distinction is one that matters for what we are covering here?


> Phonetic is absolutely a feature linked to spoken languages, though?

It's a feature linked to spoken languages, since it is a feature of the relation of non-spoken (usually written) language to a spoken language.

But it is not a feature of a spoken language.

> Sign language, for example, is not phonetic, as many users of it cannot speak or hear.

Yes, in causal terms, the fact many users of sign languages aren't familiar with the sounds of the spoken language is a reason sign languages tend not to be phonetic, but they are not phonetic in definitional terms because the symbols in the sign language do not represent the sounds of spoken language.

But it would make no sense to call a spoken language phonetic (except maybe if it was a code for a different spoken language, in which the phonemes in one mapped to the individual phonemes, rather than ideas, of the other.)


It absolutely is a feature of spoken languages. It is in contrast to vocalizations, specifically because it is about speech and not just the sounds animals can make.

I get what you are aiming at, but phonetics is about speech. Is why you can reliably say how many phonemes different languages have. If you had to cover all vocalizations that people could do, you would have a bit more trouble.


"phonetics" is about speech, but the noun "phonetics" is not the adjective "phonetic" as applied to a language. "phonetic" is not a modifier that applies to spoken language (with the hypothetical caveat I gave upthread), and even if it was, it would have a different definition than the one that applies to non-spoken language and is about the relation such a language has to a spoken language, so trying to redirect to it in a discussion of that use of the adjective "phonetic" would be equivocation, argumentative conflation of different definitions of the same word.


It is hard for me to read this. You seem to have given up on capital letters. And sentences. I don't like criticizing run-on sentences as being indicative of bad thinking; but I do literally feel you grasping here.

I'm largely comfortable with the idea that there is something lacking in the orthography of English. Fully comfortable, even. I'm growing frustrated with how many are pushing the idea that it is not phonetic. The system is literally to convey, in writing, the words that you would speak in English. And the word "phonetic" captures that perfectly.

If you want to argue that we are building a new use of the word "phonetic" applied to writing that supersedes "orthography" and related terms, you do you. It still seems nonsensical to me and only works if you ignore that we have an alphabet that is literally used to convey speech sounds.


The issue at the start of the conversation is not about speaking or gesturing. It is about using the Latin alphabet properly (i.e. phonetically, as it was designed) or "with some imagination" as English does.


The alphabet is used to communicate the spoken words. Not the concepts or something else, literally the spoken words. Is a big part of why slang is so popular in fiction settings, as they would use the letter to convey pronunciation. Because the letters generally represent phonemes.


> but not regular.

There is "not being regular" and there is "not even trying, and getting it right by a stroke of luck from time to time".


I learned to read phonetically, sounding out the words. It worked very well. No other scheme for learning to read has worked remotely as well.


I think I was too swayed by Sold a Story; but I am heavily convinced that the non-phonics-based attempts to teach reading were a massive disaster. And not just for reading literature, but also for reading math. Without learning to effectively interact with symbols, people grow to think they either get the math or they don't.


No Professor or "expert" in the Education field ever advanced their own career by advocating for simple & obvious things which actually worked.

/s?


Yeah, English orthography is a hot mess, but it's still fundamentally phonetic and alphabetic. Just try to learn to read Japanese or Chinese, and you'll very quickly come to miss English's pile of nonsense.


> That is a really bad example, because English does have fairly productive pronunciation rules

Not really. There's no way to guess how many English words are pronounced based on the written form, unless you've heard them before. And of course the pronunciation may vary wildly based on region/country as well.

The most telling evidence of this is the existence of Spelling Bee competitions in English-language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.

There are many languages where the concept of a spelling bee competition makes no sense at all, because as soon as you hear the word being spoken, it is 100% deterministically obvious how it is written. English, not so much.

But, French is much worse!


According to this paper [0] and my own experience, it's way easier to pronounce a word in French given the spelled word than in English. It's slightly harder to spell French than English for the model of the study, but it's really close. Now, in my personal experience, I feel like French has a lot of rules while English has a lot of outliers which do not follow any rules. But my native language is French, so I am obviously biased.

[0]: https://arxiv.org/pdf/1912.13321


Yeah as far as I know, in French words are always pronounced consistent with how they're spelled. The same is not true in English. Americans complain a lot about French spellings like '-ioux', 'eau', etc. but they offer no gripe over the difference between '-ough' in 'enough' vs 'through'.

French is funny to me because the written language and the spoken language are in some ways quite different, with written French introducing considerable complexity: aller, allait, allais, allaient, allée, etc. Since the spoken context for all the conjugations is almost always clear, I'm not sure why someone introduced the extra complexity.


> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.

It's far from as bad as English, but here's a Reddit thread with lots of French words which are not pronounced as they are written. Not esoteric words either; along the lines of hier and monsieur.

https://www.reddit.com/r/French/comments/1269a2x/is_there_a_...


> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.

Whoa, very much not! I have spent the last 20 years trying to learn how to pronounce French words (my partner is a native French speaker, so I keep trying). The only somewhat consistent pattern I have found is that the last few letters of each word are often silent, but even that is not really always consistent.

I'm fluent in 4 languages but French is an impossibly tough nut to crack for me.


I disagree. For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before. And even if they use an intuitive pronunciation that isn’t identical to standard pronunciation, they’ll still be understood.

Spelling bee is the opposite direction, going from pronunciation to spelling; not a fair comparison.


> For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before.

Because pronunciation rules exist, they're just never explicitly taught and instead learned through exposure. For example, here's someone reconstructing as many of the rules as they can: https://www.zompist.com/spell.html


South Slavic languages have 1-1 mappings thanks to an engineer in disguise [1]

[1] https://en.wikipedia.org/wiki/Vuk_Karad%C5%BEi%C4%87#Linguis...


It's really not that hard. Some other examples of the same include Belarusian (albeit with Cyrillic alphabet) and Finnish.


Finnish is extremely easy, there is one sound for each letter and zero exceptions.

Spanish is also very predictable. While there are a few exceptions (like 'c' can be 'c' or 's'), they are very easy rules to follow, so never any surprises.

English and French are in the batshit crazy category. It's pretty much all random, you just have to know from memorization.


French is very hard to write correctly, but it's pretty easy to read - while rules are weird, they are also quite consistent.

English is hard to both read and write.


> But, French is much worse!

Nah. Having learned both, French is easier in this regard. It is not as random; it has rules that work most of the time.


French has much stricter rules, but I could see how the abundance of silent letters would make a spelling bee harder.


French also has some weird gotchas, e.g. "la démocratie" where the spelling represents the word's root rather than pronunciation.


An even worse example is the last name "de Broglie", which I think most French natives would likely get wrong

https://www.youtube.com/watch?v=k45IZDkg2Pg


Not nearly as bad as the English pronouncing, say, the name Cholmondeley :-)


I would expect that spelling bees would select words that are not phonetically spelled. This selection bias does not imply that English does not have productive pronunciation rules.


True, in that spelling bees will select for harder words.

But the fact that such words exist, in such large quantities that memorizing them all is so challenging that this becomes a competitive sport, is why English is so impossible.


Dutch, which has a pretty reasonable sound-to-orthography mapping (some exceptions of course, but not all that many) also has spelling bees. Often won by the Belgians.


> Not really. There's no way to guess how many English words are pronounced based on the written form, unless you've heard them before. And of course the pronunciation may vary wildly based on region/country as well.

> The most telling evidence of this is the existence of Spelling Bee competitions in English-language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.

That's two exact opposite things.

Languages for which you know how to pronounce a word just from its written form => you can have spelling bee competition there.

Languages for which you know how to write a word when you hear it pronounced => no spelling bee competition.

I'll take French as an example: if you see "o", "au", "eau" in a word you know how to pronounce it. There is one and only one way. But if you hear "o" in a word then good luck knowing how to write it. So you get dictées (spelling bees) even though you can easily guess how a written word sounds. The existence of spelling bee competitions in the English-speaking world is not proof that pronouncing a written English word is guesswork.
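The asymmetry between the two directions can be sketched in a few lines. This takes the "o" / "au" / "eau" claim at face value as a simplifying assumption (real French is messier), and the names `SPELLING_TO_SOUND` and `invert` are just hypothetical helpers for the illustration.

```python
from collections import defaultdict

# Assumption: simplified data following the claim above, not a full
# description of French. Each spelling maps to exactly one sound.
SPELLING_TO_SOUND = {"o": "o", "au": "o", "eau": "o"}


def invert(mapping):
    """Turn spelling -> sound into sound -> sorted list of candidate spellings."""
    inverse = defaultdict(list)
    for spelling, sound in mapping.items():
        inverse[sound].append(spelling)
    return {sound: sorted(spellings) for sound, spellings in inverse.items()}


SOUND_TO_SPELLINGS = invert(SPELLING_TO_SOUND)

# Reading: one lookup, one answer.
print(SPELLING_TO_SOUND["eau"])
# Writing: one sound, several candidate spellings, hence dictées.
print(SOUND_TO_SPELLINGS["o"])
```

Reading is a plain dictionary lookup, while writing has to pick among several candidates, which is why a dictée can be hard even when reading aloud is easy.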


As a Spaniard, I would say the most challenging part of English is the lack of consistency between how you write something and how you pronounce it.

Spanish is totally systematic in this sense, and once you can read it, you can pronounce it.

English is a bit messy in this regard, for whatever reason.


Portuguese and German are like that.

You’ve never seen the word before, but when reading it for the first time, you’ll probably pronounce it correctly.

English is awful, but French takes the crown on this one—though more because it has the same pronunciation for many different words and written forms.

In English, on the other hand, the alphabet doesn’t map well.

Mood and flood both have “oo”, yet each is pronounced differently. You need to know the word beforehand to know exactly how it’s pronounced.


Or live and live, read and read (past participle), or castle (the t is silent), or bear and beard, where the ea is different.

I do not want to be offensive, and there are lots more, but the mapping is an amazing sh*tshow.


If you think castle is bad, wait till you hear forecastle (“fok-sul”)


At least that's often spelled "fo'c'sle" these days, which gives you a good idea of the actual pronunciation.

My personal favorite in English is "colonel" being pronounced the same as "kernel". Which is insane even from an etymological perspective because the word is a derivative of "column" (as in, a colonel is someone who commands/leads a column of soldiers).


Yes, an incredibly rare use of double apostrophes in English! More uncommonly you'll see bo's'n as well, for boatswain.


Hahaha. I was not aware of that one. Yes, it looks undecipherable.


A lot of nautical terms have unusual pronunciations. English sailors primarily came from coastal regions, and were very happy to have a lingo that was incomprehensible to the landsman. All of this carried over to North America as well.


It's just elision. "four-cassle" vs "fo'k'sul."


French is a lot less bad than English in this regard. In French you can usually (though not always) predict how a word is pronounced from its spelling, but not vice versa. In English, both directions are impossible.


French is not a good example. Pronunciation often deviates from spelling in French (e.g. many silent letters and inconsistent mappings).

Hungarian, however, is pronounced the way it is written, as its orthography is phonemic, whereas French and English have deep orthographies.

Serbian is of the perfectly phonemic type. "Write as you speak, read as it is written" is a common saying.


The silent letters are not the point - that's why the poster you replied to said it doesn't work speech->writing in French. But writing->speech is much, much more consistent than in English, even if the orthography itself is kinda criminal with all the silent letters and whatnot.


I am inclined to agree.


I've only really been exposed to French in music, where I've sung various French pieces over the years. But from my experience, at least French is consistent? As-written is as-pronounced.

Is this not really the case, and therefore is French also guilty of having the same vowels/consonants pronounced differently for completely arbitrary reasons?


Fritteuse.


My son's first year teacher said (I may have the numbers slightly wrong) that Spanish has 23 phonemes (sounds the mouth makes) and 23 graphemes (ways to write sounds). English, on the other hand, has 43 phonemes and over 500 graphemes.


Spanish is better than English, but it's nowhere near that regular. There are three different ways to pronounce "x", wild dialectal variations in "ll" and "c", etc.


The rules are very clear on when those are used though, so you are not really arguing the original point imo. What are the dialectal variations in "ll" and "c"?


(B2-ish Spanish learner here but) "ll" is pronounced in at least three variants that I know of: "y", "j", and something between "sh" and "ch". E.g. "llama" might be pronounced like (in English writing) "yama", "zhama", or "shama". The last one really threw me for a while; it's super common in Argentina at least.


I spent time in the "Rio de la Plata" area in the late 1970s, mainly Montevideo, and learned rioplatense Spanish, and would use the ZH sound as in "meaSure" for Y/LL letters in "playa" and "calle".

In the last 40 years, which I've spent mostly in the USA, I have rarely heard Uruguayan/Argentinian Spanish in person or in media, but I was surprised to hear Messi and others in recent interviews use SH as in "puSH" for the Y/LL. This apparently has been a generational shift in that area, first in Argentina and then Uruguay. I'd sound old-fashioned if I were to go back to Montevideo these days.


I see what you mean. I think you should stick to one form and learn the others by their differences, or you could quickly get lost.

"ll" in standard Spanish is a strong English "y".

However, in Argentinian Spanish from the Buenos Aires area (but not Córdoba in Argentina, which sounds more like Colombian Spanish) it is "sh": something in between "j" and the "sh" of "she", though the sound is a bit different.

Without being able to record some sound I cannot express it better but I am sure you can find something around. Javier Milei, the president, has such an accent.


AFAIK "ll" can also be the palatalized "l" sound in some dialects, i.e. in the same relationship to regular "l" as "ñ" is to "n". Indeed, this is the original pronunciation from which all others have diverged.


As has been stated many times in this thread, the rules are also very clear in English. They just aren't taught.


I think that must have been within one dialect. If you include all dialects of English (Scottish, Irish, Australian, Singaporean, Indian, American, etc. etc.) I'm sure you have a lot more than 43 phonemes.

In any case, her point wasn't to give a lecture on linguistics, but to impress upon the parents how complicated English really is to learn to read.


The dialects I can buy, but I think the x has only two pronunciations? It is a very regular language from the point of view of mapping the written form to sound.


x is pronounced four different ways in Spanish: like j in México, like the English “sh” in Xcaret, like s in xenofobia and like English “x” in extremo.

The first two are not productive now in normal Spanish words: they are only used in old spellings that have irregularly been retained, and in loanwords from indigenous languages. But they do exist.


Well, yes. I was speaking about standard Spanish from Spain.

Xenofobia is an s, yes, and excursión is "ks". In fact, Méjico was the traditional way to write México in Spanish from Spain until the other form was accepted a few years ago. I still write "Méjico" myself.


Since less than 10% of Spanish speakers are from Spain, there’s no reason to assume you were specifically talking about that one country when referring to the Spanish language in general.

And anyway, as you point out, even in Spain the form México is accepted now.


I thought it was perfectly reasonable to talk about Spanish from Spain the same way you talk about English from England.

After all, it is where they come from originally and have their own spelling (colour vs color, etc.)

An x in standard Spanish has always been the two sounds I told you, and that Mexican deviation is specific to Mexico.

Yes, it is over 100 million speakers but I was still assuming the root language in its original place as the reference. Sorry if I did not express it correctly.


I get your point, but FWIW, México is not a Mexican deviation; it's just an older Spanish spelling. E.g. Jiménez was once spelled Ximénez and there are probably lots of other examples.

The "root language spoken in its original place" absolutely did pronounce X like modern J.


True, I forgot that detail. Ximenez did exist in fact and I forgot that. So it must be that.


Dialects are different since they are still internally consistent.


> phonemes (sounds the mouth makes)

This isn't entirely correct. A distinct sound that the mouth makes is a "phone". A phoneme is almost always a group of several phones - allophones - that native language speakers perceive as a single sound. Another way to phrase it is that if you change one phoneme to another one, it makes a different word (possibly a non-existing one, but regardless the native speakers would consider it distinct), but changing from one phone to another doesn't change the word.

For example, in English, the phoneme /t/ has allophones [t], [tʰ], [ɾ], or [ʔ] depending on context. OTOH [ɾ] is a distinct phoneme in Spanish, and [ʔ] is a distinct phoneme in Arabic.

Unfortunately these two are often confused, so one should be careful with such counts and comparing them - it's not uncommon when people count phonemes in their native language, but phones in other languages (when those phones sound distinct to them).

This can also vary significantly from dialect to dialect, since one very common thing in language evolution is for two similar phonemes to collapse into a single one while retaining the original distinction as allophones. For English, in particular, the number of phonemes varies a lot between American and British English (with the latter having more distinctions).


Spanish "maps" very nicely but even Spanish isn't exactly 1:1

- /k/ can be written both c and qu, and k where it occasionally appears in the language (e.g. kilo) - and the u in qu is silent.

- /s/ can be written c, s, and z, though stress rules are different for c and z.

- r and rr are distinct sounds but r = rr at the beginning of words, I think.

- At least in Mexican Spanish: The "ua" sound can be spelled ua or oa (e.g. Michoacan, Oaxaca) - and also the breathy sound of j can also be written with an x.

- d has a sound a little like English voiced-th at the end of words (e.g. juventud)


Not an issue with /s/ in Iberian Spanish. Once you have 'distinción', most of the orthography errors I've read overseas plummet.


qu: the u is always silent and qu is followed by i or e. It is still a systematic way of reading. It is like gue and gui: you pronounce the "ge" as in "singer", and the u is silent. If you want to pronounce the u, as in pingüino, you add the diaeresis.

The stress rules, to the best of my knowledge, are very systematic (not 100%, but I would say "almost", at least for the words in use). Even the stress rules are very uniform.

> r and rr are distinct sounds but r = rr at the beginning of words, I think.

This is still systematic reading. At the start of a word it is the strong one, yes. And when it is preceded by a consonant, such as in "enredar" (that is strong r). There is no exception of any kind here.

> d has a sound a little like English voiced-th at the end of words (e.g. juventud)

That is some dialects in some areas. We pronounce a clean d at the end in my area (around Valencia). It is also the correct, standard way to do it in Spanish. The other is a deviation found in León, for example.
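
To make the "systematic reading" point concrete, here is a minimal Python sketch of just the rules mentioned in this thread (qu, gue/gui, güe/güi, g and c before e/i, z, and strong r). The function name `read_spanish`, the `distincion` flag and the informal sound labels ("k", "rr", "th", ...) are made up for illustration, not a real transcription scheme. It also assumes "preceded by a consonant" means n, l or s for the strong r, and it passes every other letter through unchanged, so it is nowhere near a full set of rules.

```python
FRONT_VOWELS = set("eéií")

def read_spanish(word, distincion=False):
    """Return rough sound labels for a Spanish word (toy rules only)."""
    w = word.lower()
    soft = "th" if distincion else "s"   # soft c and z: distinción vs seseo
    out = []
    i = 0
    while i < len(w):
        ch = w[i]
        nxt = w[i + 1] if i + 1 < len(w) else ""
        nxt2 = w[i + 2] if i + 2 < len(w) else ""
        if ch == "q" and nxt == "u":
            out.append("k")              # qu: the u is always silent
            i += 2
        elif ch == "g" and nxt == "u" and nxt2 in FRONT_VOWELS:
            out.append("g")              # gue/gui: hard g, silent u
            i += 2
        elif ch == "g" and nxt == "ü":
            out += ["g", "w"]            # güe/güi: the diaeresis restores the u
            i += 2
        elif ch == "g" and nxt in FRONT_VOWELS:
            out.append("j")              # g before e/i sounds like j
            i += 1
        elif ch == "r" and nxt == "r":
            out.append("rr")             # rr is always the strong r
            i += 2
        elif ch == "r" and (i == 0 or w[i - 1] in "nls"):
            out.append("rr")             # strong r word-initially or after n/l/s (assumption)
            i += 1
        elif ch == "c" and nxt == "h":
            out.append("ch")             # keep the ch digraph together
            i += 2
        elif ch == "c":
            out.append(soft if nxt in FRONT_VOWELS else "k")
            i += 1
        elif ch == "z":
            out.append(soft)
            i += 1
        else:
            out.append(ch)               # everything else passes through unchanged
            i += 1
    return out

print(read_spanish("enredar"))
print(read_spanish("pingüino"))
```

The point of the sketch is only that each branch looks at the letter and its immediate neighbours, never at the identity of the word, which is what "systematic" means here.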


Yes, I'll always remember the long time spent asking for the whereabouts of Ocean Drive, mispronounced by me because the correct pronunciation would require the word to be written as Oshean or maybe Oshan. It was 1995. I have had very few occasions to hear native speakers. A lot of people and I were figuring out plausible but incorrect pronunciations by applying the most usual pronunciation rules to the written words.


It's just a soft c? Other languages have that too. English has soft c's after i, e and y. They probably didn't teach you that rule. It's true that most English speakers pronounce soft c's as a "sh" sound these days, but it wasn't always the case.


If it were just a soft C, then "ocean" would sound more like "oh-see-in" or "oce-yin". But it's also been palatalized to sound like "oshin" in typical pronunciation. People might not have understood them because they didn't know this.


> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.

IMHO a purely phonemic orthography makes the orthography unnecessarily complex, as there are language features like assimilation[1] that happen naturally in spoken form but do not make sense in written form.

In contrast, a morphophonemic orthography keeps a systematic and consistent mapping between spoken and written form for individual morphemes, but not necessarily for words, as in written form morphemes are just concatenated (to make words), while in spoken form there may be complex interactions.

[1] https://en.wikipedia.org/wiki/Assimilation_(linguistics)


It's not so strict, but we try most of the time to keep it consistent. For example, here in Buenos Aires we almost don't say the "d" at the end of the word, like in "ciudad" (city); in some pronunciation guides I saw it written with a tiny d.

If the variant gets too popular, the two versions become the official spelling. For example, "septiembre" and "setiembre" (September) are both correct. I hate the second one and I never use it, but it's popular somewhere. After many years, sometimes the old spelling disappears and is marked as archaic.


An orthography that surfaces (non-phonemic) assimilation would be phonetic rather than phonemic. For example, many languages assimilate "n" to "m" before "b", but the phoneme is still /n/, and native speakers are often not even aware that this assimilation occurs (which is what indicates that it's still the same phoneme).


Strictly speaking, Spanish has the same sound for v and b, unlike other Romance languages. The same goes for g and j when followed by e or i.


This is true, but Spanish orthography isn't completely phonemic (and is simpler for it). It is very shallow and very consistent, but it doesn't spell out things like assimilation differences; people are just wrong to describe it as completely phonemic.


As I understand it, English is actually 3-5 other languages in a trench coat.


This often gets trotted out, but it's not really true. English is a solidly Germanic language, which merely happened to lose the core attribute of Indo-European languages (extensive verb inflection), and in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words. So 'technology' instead of 'craftlearn' or 'television' instead of 'farsight'.

Even among major languages, English isn't anywhere near the worst offender of copulating with other languages for features--it never really adopted foreign grammar, the way you see with, e.g., Turkic languages.


Solidly Germanic with an absurd amount of French, down to nearly identical spelling for many common words. I’m not talking about cognates but actually 100% the same spelling and meaning and they’re often not from some recent century but from old French.

I’m sure you have a solid basis for saying this but it’s basically impossible to write many sentences without by accident using French down to the original spelling.

I was going to highlight all the examples I used by accident myself in this post but I gave up because the links were making it too long.

This is why something like Anglish even exists https://en.wikipedia.org/wiki/Linguistic_purism_in_English


I believe this is because England was conquered by the Normans (French speakers). I think it was within the last 100 years or so that the English aristocracy finally stopped speaking French among themselves.


> in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words

Note that the prevalence of native words in German is the result of a modern reform movement, not something that happened naturally within the language.

> [English] never really adopted foreign grammar

There's the argument that do-support is borrowed from Celtic.


As I understand it, English at its core is a Germanic language that underwent significant creolization with Scandinavian sources. That core then acquired a significant amount of Old French and Latin vocabulary, particularly in upper-class terminology.

The creolization is why English has a relatively simple grammar, and all the word sources are why we have like 16-20 vowel sounds trying to cram into Latin characters.


> English has a relatively simple grammar

You mean "relatively simple morphology". English phonology and syntax are not simple at all (e.g. lots of information carried by word order).


Let's not downplay the influence that the French language had on English.


There's a really good podcast [1] that dives into the background of English. It starts off even further back, talking about PIE and how that affected all the earlier languages of the region. And then starts tying the pieces together on how English was formed.

[1] https://historyofenglishpodcast.com


At least 3:

~26% Germanic

~29% Latin

~29% French

~16% Other

RobWords covers this really well: https://youtu.be/PCE4C9GvqI0?si=4Wd6NFus4v1YqmC3


That might be true if you just count up every word in the dictionary by origin. However if you weight the words by frequency, Germanic will be way higher. That is, if you take a transcript of an average conversation in English, the proportion of words inherited from Old English (i.e., Germanic) will be much higher than 26%.
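
A toy Python sketch of the difference between counting dictionary entries and weighting by frequency. The word list, the origin labels and the sample sentence are all hand-picked for illustration (they are not real corpus data), and `origin_shares` is just a name made up for this example.

```python
from collections import Counter

# Hypothetical, hand-labeled mini "dictionary"; rough labels, illustrative only.
ORIGIN = {
    "the": "Germanic", "is": "Germanic", "in": "Germanic",
    "and": "Germanic", "house": "Germanic",
    "government": "French", "justice": "French", "language": "French",
    "technology": "Latin/Greek", "television": "Latin/Greek", "phoneme": "Latin/Greek",
}

def origin_shares(words, origin):
    """Share of each origin among the given words (repeats are counted)."""
    counts = Counter(origin[w] for w in words)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Every dictionary entry counted once, however rarely it is used.
by_entry = origin_shares(ORIGIN.keys(), ORIGIN)

# Every occurrence in running text counted, so common words dominate.
sample = "the language is in the house and the government is in the house".split()
by_usage = origin_shares(sample, ORIGIN)

print(round(by_entry["Germanic"], 2), round(by_usage["Germanic"], 2))
```

With this made-up data the Germanic share is under half when each entry counts once and well over three quarters when each occurrence counts, which is the shape of the argument above; the actual percentages for English would need a real etymological dictionary and corpus.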


Almost all the most used words in English are Germanic. Latin in particular is overrepresented because of scientific and technical terms which are rarely used.


It seems hard to measure with any kind of objectivity here, considering how much Latin is in French (and even modern German) as well.


Blame the Normans for that one...well English was already kind of a mess, but the conquest of England by the Normans really sealed the deal.


> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.

Is there no accent variation in Spanish?

Such a 1:1 system would never work in English, because the way words are pronounced can be very different in, for example, Melbourne, Newcastle-upon-Tyne and Boston.


One of the problems in English (not the only one, but one of them) is that for the vowels there are 5 graphs (is this term correct? Sorry, but I hope it is understandable) but many more sounds. In Spanish there are 5 vowel letters in the Latin alphabet and exactly five sounds and nothing else.

Valencian has 7 sounds though, two for e and two for o. Similarly, Catalan also (and in some circumstances the o sounds as u, when the stress is not in it and other stuff). But they still have quite strict rules.


Yeah but we represent a lot of vowel sounds by combining vowels - 5 letters (not including y), if we allow any combo of two to represent a different sound that's 25 combos, and if we remember that preceding and following consonants can modify vowels too (though, dough, caught bought vs thou, bao, sour, or; on, con, Ron vs how, cow, ow) that's quite a lot of combos.

Now, you can (and should!) accuse me of cherry-picking examples, since the rules are less consistent and/or vastly more complicated than what I represented. But I maintain that there are orders of magnitude more ways to represent vowel sounds than 5, and the clue is the context. Not, as many will suggest, memorizing each individual case (though there's certainly plenty of that going around, much like Spanish's infamously irregular verb conjugations), but understanding categories and families and patterns.

English sounds are usually best understood in groups of three letters, rather than one letter at a time. If you look at throuples, you'll likely find far more of that consistency we all so deeply desire.


Yes, English is VERY consistent. The problem is that there are multiple systems working inside English vocabulary, so you have to get familiar with more than one rule set.


You're right to point out that English pronunciation varies widely across regions, but that doesn't fully negate the value of a systematic orthography. What germandiago is referring to is the relationship between graphemes (letters) and phonemes (sounds). Spanish has a highly phonemic orthography, meaning the rules for converting letters to sounds (and vice versa) are consistent and predictable. Yes, there are accentual and dialectal variations within Spanish (e.g. seseo in Latin America vs. ceceo in parts of Andalusia) but these are largely phonological shifts applied systematically, not random deviations from spelling norms.

In contrast, English has a deep orthography, where historical layers (e.g. Norman French, Old Norse, Latin borrowings) and sound changes (like the Great Vowel Shift) have led to a chaotic mapping between spelling and pronunciation. A consistent system wouldn't eliminate dialectal variation, but it could reduce ambiguity and aid literacy, as evidenced by languages like Finnish or Korean.


I don't know if Korean is ultimately that good. Hangeul are a monstrous improvement over the old mixed script (which itself is better than the Japanese iteration because the Koreans only used Chinese characters for Chinese loans), but it still has a lot of sound change rules and can be a bit of a pain to read because of how letters flow to the next syllable. It's not in the same league with Finnish or Spanish, at any rate.


Yeah there are multiple accents in Spanish, but each accent is still a 1:1 mapping from written word to pronunciation, there's no enough/through/dough nonsense.


For example for a small car ("auto") you say and write:

In Argentina: "autito"

In Colombia: "autico"

In Spain: "autillo"

the same rule applies for all words, not only for cars.


In Spain you'll hear all three, and all of them are perfectly valid.

-ito is almost the universal form everywhere in the Hispanic world.

-ico is widely used in the south of Navarre and in Aragón, and everyone will understand you. Heck, it's the diminutive form used by the hick people, and thus it's uber known, although you might look like a bumfuck village redneck shepherd with a beret by using -ico outside of Navarre/Aragón.

-illo is more from the South but, again, understood everywhere.


In Argentina everyone will understand you, but if you don't use "ito" then people may ask where you are from.

"ico" is used in many countries of Central America and the Caribbean. I asked someone from Colombia, so I'm sure about Colombia, but I'm not sure about every other country.

Is "illo" used in Madrid? I think I heard it in movies or TV programs from Spain.


Yes, it's used, all over the whole country.


The explanation you gave is already contained in the cited Wikipedia article. I think this "ghoti" example is more of a tongue-in-cheek mocking of pronunciation inconsistencies. If you want a jarring example, consider laughter and slaughter. I know, I know, they have different origins, but still, it confuses foreigners like me while learning the language.


But English orthography isn't meant to serve foreigners.

I'm ESL, and I struggled with English spelling as much as the next Latin speaker who's already learned to read and write in foreigner.

But now that I get the reason behind it, I love it. I consider English orthography worthy of UNESCO protection, even. In fact, I am annoyed at the regular spelling of my two Latin languages that have left so much history behind.


English Orthography doesn't exactly serve native speakers either.


It’s fairly good at helping us understand the etymology. Have a “y” acting as a vowel in the spelling? Good chance it’s Greek. Have a “k”? Almost certainly not Latin.


That is trivia that is useless in almost all contexts. I've been a native English speaker all my life and this is the first I've heard of that. I can't think of any situation in life where knowing that fact would have been helpful. Your claim seems reasonable, but if someone says you are wrong I wouldn't fact check it even if clear links were posted so that I could.


If you’re seeing a word for the first time, it is pretty useful - partly with pronunciation but definitely with meaning.

You do have to have some familiarity with the source languages, but if it’s an unfamiliar but nativized word, those are almost always ultimately Latin or Greek.


If you're seeing the word for the first time and need to figure out how to pronounce it, how would you know that “y” is acting as a vowel and not as a consonant in the first place?


If it's followed by a vowel, it's likely a Germanic word: yule, your, young, yellow (and you probably know the word, since our core vocabulary is still mostly Germanic). If it's at the end or between consonants, like syllabary or ontogeny, probably Greek.

You might also just happen to know a smattering (or even a lot) of Greek and Latin.


I'm a materials scientist and I use etymology every day.

Knowing etymology is an easy way to memorize things.


But if you had known it (aka, if anyone had taught it to you), it wouldn't be useless, as you would know the context and how to pronounce it...not to mention the meaning behind it


For sure, but I'm not sure this is the primary purpose of a writing system.

By all means, it's a fun aspect of English that you can look at a word and guess the origin, and it's pretty satisfying to pull it up on Google and see you're right because you looked at the spelling. This novelty has come at the expense of many other things that would have increased its utility.

I guess I'll add one thing to the list of benefits: this has probably resulted in different dialects of English writing things the same way despite saying them differently. Singaporean English is very different from Scottish English, but the written form of the same statements is for the most part decipherable by the other dialect.


> But English orthography isn't meant to serve foreigners.

Or natives. It is slower for children to learn to read English than other languages.


Teaching my toddler to read now and I definitely feel like if we spoke Spanish my work would be done already.


Probably not. Toddlers generally don't have the brain to learn any reading. Spanish's advantage in reading isn't how young you can start learning to read, it is how soon you can stop being taught to read. Reading takes about 5 years to learn in Spanish schools, English 6, and Japanese 9; after that much training kids are finally considered able to read anything. (Sometimes we talk about college-level reading, but that is more about mastery of topic-specific vocabulary: doctors, lawyers, and engineers each have special vocabulary that needs extra training to read, and they cannot read each other's technical papers.)


I learned how to read in six months.

My kid took two and a half years.

The Chinese take 10 years.

So what? Are the Chinese terribly educated?


English is not a phonetic language and it also lacks accents.

Saying it has pronunciation rules is a stretch. You have conventions.

In languages like Spanish, if you read a word, it is very hard to mispronounce it.


No, for the millionth time, English has rules, NOT conventions; you just need to know the historical context behind the multiple rules.


> Turning 'ti' to /ʃ/ is a fairly normal affricatization

It can't be an affrication, because /ʃ/ is not an affricate. (Although /tj/ is affricated, as /tʃ/ [think "gotcha"] - when you say 'ti', you're referring to words that were pronounced with /s/ rather than /t/.)

Wouldn't /sj/ -> /ʃ/ usually just be called "palatalization"?

(The specific phenomenon in the context of English appears to be called "yod-coalescence".)


I want to know who thought that Chinese transliterated into "English characters" should use a whole bunch of q's, x's and z's to represent sounds in a way that no English word does.

Why is Zhou pronounced that way?!


Pinyin was written by Chinese speakers for Chinese speakers. There are other romanizations written by westerners, and these are easier to see where the sounds come from; e.g., "tsai" rather than "cai".

What use is "q" as a letter at all in English? It makes a "k" sound and always occurs with a "u" after it. Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)

"C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?

As for "zhou" -- in English, z is very similar to an s, but voiced. So in pinyin, zh is just like ch, but voiced.

Lots of languages do this BTW. When people from Wycliffe want to translate a Bible into an obscure language without a writing system, they first have to invent a writing system. They could invent all new characters, but why? All it would do is make that language hard to type. So they take the sounds that language has, and map them onto Latin characters. Sometimes there's an obvious mapping, sometimes not.

Look up Welsh's spelling for another example of this.


> Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)

What are you thinking of? There is no difference between those things.

But your major point here is correct; on the fundamentals there is no reason for the English alphabet to feature a Q.

> "C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?

With the modern alphabet there's no reason for a C either. However, the answer to "why not use it for the 'ts' sound" is pretty obvious - that sound isn't part of the English phonemic inventory. It occurs, but that is almost always just a result of what is supposed to be a bare /t/ being followed by /s/ for grammatical reasons. (For an example of the general feeling here, note that an English word cannot start with /ts/ at all.) Why would we use any letter to represent the "ts" sound? We represent it the same way it exists in our language, as a sequence of two unrelated sounds.

> So in pinyin, zh is just like ch, but voiced.

Technically the only voiced consonants in pinyin are m / n / ng / l / r. I think a voicing contrast was present in Middle Chinese, and there's one today in Shanghainese and presumably other Wu dialects, but not in Mandarin.


> What are you thinking of? There is no difference between those things.

I'm talking about pinyin here. In Mandarin, there are two distinct sounds, one represented in pinyin by 'q', and one by 'ch'. It took me months to hear the difference, and months more to be able to pronounce them properly. I think there are other romanizations where the 'q' sound is represented "tch".

(In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.)

> Technically the only voiced consonants in pinyin are m / n / ng / l / r.

I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.

How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.

[1] https://en.wikipedia.org/wiki/Plosive#Voice


> In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.

There aren't.

> I think there are other romanizations where the 'q' sound is represented "tch".

Well, maybe; there are a large number of romanizations of Mandarin. But there are no significant romanizations where that is true. It's q in pinyin, ch' in Wade-Giles, and ts' or k' in postal romanization.

> How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.

You could read my other comment in the thread. qu and chou are aspirated; ju and zhou aren't. Your vocal cords don't turn on at different points for those syllables. Mandarin Chinese doesn't use voicing contrasts.

> I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.

Yes, I know what voicing is. You don't seem to know what consonants are used in Mandarin.

Compare https://en.wikipedia.org/wiki/Standard_Chinese_phonology#Con... .


> qu and chou are aspirated; ju and zhou aren't. ...Compare [ref]

So the idea here is that chou and zhou are related in a similar way that the t's in "top" and "stop" are related: your mouth and vocal cords are doing the same thing, but in one case you have the puff of air and the other you don't.

At any rate, going back to the original question: the logic behind the choice is still consistent. On this classification, in Mandarin, p and t and ch are aspirated, and in English p and t and ch are voiceless; b and d and j and zh are unaspirated, and in English b and d and j and z are voiced. (And q is mainly thrown in to fill the gap, but its pronunciation in English is voiceless as well.)

Or, to explicitly quote from the ref you shared:

> Such pairs [of aspirated and unaspirated plosives and fricatives] are represented in the pinyin system mostly using letters which in Romance languages generally denote voiceless/voiced pairs (for example [p] and [b]).
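
As a rough illustration of how regular that choice is, here is a small Python sketch that flips a pinyin syllable between the aspirated and unaspirated member of each pair. The pair list is the standard set of pinyin initials (p/b, t/d, k/g, q/j, ch/zh, c/z); the function name `flip_aspiration` is made up for this example, and tones and finals are simply ignored.

```python
# (aspirated, unaspirated) pinyin initials; spelled with letters that
# mark voiceless/voiced pairs in many European languages.
PAIRS = [("p", "b"), ("t", "d"), ("k", "g"), ("q", "j"), ("ch", "zh"), ("c", "z")]

# Build a two-way lookup so either member of a pair maps to the other.
SWAP = {}
for aspirated, plain in PAIRS:
    SWAP[aspirated] = plain
    SWAP[plain] = aspirated

def flip_aspiration(syllable):
    """Swap the initial for its aspirated/unaspirated partner, if it has one."""
    # Try two-letter initials first so "zh"/"ch" are not mistaken for "z"/"c".
    for length in (2, 1):
        initial = syllable[:length]
        if initial in SWAP:
            return SWAP[initial] + syllable[length:]
    return syllable  # initials like m, n, l have no partner

for s in ["chou", "zhou", "qu", "ju", "ma"]:
    print(s, "->", flip_aspiration(s))
```

The fact that one six-row table covers every case is the consistency being argued for here, whatever one thinks of borrowing the voiced letters for unaspirated sounds.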


Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction. In the former case unvoiced consonants often have aspirated allophones as in English, and in the latter case unaspirated consonants often have voiced allophones especially between vowels, as in Chinese or Korean. Hence why it makes sense to map the two in this manner - if your native language uses aspiration as the primary feature, and you hear someone who uses voicing, your brain will generally map it "automatically" for you, and their speech will sound weird but understandable.

(But then you get Hindi with a four-way distinction, both voiced/unvoiced and aspirated/unaspirated in all possible combinations.)


> But then you get Hindi with a four-way distinction, both voiced/unvoiced and aspirated/unaspirated in all possible combinations.

They're spelled that way; I don't think they're supposed to be pronounced that way.

https://en.wikipedia.org/wiki/Aspirated_consonant#Voiced_con...

>> True aspirated voiced consonants, as opposed to murmured (breathy-voice) consonants such as the [bʱ], [dʱ], [ɡʱ] that are common among the languages of India, are extremely rare.

> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.

My understanding is that all of these options are fairly common:

- two-way contrast between aspirated and unaspirated

- two-way contrast between voiced and voiceless

- three-way contrast between voiceless aspirated, voiceless, and voiced

- three-way contrast for labial and alveolar stops; two-way contrast for velar stops


> They're spelled that way; I don't think they're supposed to be pronounced that way.

True, but most languages don't distinguish between [h] and [ɦ] to begin with, with one often the allophone of the other. So listening to Hindi it sounds like the same thing, more or less.


> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.

Yes, that makes sense -- I certainly learned something from this conversation. It makes sense that speakers would naturally tend to classify things along different lines, and in Chinese the aspirated / unaspirated classification makes sense.

That said, after having had some time to sit with the proposition that 'j' in the English name "Joe" is voiced, and the "zh" in the Chinese word "zhou" is unvoiced, it continues to seem obviously false to me. It seems very much to me like mistaking the map for the territory [1].

[1] https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation


When you're listening to it, you are hearing the phone. That may well be voiced, even if the underlying phoneme is unvoiced.

To determine the true nature of the phoneme in a given language, you need to "flip the bit" on voicing (importantly: without adding/removing aspiration!) and see whether native speakers will treat it as different or not.
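
A minimal sketch of that "flip the bit" test, under loud assumptions: the feature model and both toy inventories below are simplifications made up for illustration (stops only, three places of articulation), and a set lookup stands in for what is really a judgment by native speakers.

```python
from typing import NamedTuple

class Stop(NamedTuple):
    place: str
    voiced: bool
    aspirated: bool

PLACES = ("labial", "alveolar", "velar")

# Assumption: English stops modeled as a plain voiced/voiceless contrast
# (aspiration is treated as allophonic and left out of the phoneme set).
ENGLISH = {Stop(p, v, False) for p in PLACES for v in (True, False)}

# Assumption: Mandarin stops modeled as all voiceless, contrasting only
# in aspiration.
MANDARIN = {Stop(p, False, a) for p in PLACES for a in (True, False)}

def voicing_is_contrastive(stops):
    """True if flipping only voicing on some stop gives another stop in the set."""
    return any(s._replace(voiced=not s.voiced) in stops for s in stops)

def aspiration_is_contrastive(stops):
    """True if flipping only aspiration on some stop gives another stop in the set."""
    return any(s._replace(aspirated=not s.aspirated) in stops for s in stops)

for name, inv in (("English", ENGLISH), ("Mandarin", MANDARIN)):
    print(name, voicing_is_contrastive(inv), aspiration_is_contrastive(inv))
```

In this toy model only one feature is flipped at a time, which is the "without adding/removing aspiration" condition.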


The article you linked to specifically says there are only voiceless plosives in Mandarin!


And, you'll notice I pointed out English voiced plosives. :-D


It's best not to think of Hanyu Pinyin as using "English characters" to pronounce Mandarin. It's just a mapping of the initial, medial, and final sounds onto the Latin alphabet in a consistent way, so that once you know the mapping, you know the pronunciation right away, and more practically, you can _type_ it right away.

https://en.wikipedia.org/wiki/Pinyin


I used to always think these romanization schemes were really bad, until I realized they were just not for me. The ease of sight-reading and getting the correct pronunciation for a random English speaker is not the goal. It's primarily for the convenience of users of other languages to have a systematic encoding. To make it pronunciation-friendly you would have to add a bunch of complexity to the mapping that would compromise its usage by the real audience.


A few plausible answers to that:

In general, it's not transliteration into English characters, it's transliteration into the Latin alphabet. That means that transliteration tends to be shared across the various European languages that use the Latin alphabet. And given that the English were one of the last powers to actually engage in the naval trade war, they're less likely to be the basis of a major transliteration effort.

In the case of the q and x, I believe it comes from 500-year old Portuguese.


> That means that transliteration tends to be shared across the various European languages that use the Latin alphabet

Not just European languages. Pinyin is useful for everyone that has to interact with Chinese words, whether their first language is English, French, Swahili, or even Mandarin.

A lot of people might not realize that the primary users of Pinyin are Chinese people. The way typing Chinese works is that you type the pronunciation in Pinyin and then a box pops up with choices of characters from which you select the correct one. It's also used in dictionaries to give the pronunciation of unfamiliar characters.
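
A toy sketch of that typing flow, assuming a tiny hand-written candidate table (`CANDIDATES`, `lookup`, and `choose` are hypothetical names; real input methods use large dictionaries and rank candidates by frequency and context):

```python
# Hypothetical, tiny candidate table keyed by toneless Pinyin.
CANDIDATES = {
    "ma": ["妈", "马", "吗", "骂", "麻"],
    "zhong": ["中", "种", "重"],
    "guo": ["国", "过", "果"],
}

def lookup(pinyin: str) -> list[str]:
    """Return the characters offered for a typed Pinyin syllable."""
    return CANDIDATES.get(pinyin.lower(), [])

def choose(pinyin: str, index: int) -> str:
    """Pick one candidate, as a user would from the pop-up box."""
    options = lookup(pinyin)
    if not 0 <= index < len(options):
        raise IndexError(f"no candidate {index} for {pinyin!r}")
    return options[index]

# Typing "zhong" then "guo" and taking the first candidate each time.
word = choose("zhong", 0) + choose("guo", 0)
print(word)  # 中国
```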


Your first question, who thought of the system, has a straight answer. From Wikipedia:

> Hanyu Pinyin was designed by a group of mostly Chinese linguists, including Wang Li, Lu Zhiwei, Li Jinxi, Luo Changpei, as well as Zhou Youguang (1906–2017), an economist by trade, as part of a Chinese government project in the 1950s.

By the way, they are not “English” characters; they are Latin/Roman characters, and used in a huge number of languages with different spelling conventions. Pinyin was created for the entire world to use, not specifically English speakers.


Who called them "english characters"?

Pinyin uses s in a very common way, z in the way of Italian, and c more or less in the manner of various Slavic languages. They are a sequence of related sounds: s is the fricative, z is affricated, and c is both affricated and aspirated.

Sh, zh, and ch are a sequence of sounds related to s, z, and c. Sh is a fricative articulated farther back in the mouth, zh is its affricated form, and ch is both affricated and aspirated.

And as a bonus, sh and ch match English usage, which isn't likely to have been a primary concern.

It's also worth noting that for many Chinese speakers, there is no difference between s/sh, z/zh, or c/ch.

(x, j, and q are what you get if you use the middle of your tongue, instead of the tip, to pronounce sh/zh/ch. They occur before front vowels; sh/zh/ch only appear before back (or central) vowels.)
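
The three parallel series can be laid out as a small table. This is only a sketch: the IPA values and place labels are the standard textbook ones rather than anything stated above, and the names (`Initial`, `INITIALS`, `series`, `describe`) are made up for illustration.

```python
from typing import NamedTuple

class Initial(NamedTuple):
    ipa: str
    place: str
    affricate: bool
    aspirated: bool

# Pinyin sibilant initials: fricative, affricate, aspirated affricate
# at each of three places of articulation.
INITIALS = {
    "s":  Initial("s",   "alveolar",        False, False),
    "z":  Initial("ts",  "alveolar",        True,  False),
    "c":  Initial("tsʰ", "alveolar",        True,  True),
    "sh": Initial("ʂ",   "retroflex",       False, False),
    "zh": Initial("ʈʂ",  "retroflex",       True,  False),
    "ch": Initial("ʈʂʰ", "retroflex",       True,  True),
    "x":  Initial("ɕ",   "alveolo-palatal", False, False),
    "j":  Initial("tɕ",  "alveolo-palatal", True,  False),
    "q":  Initial("tɕʰ", "alveolo-palatal", True,  True),
}

def series(place: str) -> list[str]:
    """Letters at one place, ordered fricative, affricate, aspirated affricate."""
    members = [(k, v) for k, v in INITIALS.items() if v.place == place]
    members.sort(key=lambda kv: (kv[1].affricate, kv[1].aspirated))
    return [k for k, _ in members]

def describe(letter: str) -> str:
    ini = INITIALS[letter]
    manner = "affricate" if ini.affricate else "fricative"
    if ini.aspirated:
        manner = "aspirated " + manner
    return f"{letter} [{ini.ipa}]: {ini.place} {manner}"

for place in ("alveolar", "retroflex", "alveolo-palatal"):
    print(place, series(place))
```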

A friend of mine remarked to me once that when she was in school, her teacher informed the class that English speakers would not understand what the pinyin letter "q" was supposed to mean, which I immediately confirmed. She thought this was hilarious.


Well that is a good point. For some reason I just assumed that pinyin was specific to english and that other languages used different transliteration schemes.


The English are definitely characters.


How would you spell that sound in a way that is consistently recognized?

"zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].

"c" is used the way pinyin uses it in many languages (e.g. pretty much all Slavic ones that use the Latin alphabet, for starters).

"x" and "q" are more questionable, but there's precedent for either in languages using Latin-based alphabets - "x" can be [ʃ] in Spanish, for example, and "q" is [c͡ç] in Albanian.


> "zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].

Note that the sound [ʒ] is common in Mandarin, but its pinyin spelling is "r". "zh" isn't voiced and is affricated.


Wait till you get a load of Tamil/Malayalam transliterations’ use of “zh”. It was proposed by some German linguist to represent a really retroflex “r” and now makes outsiders pronounce kozhikode as “cozy-code” instead of a closer “korikode”


You're correct on the reasons why "ghoti" cannot be pronounced like "fish," but what your explanation illustrates is that the mapping from English spelling to pronunciation is extremely nuanced - needlessly so.

A more direct phonetic writing system, like many other languages have, would make it much easier to learn how to read and write English.


Agreed, a far better critique of English spelling:

https://people.cs.georgetown.edu/nschneid/cosc272/f17/a1/cha...


> Even turning the 'o' into /ɪ/ relies on fairly regular vowel destressing

Isn't the "o" in "women" stressed?


It depends on your accent; within a couple steps of me I can find someone that pronounces it “Wimmen”, “Wuhmen”, or “Woemen”.


English is a particularly challenging language to spell with. How many languages have a national spelling competition?


As a non native, it still bothers me how "toward" is pronounced, "toord", really?


That's just one accent. Most accents pronounce that W (especially outside the US).


come to texas and experience a whole universe of diphthongs (one of which remedies this)


‘W’ started out as a long ‘U’ so it’s not unreasonable


It was spelled as a double U originally (hence the name), but that doesn't mean that it was pronounced as a long U! It was always an approximant.


It's not, unless you're a yankee. They're going to hear you're a foreigner anyway, might as well speak Queen's English.


If you think that’s crazy, consider that “English” is the only word in the English language that spells the /ɪŋ/ (‘ing’) sound as “eng”.


Eh, not really.

"engage", "engorge", "engrave", "engross", "engulf" are all fairly common words that are either often or exclusively pronounced that way (some dictionaries might show /in-g/, but /n/ is really /ŋ/ before g or k, even if they remain). Since these can take prefixes, this also proves we're not limited to being at the start of a word. Searching for words that can be spelled with "ing" or "eng" finds a few more but nothing super interesting (though a few are in the middle of a word).

Obviously words where "g" is pronounced /dʒ/ (like "j" for those who can't read IPA) aren't subject to this.
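
For what it's worth, the spelling search mentioned above could be sketched roughly like this. The word list is a small hypothetical sample (a real search would load a full dictionary file), and the check is purely about spelling; it says nothing about how anyone pronounces these words.

```python
# Hypothetical sample; swap in a real word list to run an actual search.
WORDS = {
    "engrain", "ingrain", "engulf", "ingulf",
    "engage", "english", "engrave", "ingot", "length",
}

def eng_ing_pairs(words):
    """Find words containing "eng" whose "ing" respelling is also in the list."""
    pairs = []
    for w in sorted(words):
        if "eng" in w:
            alt = w.replace("eng", "ing", 1)  # respell the first "eng" only
            if alt in words:
                pairs.append((w, alt))
    return pairs

print(eng_ing_pairs(WORDS))
```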


You might be right, but for what it’s worth I’ve literally never heard any of those words pronounced that way. I’ve only ever heard the word “English” start with the same sound as inside, while “engage” and your other examples start with the same sound as entertain.


While you're right, I feel like there's no safe argument to make here, because some group somewhere will pronounce some word in a certain way, so there can't really be a blanket rule.


- English - /ˈɪŋ(ɡ)lɪʃ/

- engage - /ɪnˈɡeɪd͡ʒ/, /ɛnˈɡeɪd͡ʒ/

- engorge - /ɪnˈɡɔːdʒ/

- engross - /ɪnˈɡɹəʊs/, /ɪŋˈɡɹəʊs/, /ɛnˈɡɹoʊs/, /ɛŋˈɡɹoʊs/

- engulf - /ɪŋˈɡʌlf/

According to Wiktionary only engulf and engross also use /ŋ/.


I've never heard engulf pronounced similarly to English


The stress is on a different syllable so it's kinda pointless to compare.


In my local (dialect and) accent all of these words have a pretty clear initial /ɛ/ and not /ɪ/. (But also: /ɪ/ usually contrasts strongly with /i/ here, but the sound before /ŋ/ is almost a third in-between vowel.)


Sorry, I was sloppy and wrote /i/ rather than trying to dig up how to enter the correct vowel when I was focused on the "ng".


Angland - Eng-land - Ing-land


As a native, "toward" is pronounced exactly like "to ward", but (usually) with the highly-unstressed vowel variant of "to". Remember that "w" is a semivowel, but it's not doing anything special here (at least in the vast majority of mainstream English dialects). In contexts where it is emphasized (or I suppose in more formal registers) it can be strengthened to merely the normal lack of stress.

English might make more sense if someone actually sat down and wrote out the real stress rules, rather than trying to cram everything into just "unstressed" and "stressed" and only caring within a word.

=====

"To" might be one of the syllables with the most possible stress levels, with at least 4 and possibly more. As I spell them,

1. "too" - full stress. Common for "two" and "too", but possible for "to" under rare circumstances.

2. "to" - less emphasized but still arguably stressed; still has the "proper" vowel. Usually this is as strong as "to" gets; "two" and "too" often fall down to this level if before a stressed syllable. Arguably this could be split into "stressed but near words with even more stress" and "unstressed but still enunciated" (which occurs even within a register).

3. "tah/tuh" - unstressed, the vowel mutates toward the schwa. Very common for "to", but forbidden in a few contexts. May be slightly merged into the previous syllable. Can we split this?

4. "t'" - very unstressed, the vowel has basically disappeared; may or may not remain a separate syllable from the one that follows (should that be split?).

The infinitive particle can't be 3 (normally 2, not sure if 1) if the following verb is implied (but not if the speech is cut off). At the start of the sentence it also can't be 3, and 1 is possible as seen below though 2 remains the default. Note that many common verbs act specially when before an infinitive particle; although sometimes treated as phrasal verbs it would be silly to treat them as taking a bare infinitive as their argument.

Adverbial particle "to" when the phrasal verb takes a direct object can be 2 or 3; this likely depends on the specific verb it's part of. Note that many people parse this as a preposition (taking a prepositional object), but this is technically incorrect (though there are some verbs where it really is unclear even when doing the rearrangement and translation/synonym tests).

Adverbial particle "to" when the phrasal verb does not have a direct object is usually 2 or even 1 (e.g. in the imperative). Some heretics have started calling this a preposition too (unfortunately, often in ESL contexts), but this should be avoided at all costs; they're just too cowardly to give particles the respect they deserve. Probably the only common example in modern English is "come to", but there are several others in jargon or archaic English.

Particle/preposition (the parsing is arguable) "to" used between numbers (range, ratio, exponentiation, time before the hour) tends to be 3, especially if one of the numbers is a "two". With variables it is slightly more likely to be 2.

Preposition "to" meaning "direction", or "contact", or "comparison/containment" tends to be 2, but can usually fall to 3 (less likely at the start of a sentence, and can also be prevented by what precedes it, e.g. "look to" can fall to 3 without much effort, but "looked to" strongly stays at 2). Contrast with "toward" of related meaning, which takes effort to get from 4 to 3.

Preposition "to" meaning "according to", "degree", or "target" (including but not limited to the explicit expression of an indirect object with most verbs, which we could argue should count as a particle instead. If you're wondering what verbs are excepted, one is "ask" - it can only use "of", as in "ask a question of him") is much more strongly 2, and requires significant effort to force it down to 3.

Adverb "to" is always 2 I think, but this is rare enough that I'm not sure.

=====

"To be or not to be", as famous as it is, has a pretty unusual stress pattern for most of its words: full stress on the first "to", semi-stress on the first "be", no stress (but still full length) on "or" (normal), full stress on "not", some stress on the second "to", and some stress on the second "be" (more than "to" but less than "not").


toward is pronounced exactly as it's spelt in Canada



