Language Learners' Library

bugcat · October 8, 2020, 3:10pm

Yeah, the former is more of, concerning and the latter is similar to.

Verrius · October 8, 2020, 5:16pm

The hard th- / soft th- variation is interesting at the beginning of words: the essential grammatical building blocks all have hard th- (the, this, that, there, therefore etc.), while the nouns, verbs and adjectives all have soft th- (throng, think, thrifty). I don’t have a dictionary here, but I vaguely think that wreathes and sheathes are the verbs (hard th) and wreaths and sheaths are the nouns. They’re not words you need to pronounce all that often.

bugcat · October 8, 2020, 5:18pm

Interesting, I think of the “hard” and “soft” th as being the other way around.

Verrius · October 8, 2020, 5:20pm

Ah sorry I was trying to take your terminology (because it was too much fuss to type the phonetic symbols) and obviously guessed the wrong way. In German it’s voiced and unvoiced I think (stimmhaft and stimmlos?).

bugcat · October 10, 2020, 11:22am

Recently, I broke the E key on my laptop (along with the Q and W). This inspired the question: can we construct a form of English which can be written without the letter E? An UnEsy English?

First, the definite article the needs to be replaced with “Northern eye dialect” t’.

We replace are with Swedish är (ar), or with art in the second person.

me easily changes to Low German mi. We can take that as permission to similarly alter he and she to hi and shi, and to change her to Middle English hir. Instead of they, their we can apply Scots thay: thay, thair.

-y plurals will have to keep their y: butterflys. The same is true for verbs: carry, carrys.

With past-tense forms such as walked and painted, there’s no choice but to elide: walk’d, paint’d.

With that said, let’s try to modify our vocabulary.

Esy	unEsy
red	crimson, carmin, ruddy, ruby g. rot, rod, rood
yellow	gold, flax, me. abron
green	virid, cupric, g. grun, gron
blue	dialectal blow, g. blau
orange	Esperanto orango
purple	me. / g. purpur
white	me. whit
grey	American gray
mouse	rat, oe. mus
squirrel	American squirl
badger	brock
snake	oe. snaca, Swedish snok
newt	dialectal askard
pigeon	o.f pijon
magpie	l. pica
egg	Swedish agg
tree	me. trow
river / stream	g. strom
lake	loch
sea, ocean	sw. sjo, me. occyan
house	me. hous
bear	g. bar, Dutch bjorn
eel	Low German ool, Scandinavian al
crocodile	me. cocodrill
deer	Norwegian dyr
axe	me. ax
hammer	oe. hamor
horse	me. hors
giraffe	Arabic zarafa
monkey	Spanish mono
saddle	oe. sadol
knife	me. knyf, knif
glove	oe. glof
shoe	me. shoo

All-in-all, a rather archaic and Germanic-feeling sample.
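
The substitution rules from the start of this post (before the vocabulary table) are mechanical enough to try out in code. Below is a rough Python sketch, not a finished tool: the function names are made up for illustration, the word list holds only the replacements given above, and the suffix rules are deliberately naive (they will mangle words like "series"). Capitalisation is not preserved.

```python
import re

# Word-for-word replacements taken from the post; nothing else is covered.
WORD_MAP = {
    "the": "t'",
    "are": "ar",
    "me": "mi",
    "he": "hi",
    "she": "shi",
    "her": "hir",
    "they": "thay",
    "their": "thair",
}

def unesy_word(word: str) -> str:
    """Apply the post's rules to one word (hypothetical helper)."""
    lower = word.lower()
    if lower in WORD_MAP:
        return WORD_MAP[lower]
    # butterflies -> butterflys, carries -> carrys (naive suffix rule)
    if lower.endswith("ies") and len(lower) > 4:
        return lower[:-3] + "ys"
    # walked -> walk'd, painted -> paint'd (naive; skips stems ending in e)
    if lower.endswith("ed") and len(lower) > 3 and not lower[:-2].endswith("e"):
        return lower[:-2] + "'d"
    return word

def unesy(text: str) -> str:
    """Rewrite every alphabetic word in the text."""
    return re.sub(r"[A-Za-z]+", lambda m: unesy_word(m.group(0)), text)

def words_still_using_e(text: str) -> list:
    """List the words that still need a vocabulary replacement."""
    return [w for w in re.findall(r"[A-Za-z']+", text) if "e" in w.lower()]

print(unesy("they walked and painted the butterflies"))
print(words_still_using_e(unesy("she gave me the red egg")))
```

The second call shows why the vocabulary table is needed: the grammar rules alone leave plenty of content words untouched.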

Lys · October 10, 2020, 1:42pm

Lattice is quite a new word for me. I bumped into it watching a Veritasium video about crystals or something.

The weird thing is that we have lattice in Italian with a completely different meaning. The dictionary translation is latex, as in latex gloves.

So, every time I hear the English word lattice I have to force myself not to think about crystals made out of rubber.

Samraku · October 10, 2020, 2:52pm

When I think of “lattice”, I think of something a bit frail looking, though not necessarily prone to breakage in practice. Like a doily or an infinite square grid of 1 ohm resistors. (What would the resistance between two points a knight’s move apart in that grid be, anyway?)

HHG · October 10, 2020, 4:02pm

Barbanaira · October 10, 2020, 8:34pm

Is t’ bookstaff of which you talk at all us’d in Anglish words with t’ pronunciation [e]? It appairs to mi that t’ Anglosaxons only uz it as an auxiliary bookstaff.

DVbS78rkR7NVe · October 11, 2020, 8:45pm

Ufufu

bugcat · October 12, 2020, 1:22pm

I started thinking this morning: what if we were to treat English like Latin and identify its “noun declensions”?

The first, or “cat” declension, which contains the majority of nouns

X	Singular	Plural
Nom.	cat	cats
Gen.	cat’s	cats’

The second, or “bus” declension

X	Singular	Plural
Nom.	bus	bus(s)es
Gen.	bus’(s)	bus(s)es’

The third, or “tomato” declension

X	Singular	Plural
Nom.	tomato	tomatoes
Gen.	tomato’s	tomatoes’

The fourth, or “pastry” declension

X	Singular	Plural
Nom.	pastry	pastries
Gen.	pastry’s	pastries’

The fifth, or “sheep” declension, which includes Japanese loanwords

X	Singular	Plural
Nom.	sheep	sheep
Gen.	sheep’s	sheep’s

The sixth, or “wolf” declension

X	Singular	Plural
Nom.	wolf	wolves
Gen.	wolf’s	wolves’

The seventh, or “radius” declension

X	Singular	Plural
Nom.	radius	radii
Gen.	radius’(s)	radii’s

The eighth, or “stadium” declension

X	Singular	Plural
Nom.	stadium	stadia
Gen.	stadium’s	stadia’s

The ninth, or “larva” declension

X	Singular	Plural
Nom.	larva	larvae
Gen.	larva’s	larvae’s

Wait, I forgot:

The tenth, or “mouse” declension

X	Singular	Plural
Nom.	mouse	mice
Gen.	mouse’s	mice’s

The eleventh, or “goose” declension

X	Singular	Plural
Nom.	goose	geese
Gen.	goose’s	geese’s
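
One pattern runs through all eleven tables: the two genitive cells can be derived from the two nominative cells. A small Python sketch of that observation follows; the function name and dictionary keys are invented for illustration, and it always writes 's in the genitive singular, so it does not reproduce the optional bare apostrophe shown for "bus" and "radius".

```python
def paradigm(singular: str, plural: str) -> dict:
    """Build the four cells of a declension table from the two nominative forms."""
    gen_sg = singular + "'s"
    # A plural already ending in -s takes a bare apostrophe; any other plural takes 's.
    gen_pl = plural + "'" if plural.endswith("s") else plural + "'s"
    return {"nom_sg": singular, "nom_pl": plural, "gen_sg": gen_sg, "gen_pl": gen_pl}

for sg, pl in [("cat", "cats"), ("wolf", "wolves"), ("sheep", "sheep"), ("mouse", "mice")]:
    print(paradigm(sg, pl))
```

Seen this way, the "declensions" differ only in how the nominative plural is formed.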

Vsotvep · October 13, 2020, 6:54am

In fact, English also has an accusative case. It’s the same for nouns, but in most pronouns it still exists (me, him, her, us, them).

Samraku · October 13, 2020, 9:47am

Does it really count, though? I’d say more that English has irregular pronouns in that they decline for accusivity.

Evidence for this is that “Me loves she.” will be seen by a native English speaker not to mean “She loves me.” (as the cases indicate), but rather as a grammatically incorrect way of saying “I love her.”

Vsotvep · October 13, 2020, 9:56am

Wouldn’t that be more irregular than saying English has an accusative case, but that the accusative of nouns is equal to the nominative of nouns?

Other Germanic languages behave similarly. For example, with German, except for a few classes, most nouns have equal nominative cases and accusative cases. In Dutch, the accusative case is never different with nouns, but it is still there in pronouns, and occasionally articles and adjectives are declined as well (usually in fixed sayings, such as “goedenavond” or “op den duur”, instead of the modern “goede avond” and “op de duur”, both of which sound like a mistake).

This is something completely different. An English speaker would also have problems with “I her love” or “Love her I” and so on. Word order is important in English.

Barbanaira · October 13, 2020, 10:02am

There was a question on a subreddit called r/latin asking what difficulties a Roman transported to modern times would face when forced to learn English or a Romance language. Some really cool dude, whom I totally never met before, argued that the fixed word order could be quite confusing for them.

Vsotvep · October 13, 2020, 10:13am

Word order is even different between German, Dutch and English. I often notice it when my mother speaks English, since she sometimes uses Dutch word order for things. For example:

Dutch order:

Hij schijnt eens Parijs te hebben bezocht.
He seems once Paris to have visited.
Er scheint einmal Paris zu haben besucht.

English order:

Hij schijnt te hebben bezocht Parijs eens.
He seems to have visited Paris once.
Er scheint zu haben besucht Paris einmal.

German order (I hope…):

Hij schijnt Parijs eens bezocht te hebben.
He seems Paris once visited to have.
Er scheint Paris einmal besucht zu haben

Barbanaira · October 13, 2020, 10:37am

= he seems to have visited Paris exactly one time. The position of “Paris” and “einmal” both have a minor stress here, the one on “einmal” being a bit more natural. Without stresses, it sounds like “Paris” is marked as something that is already established as topic of the conversation. “War Fritz schon einmal in Paris?” “Ja, er scheint Paris einmal besucht zu haben.”

Without any stresses the natural order is

Er scheint einmal Paris besucht zu haben.

which also introduces “Paris” as a thing that hasn’t come up in the convo yet. But even if ignoring all that, both orders are quite alright.

Interestingly, German dialects are undecided on the place of the participle in periphrastic tenses in subordinate clauses.

Fribourg German:

Sia hät gsììt, ass dù z Ggaffe scho hesch trùùche. [z̊ɪɑ̯ het ɡ̊sɪːt, ɑs d̊ʊ ʦ͡ kɑfːɛ ʃɔ heʃ trʊːχə]
She has said that you the coffee already have drunk.

Zurich German:

Si hät gseit, dass du s Kafi scho trunke häsch. [sɪ hæt ɡ̊sɛɪ̯t, d̥ɑs d̥u skχɑfɪ ʃɔ 'tʀʊŋkχɛ hæʃ.]
She has said that you the coffee already drunk have.

Standard German:

Sie hat gesagt, dass du den Kaffee schon getrunken hast
She has said, that you the coffee already drunk have.

Let’s ignore the possibility of a subjunctive in the subordinate clause. We see that the central-eastern dialect of Zurich and Standard German put the finite verb at the very end, while the western Fribourg German and its relatives from Bern and the Bernese Upland (maybe the Wallis too? I’m not sure) like to keep the participle after the auxiliary.

Vsotvep · October 13, 2020, 10:41am

In fact, the thing about the order of “einmal” and “Paris” is completely analogous for Dutch.

Also, certain Dutch accents would find “bezocht te hebben” to be more correct, while others find “te hebben bezocht” more correct. Both are more or less correct.

corner.square · October 13, 2020, 11:57am

Yoda about forget do not

bugcat · October 14, 2020, 4:00pm

A Tour of Unicode

Unicode is one of the crowning achievements of modern communications science. The Unicode project aims, in loose terms, to provide a standardised code point for every letter, kana, hangul, CJK ideograph, or any other character used by every living or dead language. It also includes emoji, chess pieces, musical notes, and many other meaningful non-linguistic symbols.
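
To see what "a code point for every character" means in practice, here is a minimal Python sketch using the standard unicodedata module. The helper name and the four sample characters are arbitrary picks for illustration.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

# A Latin letter, a hiragana, a hangul syllable and a chess piece.
for ch in ["a", "あ", "한", "♔"]:
    print(ch, describe(ch))
```

Each character, whatever its script, comes back as one number plus one name.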

Unicode is still incomplete, however; it received its most recent update in March. At the last count, Unicode contained around 144,000 graphic characters (and a further 163 format characters); the character set itself is called the Universal Coded Character Set (UCS). The UCS is divided into 17 groups called planes, four of which matter for this tour: Plane 0, The Basic Multilingual Plane; Plane 1, The Supplementary Multilingual Plane; Plane 2, The Supplementary Ideographic Plane; and Plane 3, The Tertiary Ideographic Plane. Much space is, of course, left empty as a precaution in case of new entries.
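
Working out which plane a character lives in is simple arithmetic, since each plane spans 65,536 (0x10000) code points. The following Python sketch shows the idea; the function name and the sample code points are chosen purely for illustration.

```python
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    1: "Supplementary Multilingual Plane",
    2: "Supplementary Ideographic Plane",
    3: "Tertiary Ideographic Plane",
}

def plane_of(ch: str) -> int:
    """A plane holds 0x10000 code points, so shift the code point right by 16 bits."""
    return ord(ch) >> 16

# One sample code point from each of the four planes discussed in this post.
for cp in (0x0041, 0x1F600, 0x20000, 0x30000):
    p = plane_of(chr(cp))
    print(f"U+{cp:04X} -> plane {p}: {PLANE_NAMES.get(p, 'other')}")
```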

Note: this is a bit of a “layman’s stroll”, and I don’t know the subject well. It will likely crumble under committed pedantry ^^

The Basic Multilingual Plane
/—/

The BMP begins with Basic Latin, which encodes all characters used by English; and, indeed, Latin (apices aside).

Latin-1 Supplement and Latin Extended-A (and -B) provide support for letters such as Ø and ß. Together with Combining Diacritical Marks, providing accents, umlauts etc., most European languages can be encoded.
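
The interplay between precomposed letters and combining marks can be seen with Python's unicodedata.normalize. This is only a small illustration using ü; the variable names are arbitrary.

```python
import unicodedata

precomposed = "\u00FC"   # ü as a single code point from Latin-1 Supplement
combined = "u\u0308"     # plain u followed by COMBINING DIAERESIS

# The two strings look the same on screen but are different code point sequences.
print(precomposed == combined)                                  # False
# NFC composes the pair into the single precomposed letter.
print(unicodedata.normalize("NFC", combined) == precomposed)    # True
# NFD splits the precomposed letter back into base letter plus mark.
print(len(unicodedata.normalize("NFD", precomposed)))           # 2
```

This is why text comparison usually normalises both sides first.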

If Unicode is a great achievement of computer science, the International Phonetic Alphabet is its counterpart in linguistics: an attempt to provide a glyph for every sound of every human language. Many of the IPA glyphs aren’t used in any other script, and as such are found in IPA Extensions.

The Cyrillic or Russian alphabet is, of course, included, and is followed closely by code points for Armenian, and then for a number of Semitic languages: Hebrew, Arabic, Syriac and others.

Brahmic scripts come next, such as Devanagari (“the writing of the gods”), Bengali, Gujarati, Tamil, Telugu, Malayalam, Sinhala, Thai, Lao, Tibetan, and Myanmar. This block covers much of South-East Asia and India, and is later rounded off with the addition of Tagalog, Khmer, Balinese, Sundanese, and several other encodings.

We then have Georgian and a block called Ethiopic, which contains scripts for several important languages such as Amharic and Ge’ez. After that comes the relatively modern Cherokee script, and an intriguing collection called Unified Canadian Aboriginal Syllabics, featuring such languages as Inuktitut, Blackfoot, Ojibwe, and Cree (several of which are also found in the USA).

There is encoding for Ogham, which is a curious Medieval Irish carven script made up of intersecting straight lines of varying numbers. I recommend Tom Scott’s video on this. It’s followed by Runic which encodes, well, runes, and then by Mongolian.

Greek, of course, has its own alphabet which receives its own encoding. Greek precedes a varied block called simply “Symbols”, which includes such things as General Punctuation, Currency Symbols, Arrows, and Geometric Shapes. Coptic and Tifinagh (Berber script) come next.

We have arrived at CJK scripts and symbols. At the beginning, this section skirts around the great bulk of the subject. It provides radicals and other “description characters”, hiragana and katakana, bopomofo (sort of a kana for Chinese), kanbun (annotation symbols used by ancient Japanese scribes on Chinese texts), “rare Han characters”, and the symbols of the I Ching. Finally, about 21,000 characters are provided in the hugely useful block CJK Unified Ideographs, “containing the most common CJK ideographs used in modern Chinese, Japanese, Korean and Vietnamese”.
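
The "about 21,000" figure can be checked against the block's range, U+4E00 to U+9FFF. A short Python sketch follows; the constant and function names are invented for illustration, and the count is the size of the range, a few code points of which may be unassigned depending on the Unicode version.

```python
CJK_UNIFIED_START = 0x4E00
CJK_UNIFIED_END = 0x9FFF

def in_cjk_unified(ch: str) -> bool:
    """True if the character falls inside the CJK Unified Ideographs block."""
    return CJK_UNIFIED_START <= ord(ch) <= CJK_UNIFIED_END

block_size = CJK_UNIFIED_END - CJK_UNIFIED_START + 1
print(block_size)                                   # 20992
print(in_cjk_unified("語"), in_cjk_unified("あ"))    # True False
```

The hiragana falls outside because kana have their own blocks earlier in the plane.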

We return to business as usual by wading through a series of obscure languages and scripts: Lisu (Burma), Vai (Sierra Leone / Liberia), Saurashtra (India), Phags-Pa (Mongolia, historical), Kayah Li (Thailand / Burma), Rejang (Sumatra), and so on, until we arrive at the moderately well-known Javanese. This forms, more or less, the end of our journey through the Basic Multilingual Plane. I have, admittedly, skipped over much of interest in the interest of saving space.

The Supplementary Multilingual Plane
/—/

This plane begins with Archaic Greek and Linear B (now known to represent Greek), continuing through Old Italic and Gothic (a prominent East Germanic language of the Late Classical Period). It proceeds to Ugaritic, Old Persian, Imperial Aramaic, Palmyrene and other languages of the ancient Near East; and to Phoenician and Lydian. Here we encounter our first set of hieroglyphs, Meroitic, a system used by the Kingdom of Kush on the Upper Nile at the beginning of the third century BCE.

Next on our tour, we pass through both Old North & South Arabian to reach Avestan, which was an important liturgical language in Zoroastrianism. We continue through Inscriptional Parthian and, before we know it, are amongst a great multitude of Brahmic scripts, all of which are very obscure to the Western eye.

On the other side is the Cuneiform block. Cuneiform is suggested by archeological evidence to have been the very first human script, originally used by the Sumerians, a Middle Eastern people who spoke a language isolate. The history of living cuneiform spans around three thousand years and is exceedingly complex, with the script having a mix of ideographic and phonetic properties. The best analogue, in my opinion, is Old Japanese.

After cuneiform, next in the plane are more hieroglyphs, these ones from Egypt and Anatolia. Further down is Tangut. Tangut is a very interesting script, created artificially in ancient North China with the inspiration of Chinese characters.

The Duployan block encodes a variety of shorthand systems used in French, Chinook, Romanian, and English. Below this block is musical notation: not only in the modern format, but also Byzantine and Ancient Greek. Game pieces are also encoded: there are blocks for Mahjong Tiles, Domino Tiles, Chess Symbols, and Playing Cards. Near the end of this plane are more miscellanea: Alchemical Symbols, Transport and Map Symbols, and of course Emoticons.

Supplementary Ideographic Plane
/—/

This is just what it says on the tin, really: more CJK ideographs.

Tertiary Ideographic Plane
/—/

This plane was only added in the March update. Its only block so far is CJK Unified Ideographs Extension G, which contains “rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese”. Explicitly included in the block are two very complex and rather famous characters. Firefox, at least, hasn’t implemented these yet.

biáng (58 strokes), a type of broad Chinese noodle

taito, daito, or otodo, a kokuji (Japanese-coined character) with 84 strokes (the most of any CJK character). Note, though, that it is clearly five different kanji arranged in a circle around the sixth. From an aesthetic point of view, I find the “onion-skin” appearance of biáng more pleasing.

The Tertiary Ideographic Plane is also “tentatively allocated for Oracle Bone Script, Bronze Script, and Small Seal Script”, three forms of writing used in the early stages of Chinese civilisation.