Finnish on LingQ - Hurdles

Hi guys,

I understand Finnish is potentially going to be the next language on LingQ (I’m not on Facebook, but this is what I’ve been told). Hopefully it is added :slight_smile:

In any case, I’ve been preparing some content that I’ve found on the internet, and I’m not sure how best to go about this.

As you may know, spoken language in Finnish is quite different from the standard written form. If the audio is in “spoken Finnish”, should the text match this?

Some examples.

“Mä oon” instead of “Minä olen”. “Haluaisitteks te…” instead of “Haluaisitteko…”. Even a common word like “jonkinlainen” becomes “jonkunlainen” in the spoken form.

Google Translate might be able to translate spoken Finnish in context, but standard dictionaries would probably struggle.

Perhaps we could have an accent category for “colloquial” Finnish.

What do you think?

Without any personal knowledge of Finnish, but having witnessed this in other languages, I’d say that it would be best to have Standard Finnish and Colloquial Finnish marked. It would be best to have collections in both forms. The standard form is used in formal situations and media, right? Then people will have the opportunity to use it and it will match what is in books too.

I think the text should generally refer to the written standard. Otherwise it’s a phonetic transcription, which is not really Finnish. I see no harm in having colloquial speech written out correctly because it both illustrates the phonetic changes and helps with improving comprehension by bridging the gap between speech and writing. A phonetic transcript could be included in the notes, although I don’t think it’s necessary by any means. Of course a line has to be drawn somewhere nonetheless. For example “mun kädet” is not correct, but hardly anyone would say “minun käteni”, although I suppose it could happen in some dialects. I think a compromise is to adhere to standardized spelling while leaving colloquial cases and conjugations unaltered, hence “minun kädet” as a neutral spoken form, but “mun kädet” could work too. It doesn’t make much of a difference if it’s minä, mä, mie, or some other variant as long as it’s understood that “minä” is the standard form. On the other hand, spoken forms like “haluaisitteks”, “miks”, “emmä tiiä” should be written correctly as “haluaisitteko”, “miksi”, “en m(in)ä tiedä” because they are not really acceptable written words in their truncated, crammed, or otherwise altered forms.

Imyirtseshem, there is no colloquial standard. There are several major dialects, and they differ from each other quite a bit. Peter’s example “haluaisitteks te” exists in a number of different spoken forms depending on dialect. I think it should always be written as “haluaisitteko te” with the content marked according to dialect (or just colloquial) based on the audio. Without a written standard you end up with a smörgåsbord of written dialects and spelling variants, and in the case of Finnish, that is beyond the pale of standardized written language.

@albumen - “I think a compromise is to adhere to standardized spelling while leaving colloquial cases and conjugations unaltered” – based on this, where would you stand on “mä oon”, “te ootte” “ooks sä”? I’m assuming those would need to be “minä olen”, “te olette”, “oletko sinä”.

You raised an interesting point which never really occurred to me: “Otherwise it’s a phonetic transcription, which is not really Finnish”. That sounds logical enough to me. So perhaps then it would be best to have everything in proper Finnish (text) and any content that has been recorded in a variety of spoken Finnish marked as “colloquial” under accent (although the text would still need to adhere to standard Finnish)?

“It would be best to have collections in both forms” (Imyirtseshem)
I agree. If a learner has no previous knowledge of the language, they will be lost and give up unless they can familiarize themselves with the standard language first (the language that newspapers, radio or tv news etc. use). I guess if you are familiar with some basics in the standard language you can then turn to colloquial content and get by with some notes that can be attached to the lessons. In Arabic there is no way around the standard language. And in German there is no standard written form of dialects or colloquial language either. So again, even if lessons like Who Is She and Eating Out may sound unnatural to native speakers, they should offer the standard form. An extra colloquial collection could be offered at the same time, so that learners can listen to both. Colloquial language can often be acquired through listening to comprehensible content alone. I’d be interested in Finnish as well as Hungarian on LingQ.

Some of the lessons that I’m going to upload (which form part of a course) are spoken in “colloquial” Finnish (for comprehension). They would probably still be useful for (even beginner) learners. The only question is whether to use proper Finnish as the text, or transcribe them “phonetically”. I think albumen has some good points.

Out of curiosity, I wonder how English LingQ conversations are transcribed, when it comes to things like: “It’ll be tricky if he’s said what she’d wanted him to say” … or… “You should’ve given them what they’d asked for”… or… “That’ll teach him, he’s got to learn or he’ll get in trouble…”.

Would this kind of text classify as a phonetic transcription?

Albumen: All writing that is not ideographic is transcription. Finnish is Finnish, regardless of how it’s written. The idea that it isn’t sounds very illogical and nonsensical to me.

I don’t see why there couldn’t be various colloquial versions offered on LingQ. Take whatever people are willing to offer. More is better when it comes to language. :slight_smile:

Alleray - some “German” dialects (I hate that term) are developing standard-like writing systems now. There’s not exactly one system, but rather rules for writing sounds within which the various dialects can write their own sounds. It’s a powerful thing for the brain, actually, reading in various different dialects. I know listening is too (from my experience of Yiddish).

I know that German ‘dialects’ can be as different as separate languages (cf. Arabic dialects/languages), it’s just the point of view of the ‘standard’ language proponents. BTW there are Wikipedias in the regional forms of German and spelling can vary a lot. Still, if I were learning German I wouldn’t start with a regional variety.
Accent is different; as with American and British English (and other Englishes), the written language can be almost the same in spite of a different accent. Of course there will be any number of phrases and even grammar patterns that are different, but as long as you use standard spelling (including short forms) texts will be mutually understandable.

Yeah, unless you’re going to have specific needs, it’s best to not start with dialect.

So it sounds like everyone agrees that it’s best to have all texts in standard Finnish?

Although a ‘real’ native would be best to advise, in my opinion it’s not so much an issue of Standard Finnish vs Dialects, but more an issue of Standard Finnish vs Spoken Finnish (puhekieli).

I’d like to express some considerations as a previous (failing) learner of Finnish and as a future (more successful) learner of it once it is added on LingQ.

I think the beginner and intermediate lessons should be written and read in Standard Finnish (e.g. minä olen), while advanced ones can be in “colloquial” Finnish. Learners of Finnish are not supposed to be able to speak or understand such colloquial forms until they become fluent. So, I wouldn’t study a beginner lesson read in colloquial Finnish and transcribed. Anyway, if the audio is in colloquial language, the text should be a transcription of it, not a translation into standard language.

I suggest you identify a few groups of regional accents and have them added alongside Standard Finnish.

In Italian, we really have a lot of regional accents and languages (or dialects, but this is not the place for a debate). In some areas of the country, people usually speak dialect, not standard Italian, or use words from their dialect while speaking Italian. However, all the lessons in the Italian library are written in Italian. I have not added and will not add any lesson in Venetian dialect until it is added as a beta language in the still distant future. If I were a learner, I would like to learn to speak Italian, not one of its dialects.

Given that Standard Italian doesn’t really exist as a spoken language, I suggested adding six different accent groups. I would suggest you do the same if Finnish is added.

Finding beginner material written/read in a dialect would confuse me, even though I have previous knowledge of Finnish grammar, and I can imagine it would confuse an absolute beginner even more. And, as I wrote, I would not insert any dialectal forms unless the lesson is labeled as “advanced”. This is what I think.

This is not so much a difference between standard and dialect, as in the distinction between standard Italian and “Italian dialects”, but rather the realisation of the standard language in spoken form.

I’m all for standard language in writing. All languages (which I’ve had a look at) have “lazy” pronunciation to some extent, but (hopefully) people other than teens would never write exactly the way they’re speaking.


Another solution might be like what I believe exists for languages like Mandarin and Japanese: a transcription system. Standard written, spoken as Finnish is spoken, with a transcription of the spoken Finnish underneath (or wherever that goes).

Jeff, I fully agree with your concise post.

Ok. Assuming Finnish is added next, I won’t share the few somewhat colloquial beginner lessons that I have until a consensus is reached.

Peter, you could always share them as advanced lessons at some point. The whole issue is a bit of a headache, no doubt about that. Ideally all beginner materials should be in grammatically correct standard Finnish, audio included. In colloquial language many grammatical features are only implicit, which increases the opacity of the language to beginners who have no way of inferring the underlying structure.

I could mark these lessons as Advanced or even Intermediate 2 (because they are really short and ‘simple’). Hopefully the LingQ staff will add an ‘accent’ category called ‘Colloquial’.

With regard to how the text should look, some have said that it should be the correct written form and others have said that it should be a transcription of what is said. It would be nice to get some feedback from LingQ, because, although this would only apply to advanced ‘colloquial’ lessons, it could seriously affect word counts, for example, as each word spelt differently will count as a new word. In the case of transcribed conversations, almost all of the Finns I’ve asked in the last few days have said that the transcription should match the audio. Obviously, these would then be marked as Advanced and Colloquial, so they shouldn’t interfere too much with any beginners studying Finnish.
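
To make the word-count concern concrete, here is a rough sketch in Python. It is not how LingQ actually counts words; it only assumes, as described above, that every distinct spelling is counted as a separate word. The mapping table and the function name are made up for illustration, using pairs mentioned earlier in this thread.

```python
# Hypothetical table of colloquial spellings and their standard forms.
# The pairs are taken from examples in this thread; a real table would be far larger.
COLLOQUIAL_TO_STANDARD = {
    "mä": "minä",
    "oon": "olen",
    "haluaisitteks": "haluaisitteko",
    "jonkunlainen": "jonkinlainen",
    "miks": "miksi",
}


def distinct_words(text, normalize=False):
    """Return the set of distinct spellings in a whitespace-separated text.

    With normalize=True, known colloquial spellings are first replaced by
    their standard forms, so both variants count as one word.
    """
    words = text.lower().split()
    if normalize:
        words = [COLLOQUIAL_TO_STANDARD.get(w, w) for w in words]
    return set(words)


# One lesson in standard Finnish plus one transcribed as spoken.
combined = "minä olen" + " " + "mä oon"

raw_count = len(distinct_words(combined))                         # spellings counted as-is
normalized_count = len(distinct_words(combined, normalize=True))  # variants merged

print(raw_count, normalized_count)
```

Under that assumption, the same two words show up as four "new" words when the colloquial lesson is transcribed as spoken, and as two when the variants are merged.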

Peter, please share the lessons; I want to have a look at them.

Peter, I think you can share them. It won’t be difficult for the LingQ staff to add “Standard” and “Colloquial” as Accent options if you think they are better distinctions than local varieties (such as “Southern Finnish”, “Central Finnish”, if they even exist). If lessons are marked as Colloquial, you could add them as Intermediate, I guess.
In any case, I expect the text of the lesson to be a transcription of the audio. If the audio is colloquial Finnish, you could provide a translation into standard Finnish in the “Translation” field, or add some lexical notes in the Notes field.