A tool for getting to grips with the Polish case system - could be used for other languages with complex grammars

Hi,

I’ve been thinking about how to get a grasp of the Polish case system, and I’ve developed a tool that I think will help me.

I’ve used a Python library called spaCy to analyse the grammar of all the LingQ mini-stories and output them as one long HTML page, with highlighting so you can instantly see:

  • the gender of every word and, for masculine words, the animacy.
  • the case of the word.
  • whether the word is plural.

Hovering over a word shows more of its details, including the lemmatized form, part of speech, gender, case …
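A minimal sketch of the highlighting idea, not the actual script linked below. It assumes each token arrives as a (text, lemma, part of speech, features) tuple, with the features written in the `Case=Acc|Gender=Masc` style that spaCy exposes through `str(token.morph)`; in a real run the other fields would come from `token.text`, `token.lemma_` and `token.pos_`. The function names, the CSS class naming scheme and the hand-written example analysis are all my own assumptions for illustration.

```python
from html import escape


def parse_feats(feats: str) -> dict:
    """Turn 'Case=Acc|Gender=Masc' into {'Case': 'Acc', 'Gender': 'Masc'}."""
    if not feats:
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def token_to_html(text: str, lemma: str, pos: str, feats: str) -> str:
    """Wrap one token in a <span> with CSS classes and a hover tooltip."""
    parsed = parse_feats(feats)
    # One class per feature of interest, e.g. "case-acc" or "gender-masc".
    # These class names are hypothetical; a stylesheet would colour them.
    classes = [
        f"{name.lower()}-{parsed[name].lower()}"
        for name in ("Case", "Gender", "Animacy", "Number")
        if name in parsed
    ]
    # The title attribute gives the browser's built-in hover tooltip.
    tooltip = f"lemma: {lemma}, pos: {pos}"
    if feats:
        tooltip += f", {feats}"
    return '<span class="{}" title="{}">{}</span>'.format(
        " ".join(classes), escape(tooltip, quote=True), escape(text)
    )


# Hand-written analysis for illustration only (not real spaCy output).
example = [
    ("Widzę", "widzieć", "VERB", "Number=Sing|Person=1"),
    ("psa", "pies", "NOUN", "Animacy=Nhum|Case=Acc|Gender=Masc|Number=Sing"),
]
html_line = " ".join(token_to_html(*tok) for tok in example)
print(html_line)
```

A stylesheet can then target classes such as `.case-acc` to colour every accusative word the same way.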

I hope I can work this into my study so that it helps me recognise where cases are used and get accustomed to how the case endings are formed. I figure I will keep doing extensive reading, mainly for comprehension, but also spend some time looking at the word endings as I read. I have started with the accusative, focusing on the green highlighted words, and will then move on to the instrumental…

The code I have used is here: lingq/githubactions/highlight_cases_of_linqq_course.py at main · jamiepratt/lingq · GitHub

Jamie

4 Likes

Wow! This is so cool. You’re definitely onto something! This reminds me of what Steve Ridout did to help himself learn/study Spanish. The tool he developed for himself for Spanish ultimately led him to create a reading based platform that can be used by language learners in over a dozen languages.

I’m looking for another language to learn, a natural language rather than a manufactured/engineered one. I want something with an alphabet and with very consistent sound values for each letter or letter combination. Russian, Arabic, German, Korean, maybe Norwegian or Finnish… I’m trying to remember my “short” list off the top of my head. I think German has “cases”. I’m not sure about the others. My next step is to listen to a few of these phonemic or not-like-English languages and see which one I like the sound of and am able to imitate (shadow).

Best of continued success with the development of the tool that you described in your post! It sounds super useful!

2 Likes

Great work! I think LingQ will have to implement something like spaCy at some point to make their grammar tags usable. Currently the grammar tags are woefully inadequate, as they are just global dictionary forms and don’t take the sentence context into account. I sometimes see words in Italian or Portuguese that are noun, verb and adjective at the same time, according to LingQ at least. Not very helpful.

3 Likes

I believe that German, Russian, Finnish and probably Norwegian too have cases. Although in English we use the order of the words in the sentence to know which is the subject, direct object and indirect object, we still have some small remnants of a case system: think of mine vs. my. Latin has a case system, and I guess our remnants of cases are a carry-over from there?? Language is fascinating stuff.

2 Likes

spaCy requires 550 MB of neural network weights for its AI to identify the part of speech and lemmatize the word, which it gets right over 98% of the time or so for Polish. It’d be so wonderful to have something like this built into LingQ, and to have a separate LingQ for each part of speech a word can be, chosen by the context it is found in within the sentence. It would also be good if the system recognised that groups of words it is tracking and defining are all inflections of one root word, so that we could have better tracking of known words and help students make connections between forms of a word.
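As a rough sketch of that last idea, and not anything LingQ does today, word forms could be grouped under a (lemma, part of speech) key. The `group_by_lemma` name and the hand-written triples are assumptions for illustration; a tagger would normally supply them.

```python
from collections import defaultdict


def group_by_lemma(tokens):
    """Map (lemma, pos) -> set of lowercased surface forms seen so far."""
    groups = defaultdict(set)
    for text, lemma, pos in tokens:
        # Keying on part of speech as well keeps homographs apart,
        # e.g. a noun and a verb that happen to share a spelling.
        groups[(lemma, pos)].add(text.lower())
    return dict(groups)


# Hand-written (text, lemma, pos) triples for illustration only.
seen = [
    ("pies", "pies", "NOUN"),
    ("psa", "pies", "NOUN"),
    ("psem", "pies", "NOUN"),
    ("Psa", "pies", "NOUN"),
]
groups = group_by_lemma(seen)
print(sorted(groups[("pies", "NOUN")]))
```

Marking "pies" as known could then be reflected across every form in its group rather than one spelling at a time.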

2 Likes

The same approach can be used for other languages. If someone wants to do something similar for a publicly available course in another language where this would be helpful to learners, let me know. You would need to suggest CSS styles for highlighting grammatical features in a way that would be helpful to learners.

Here is the stylesheet for Polish for example:

1 Like

I am getting to grips with the German case system. My method is to put example sentences into Anki. Thus:

Ich habe ein großes grünes Auto
Ich gebe das Geschenk meinem Sohn
Ich mag den blauen Stift

I don’t know if this is an optimal method, but it is working, so I’m happy.

1 Like

Yeah, I was thinking of doing something similar with Anki. Maybe with a cloze-type card where you have to type in the endings.
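Something like this could generate the card text, as a minimal sketch. It uses Anki's `{{c1::...}}` cloze syntax and assumes you split each word into stem and ending by hand; the `cloze_endings` helper is hypothetical, not part of my tool.

```python
def cloze_endings(words):
    """Build Anki cloze text from (stem, ending) pairs.

    An empty ending leaves the word untouched; each non-empty ending
    becomes its own numbered deletion: {{c1::...}}, {{c2::...}}, ...
    """
    parts = []
    n = 0
    for stem, ending in words:
        if ending:
            n += 1
            parts.append(stem + "{{c" + str(n) + "::" + ending + "}}")
        else:
            parts.append(stem)
    return " ".join(parts)


# Stem/ending split chosen by hand for one of the example sentences.
card = cloze_endings(
    [("Ich", ""), ("mag", ""), ("d", "en"), ("blau", "en"), ("Stift", "")]
)
print(card)
```

Numbering each ending separately means Anki makes one card per ending; using the same number throughout would hide them all on a single card.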

Good luck with German!

1 Like

My tool is evolving. I now have interactive transcripts of audio in my target language, Polish, with grammar analysis and highlighting:

https://jamiepratt.github.io/hyperaudio-lite/Daily%20Polish%20Story.html#hypertranscript6=0,305

2 Likes

The website looks really good. The instant mouseover tooltips, icon changes and underline are good touches.

1 Like