Hello team,
One of the most frustrating things about using LingQ for Arabic is that when the reading of a word is contextually or lexically defined (as opposed to grammatically deducible by parsing its PoS and deriving its reading from the regular patterns), I can only rely on the automatic TTS. I find the TTS awkward to use (it interrupts my workflow and sometimes I literally cannot listen to it), and it regularly makes mistakes. This is especially true when a given spelling has several possible readings.
Ex: جمل, as a noun, can be vocalized “jamal” (a camel) or “jumal” (sentences, the plural of “jumla”).
شغل, as a noun, can be vocalized “shughul” (work, occupation) or “shaghl” (distraction).
Basically, whenever I come across a simple three-letter word, I have to guess how to read it, and the TTS function may not help because it can only pronounce one possibility.
What I imagine would be nice is that clicking on the word in a Lesson would show, in the case of شغل, two entries:
شَغْل - distraction
شُغُل - work, occupation
This would spare the learner from guessing the reading and would make the ambiguity visible.
I suppose that the automatic AI feature used when searching vocabulary is based on a retrieval-augmented (RAG) LLM. Perhaps it could be tweaked to:
- PoS-tag the word (including stripping it of enclitics);
- Then look up all of the word’s possible readings, with full or partial diacritization;
- Then show the learner the corresponding diacritized lexical items and their translations.
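
To make the idea concrete, here is a rough Python sketch of the lookup part of those steps. It is only an illustration, not how LingQ works: the tiny LEXICON, the clitic lists, and the function names are all made up by me, and it skips real PoS tagging (a proper morphological analyzer would be needed for that). It strips at most one proclitic and one enclitic, removes diacritics, and returns every diacritized entry that shares the bare spelling.

```python
import unicodedata

# Hypothetical toy lexicon (diacritized form, gloss); a real one would be far larger.
LEXICON = [
    ("شَغْل", "distraction"),
    ("شُغُل", "work, occupation"),
    ("جَمَل", "camel"),
    ("جُمَل", "sentences"),
]

# Assumed, deliberately incomplete clitic lists, for illustration only.
PROCLITICS = ("ال", "و", "ف", "ب", "ل", "ك")
ENCLITICS = ("ها", "هم", "نا", "ه", "ي", "ك")


def strip_diacritics(word):
    """Drop combining marks (fatha, damma, sukun, shadda, ...)."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


# Index the lexicon by undiacritized spelling so homographs group together.
INDEX = {}
for form, gloss in LEXICON:
    INDEX.setdefault(strip_diacritics(form), []).append((form, gloss))


def candidate_stems(token):
    """Return the token plus variants with one proclitic and/or one enclitic removed."""
    stems = {token}
    for pre in ("",) + PROCLITICS:
        if not token.startswith(pre):
            continue
        rest = token[len(pre):]
        for suf in ("",) + ENCLITICS:
            # Keep at least two letters so we do not strip a word down to nothing.
            if rest.endswith(suf) and len(rest) - len(suf) >= 2:
                stems.add(rest[:len(rest) - len(suf)])
    return stems


def lookup(token):
    """List every (diacritized form, gloss) whose bare spelling matches a candidate stem."""
    bare = strip_diacritics(token)
    results = []
    for stem in sorted(candidate_stems(bare)):
        results.extend(INDEX.get(stem, []))
    return results


# Show both readings for a word carrying a conjunction and a pronoun suffix.
for form, gloss in lookup("وشغله"):
    print(form, "-", gloss)
```

Because it only matches spellings, the sketch over-generates on purpose: it lists every reading and leaves the choice to the learner, which is exactly the ambiguity I would like to see surfaced. Narrowing the list by PoS or context would be a later step.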