I’m experiencing what looks like a parser regression for French apostrophes, and it’s affecting existing content as well.
Previously, words like “c’est” were recognized and selectable as a single unit. Now, in the same texts, LingQ consistently splits them into “c” + “est” when I create a LingQ.
More importantly, this change also affects existing vocabulary:
- I already have LingQs like “C’est” saved
- These LingQs no longer match or get recognized in texts
- They can’t be recreated in the same form anymore
Additionally, there are selection issues with some elided forms:
- For example “l’autoroute” is sometimes not selectable at all
- The text itself is correct and unchanged
Important details:
- This happens across multiple devices
- The apostrophes in the text are correct
- The same texts were parsed differently before
- This strongly suggests a server-side/tokenizer change, not a local or formatting issue
I understand that function words may be split intentionally, but the current behavior:
- breaks matching with existing LingQs
- creates unselectable words in some cases
- causes inconsistency within the same language
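
To illustrate what I suspect is going on, here is a minimal Python sketch. It is only a guess at the behavior: I don't know how LingQ's tokenizer actually works, and the two patterns, the `tokenize` and `matched` helpers, and the `saved_lingqs` set are all hypothetical names made up for this example. It shows how a rule that splits at the apostrophe stops producing tokens that match a previously saved term.

```python
import re

# Hypothetical tokenizer rules; NOT LingQ's real implementation.
# "Old" rule: letters, optionally joined by an apostrophe, stay one token.
OLD_TOKEN = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")
# "New" rule: the apostrophe acts as a separator.
NEW_TOKEN = re.compile(r"[^\W\d_]+")


def tokenize(text, pattern):
    """Return lowercased tokens found by the given pattern."""
    return [m.group(0).lower() for m in pattern.finditer(text)]


def matched(tokens, saved):
    """Return the tokens that correspond to a saved term."""
    return [t for t in tokens if t in saved]


# A term saved back when elided forms were kept as one unit.
saved_lingqs = {"c’est"}

text = "C’est l’autoroute."
old_tokens = tokenize(text, OLD_TOKEN)
new_tokens = tokenize(text, NEW_TOKEN)

print(old_tokens, matched(old_tokens, saved_lingqs))
print(new_tokens, matched(new_tokens, saved_lingqs))
```

Under these assumptions the saved “c’est” is found with the first rule but never with the second, because the second rule no longer emits it as a single token. That would explain why my existing LingQs stopped matching even though the text did not change.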