Think this will really help with my Polish studies and hopefully the tools I produce will be useful to others too. Here is a short description of the project I am working on.
It is a set of scripts that apply spaCy natural language processing to target-language content, to help adult foreign language learners get to grips with the grammar of their target language.
The idea is to take content that a learner is already familiar with and analyse the grammatical features of this content with spaCy.
We will take spaCy’s output and produce a CSV file that can be imported into an SRS such as Anki, so that the learner can then, for example, expose themselves to categories of grammar such as:
lots of examples of a particular base form of a verb, noun, adjective etc. used in different cases, tenses and genders, in singular or plural, and so on;
examples of masculine plural nouns, in whatever case;
examples of verbs in the third person, comparing plural and singular inflection;
and so on.
By mining familiar content and categorising sentences according to their grammatical features, we let the learner see lots of examples of whichever grammatical feature they are interested in, progressively and systematically, in whatever order they like. We feel this will help them become familiar with the grammar patterns of their target language as efficiently as possible.
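To make the idea a bit more concrete, here is a minimal sketch of the filtering and CSV step. It assumes the spaCy pass has already been run and that each token was saved as a plain record holding the sentence, `token.text`, `token.lemma_`, `token.pos_` and `str(token.morph)`. The record layout, the function names and the sample Polish data are all hypothetical and hand-written for illustration; they are not real spaCy output.

```python
import csv
import io


def parse_feats(feats):
    """Turn 'Case=Nom|Number=Plur' into {'Case': 'Nom', 'Number': 'Plur'}."""
    if not feats:
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def matches(record, pos, **wanted):
    """True if the record has the given part of speech and all wanted features."""
    feats = parse_feats(record["morph"])
    return record["pos"] == pos and all(feats.get(k) == v for k, v in wanted.items())


def to_anki_csv(records, pos, **wanted):
    """Return two-column CSV text: the sentence, then a note on the matching word."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for r in records:
        if matches(r, pos, **wanted):
            note = f'{r["token"]} ({r["lemma"]}): {r["morph"]}'
            writer.writerow([r["sentence"], note])
    return buf.getvalue()


# Hand-written sample records (an assumption, not real spaCy output).
RECORDS = [
    {"sentence": "Koty śpią.", "token": "Koty", "lemma": "kot",
     "pos": "NOUN", "morph": "Case=Nom|Gender=Masc|Number=Plur"},
    {"sentence": "Koty śpią.", "token": "śpią", "lemma": "spać",
     "pos": "VERB", "morph": "Number=Plur|Person=3"},
    {"sentence": "Kot śpi.", "token": "Kot", "lemma": "kot",
     "pos": "NOUN", "morph": "Case=Nom|Gender=Masc|Number=Sing"},
]

# Masculine plural nouns, in whatever case.
masc_plural = to_anki_csv(RECORDS, "NOUN", Gender="Masc", Number="Plur")
print(masc_plural)
```

Each row could then become one card, with the sentence on the front and the grammatical note on the back. Swapping the keyword arguments (for example `Person="3"` with `"VERB"`) would give one of the other categories listed above.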
Great, that looks straightforward enough. I’ll just set it up as a private API for my tools.
Integrating the data in a meaningful way will still be a big challenge
“Integrating the data in a meaningful way will still be a big challenge”
I’m excited to see what you come up with, Dan! It's definitely a big UI challenge, and speaking for myself, it is not immediately obvious what all the data you can get from spaCy means for me as a language learner.
I think that spaCy is indeed a VERY helpful tool to have as a language learner.
I am particularly interested in the morphology of words that it reveals. Here is a quote I found in the spaCy documentation:
"We say that a lemma (root form) is inflected (modified/combined) with one or more morphological features to create a surface form."
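As a rough illustration of that lemma / features / surface form relationship, here is a small sketch that groups surface forms under their lemma. It assumes each token has been reduced to a (surface form, lemma, feature string) triple, which in spaCy would come from `token.text`, `token.lemma_` and `str(token.morph)`. The helper names and the sample forms of "kot" are hand-written assumptions, not real model output.

```python
from collections import defaultdict


def parse_feats(feats):
    """Turn 'Case=Gen|Number=Sing' into {'Case': 'Gen', 'Number': 'Sing'}."""
    if not feats:
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def inflection_table(tokens):
    """Map each lemma to its observed surface forms and their features."""
    table = defaultdict(dict)
    for surface, lemma, feats in tokens:
        table[lemma][surface] = parse_feats(feats)
    return dict(table)


# Hand-written sample triples (an assumption, not real spaCy output).
TOKENS = [
    ("kot", "kot", "Case=Nom|Number=Sing"),
    ("kota", "kot", "Case=Gen|Number=Sing"),
    ("koty", "kot", "Case=Nom|Number=Plur"),
]

table = inflection_table(TOKENS)
for surface, feats in table["kot"].items():
    print(surface, feats)
```

Run over a whole book or transcript, a table like this would show which inflected forms of a word the learner has actually met, and which features produced each one.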
But this is just one use of spaCy.
Let me know if you want to jump onto a video/audio call to discuss possibilities and/or bounce ideas off each other about what this could be used for.
I've just started getting into it, but the word dependency system seems extremely useful. The other straightforward thing I can see is being able to group words based on these tags.
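Here is a rough sketch of both ideas: grouping words by tag, and using the dependency links to pair each subject with its verb. It assumes every token has been flattened to a tuple of (index, text, part of speech, dependency label, head index), which in spaCy would come from `token.i`, `token.text`, `token.pos_`, `token.dep_` and `token.head.i`. The tuple layout, the function names and the sample sentence are hand-written assumptions, not real parser output.

```python
from collections import defaultdict

# (index, text, pos, dep, head_index); hand-written sample, not real spaCy output.
# The root token points at itself, which is how spaCy represents the root.
SENTENCE = [
    (0, "Mały", "ADJ", "amod", 1),
    (1, "kot", "NOUN", "nsubj", 2),
    (2, "śpi", "VERB", "ROOT", 2),
]


def group_by_pos(tokens):
    """Collect token texts under their part-of-speech tag."""
    groups = defaultdict(list)
    for _, text, pos, _, _ in tokens:
        groups[pos].append(text)
    return dict(groups)


def subject_verb_pairs(tokens):
    """Return (subject, head word) pairs for tokens labelled 'nsubj'."""
    by_index = {t[0]: t for t in tokens}
    return [(t[1], by_index[t[4]][1]) for t in tokens if t[3] == "nsubj"]


print(group_by_pos(SENTENCE))
print(subject_verb_pairs(SENTENCE))
```

The same pattern should extend to other relations, for example pairing adjectives with the nouns they modify to practise agreement.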