Spam lessons in the library

As some of you may know I have created a lot of content for the Spanish library. All of it (right now 194 lessons) is original content that I have personally created. I have a lot of collections and I have spent a lot of time and effort to create this content.
The number one content creator in the library (or at least the most used) is or was Vera. She is doing amazing work for German. Now that I’ve started learning German I really appreciate her content. Of course, there are many other people creating really interesting and exclusive content for the library.

However, today I’ve been greatly disappointed. In the last provider awards list I was surprised to see that suddenly someone completely new (two months ago he didn’t even appear in the list) was number one.

I didn’t pay a lot of attention to that provider until today, when I was uploading a new lesson to the Spanish library. Yesterday I was number one (194 lessons) and number two was SpanishLingq with 136 lessons. Today, however, this provider was number 2 with 193 lessons. He has uploaded, just for Spanish, 180 lessons in a day!!!
I was completely surprised because I know how time-consuming creating lessons is, so I checked this provider out and found that in a few months he has created 5929 lessons!!! That’s impossible!! Well, no! He is just copy-pasting text and audio from Librivox, saturating the library.

I’ve checked some of this content out and there are several big mistakes. First of all, the “Fábulas de Esopo” (the 180 Spanish lessons) are definitely not Beginner 1 content (which is how he has rated them, knowing that beginner content is the most popular in the library). Worst of all, and in my opinion completely unacceptable, some of the lessons are not recorded by native Spanish speakers but by foreign students with really poor accents and pronunciation.

I’m sorry, but I’m really annoyed and disappointed. Usually in the library you could find free content like Librivox (of no interest to me because you can find it anywhere, you don’t need LingQ for that, but I understand some people like it) and original exclusive content created by members. Right now, this interesting content, which in my opinion was the value of the library, is being buried by these spam lessons. I’m sorry, but I can’t find any other word to describe this kind of low-quality lesson.

I’m very disappointed because I’ve made an effort to create a lot of interesting and diverse content for the library and suddenly someone who has done… nothing, is getting more points than me and other active providers because he has saturated the library with this ***.

I know that technically it is not LingQ’s fault, but I think that this should be urgently controlled, both the quality of the lessons and the spamming just to get points.

Probably the Exclusive content tag that Vera proposed some time ago, or any other way to promote original content, as well as giving more points to original content providers, should be applied as soon as possible.

But anyway, in this specific case I think that something must be done or we’ll lose most of the LingQ library’s value, buried under all this spam.

As for me, right now I’m really frustrated and I’ve lost any motivation to keep on creating content for the Spanish library. I’m sorry for those who like my content, but at least for a while, I’ll stop creating lessons for LingQ.

OH MY GOD!! THIS IS LIKE A VIRUS!! I was trying to see the use of Spanish content among my friends, in the Friends tab (where you see what people are doing in your language) and to my horror I’ve seen that it was also saturated with new lessons from this provider: 94 more lessons that still don’t appear in the total numbers of the library because it takes some time!! This means that tomorrow morning there’ll be at least 290 “lessons” from this provider (in just one day!!)
It’s like a virus spreading really fast. I think it’s really serious! In a few weeks it can compromise the entire library, converting LingQ into a Librivox copy!!

I apologize for my forcefulness, but I’m really angry and disappointed!

I agree with you that the libraries are being flooded by lessons copied over from Librivox etc. I also think that some sort of quality control might be appropriate. I would think it a great pity if you were not to continue your work for the LingQ library. Your material is of an outstanding quality and we would all be the poorer for being without it. I do know how time-consuming it is to create new content (and I don’t even provide video material) and you have worked wonders for the Spanish library, just as Berta and Oscar have been doing with their podcasts. The ‘exclusive’ tag seems more and more sensible to establish. (Although I don’t know how much work it would be for LingQ to create a parallel library, so to speak.) Please know that your efforts are GREATLY appreciated!

After looking at the Friends page and seeing 5 pages full of lessons being shared by the same member, the necessity of some moderation seems much more clear.

We’ve been talking behind the scenes about the idea that Vera proposed (LingQ Exclusive) and are very interested in implementing something like that. We don’t want to hinder the flow of new content into the library, but we definitely value original content over any other type of content, and we’ll make sure to highlight this as we work towards a solution for this problem.

In the meantime, I know that people who learn from your content will continue to do so. They won’t replace your original and genuine lessons with audiobooks, you know that. Your content is very valuable in our eyes, and we will make sure that content like yours, like Vera’s, and others on the site is recognized beyond the other unoriginal content that is in the Library.

There are no regulations on sharing materials from Librivox, so you cannot blame anyone for doing that. But I hope we will not see that “Bad money drives out good if their exchange rate is set by law.”

We do have an “exclusive” identifier in the works and a way to feature this content in the Library. At the same time, we recognize that there has been a flood of Librivox content into the Library recently. As tora3 says, you can’t blame the provider and there are people who do like this kind of content. However, we do value the LingQ original content most of all. Give us some time to decide how to deal with this issue. We do agree that audiobooks recorded by non-native speakers are not desirable for most. We will be improving the library with an Exclusive shelf and ratings over the next little while. Let us see how this improves things and then we can look at what else we can do.

I agree that something needs to be done and it will be done.

I also do not want non-native recordings in the library.

Yet there are people who enjoy these audio books, so a way will be found to give proper prominence to our own member-created content, while continuing to enable members to upload audio books (read by native speakers).

We have a few ideas that should deal with this problem. Unfortunately the wheels turn slowly at LingQ HQ because once we have programming resources focused on one area we do not like to pull them off and direct them to other issues.

But let us see what can be done as soon as possible.

Hi Mark, hi Alex, I know it is not your fault. But you are the ones who are able to solve this problem. We cannot do this.

Thank you Albert for bringing this issue up again. I thought I was the only one noticing this, but I didn’t want to be the only one complaining about it. I didn’t want to be known as the one who is always complaining. We have had this problem with content from VOA for a year (maybe longer), and now with Librivox in a much more aggressive way. Recently content for languages other than English has been uploaded from Librivox. This user has started to add content to the German library as well. The idea of no longer creating my own content and no longer transcribing podcasts came to me as well.

Adding so much content from Librivox from one user brings a lot of problems with it. One problem is that the library will end up in a mess. I’m glad that you picked up my idea of the “exclusive” attribute. The “exclusive” attribute will help at this point. Preferably I would love to have another attribute: “Content not from VOA or Librivox” because I think we have some excellent content not from Librivox and not from VOA that should be easy to find. (for example podcasts that I’ve transcribed by hand!)

Another problem, especially for German (maybe for other languages too), is the spelling reforms that we had. Because of copyright laws, content from Librivox is more than 70 years old. The spelling is quite different from today’s spelling. I once uploaded content from Librivox to the German library, but it was a lot of work to correct the text. And I added some remarks about “dead” words (words that are no longer in use). I’m a German editor but I’m not willing to correct content added in such a way.

It could also turn into a problem that people are no longer willing to invest time to create new material or to find material on the web. OK, you find Librivox on the web too, but finding other sources is quite difficult, and you have to email the provider.

What I did was transcribe a lot of podcasts, because texts are seldom provided for German podcasts. Maybe you can imagine how much work this is.

I think someone here is misusing the system. The “exclusive” attribute that I suggested will solve part of the problem. I fear that to stop this you have to rethink the reward system. One idea is not only to count how often a lesson is used. You should rate the quality and quantity too.

For example:
1 value point for Content which is uploaded by “Copy&paste” from VOA or Librivox.
2 value points for Content which is uploaded by “Copy&paste” from other sources (more difficult to find and asking the provider for permission) with spelling correction etc.
3 value points for Content shorter than 1 minute which is transcribed by hand or recorded by a member
4 value points for Content longer than 1 minute and shorter than 10 minutes which is transcribed by hand or recorded by a member
5 value points for Content longer than 10 minutes which is transcribed by hand or recorded by a member
6 value points for Content shorter than 1 minute which is created by a member
7 value points for Content longer than 1 minute and shorter than 3 minutes which is created by a member
8 value points for Content longer than 3 minutes and shorter than 6 minutes which is created by a member
9 value points for Content longer than 6 minutes and shorter than 10 minutes which is created by a member
10 value points for Content longer than 10 minutes which is created by a member

Multiply the value points by the count of how often the content is used, and use the result to determine how points are given to the providers of content.
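The proposal above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how LingQ calculates points: the names `Lesson`, `value_points` and `provider_score` are hypothetical, the usage numbers in the example are invented, and a duration that falls exactly on a boundary (for example exactly 1 minute) is assumed to count in the higher tier, since the list does not say.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical labels for the four kinds of content in the proposal.
COPIED_VOA_LIBRIVOX = "copied_voa_librivox"   # copy & paste from VOA or Librivox
COPIED_OTHER = "copied_other"                 # copy & paste from other sources
TRANSCRIBED = "transcribed_or_recorded"       # transcribed by hand or recorded by a member
CREATED = "created"                           # created by a member


@dataclass
class Lesson:
    kind: str          # one of the four labels above
    minutes: float     # audio length in minutes
    times_used: int    # how often the lesson has been used


def value_points(kind: str, minutes: float) -> int:
    """Return the value points (1 to 10) for one lesson."""
    if kind == COPIED_VOA_LIBRIVOX:
        return 1
    if kind == COPIED_OTHER:
        return 2
    if kind == TRANSCRIBED:
        # 3 points under 1 min, 4 from 1 to 10 min, 5 above 10 min.
        # Assumption: exact boundary values go to the higher tier.
        return 3 + bisect_right([1, 10], minutes)
    if kind == CREATED:
        # 6 to 10 points with boundaries at 1, 3, 6 and 10 minutes.
        return 6 + bisect_right([1, 3, 6, 10], minutes)
    raise ValueError(f"unknown kind of content: {kind!r}")


def provider_score(lessons: list[Lesson]) -> int:
    """Sum of value points multiplied by the usage count of each lesson."""
    return sum(value_points(l.kind, l.minutes) * l.times_used for l in lessons)


# Invented example: many copied lessons against a few original ones.
copied = [Lesson(COPIED_VOA_LIBRIVOX, 1.5, 2) for _ in range(180)]
original = [Lesson(CREATED, 5.0, 20) for _ in range(10)]
print(provider_score(copied), provider_score(original))
```

With these invented usage numbers, 180 copied lessons used twice each would earn fewer points than 10 original five-minute lessons used 20 times each, which is the effect the weighting is meant to have.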

Sounds complicated? Yes, I’m a software developer and I can imagine that this is not easy to do. But such a system would be fairer. And people like a system that sounds fair. I know that it is not easy to do. Maybe in your mind it is not worth doing. But maybe members are thinking that creating content is no longer worth it. And that would be the worst case. The library was the point that made me stay at LingQ when I found it! It is very important to make new members stay because it makes (it made) so obvious how great and outstanding LingQ is.

Other ideas are welcome! Sorry for the long post.

Although I am not the member you are talking about, because it would be impossible for me to add so many lessons, I feel a bit guilty for uploading an Italian audiobook from Librivox right now. I thought it would be interesting and useful for the Italian learners, considering it’s not the best-known Italian novel. Should I stop uploading them?
Sure, it takes less time to upload a lesson from Librivox than writing and recording lessons on your own, but my aim is clearly not to flood the Italian library (quite the opposite, to enrich it).
I am in favour of an “exclusive” label and I find Vera’s suggestions interesting.

No mikebond, please continue to provide content of interest to our members. We will find a solution to the problem of how to feature our exclusive content.

Many good ideas, Vera. Certainly there are issues here. Uploading a bunch of one-minute content items from Librivox should not be worth the same as creating 5 minutes of original content, or transcribing longer items.

The question is should we reward people based on the effort involved, or on the degree of interest of the content. Not all member created content is more valuable than content that is simply copied from elsewhere.

A grading system with comments will also help, and that is something that we also want to get to.

It is a complicated issue. We will probably be able to offer the exclusive shelf, and a few other steps to prevent the flooding of certain shelves in the library first. Differential points is a little more complicated.

For the time being, I have contacted Mikola and suggested that he comment here.

I have also asked him to change the level to Intermediate.

“Far From the Madding Crowd” by Thomas Hardy - LingQ Language Library
We can read Thomas Hardy’s novels here on LingQ. I love reading his sad stories.

@Vera
Even if the content is from other sources, it can be tedious work: improving the audio, checking or even creating the script and maybe a translation. The multipliers of 2 to 5 do not come close to accounting for the work involved. And nobody can tell you the exact amount of work within an upload, so your approach is more than doubtful…

@ Vera: This is off-topic, but arising from your post. Great comments!

There’s one thing I don’t agree with in your reply to Mark and Alex:

In my opinion we should not ‘correct’ old-style spelling conventions - these books were published at a time when other rules ‘ruled’. They provide a look into the history of the written word. I know that a less severe reform has also taken place in Spanish and I cannot imagine anyone wanting to ‘correct’ their old orthography in the respective books.

(I am not saying we should go back to the first edition’s spelling, but rather accept the norm of the edition we are reading/importing.)

Hi, I thought that sharing books this way in a library was a good thing. But it backfired. Anyway, I back off. I am really sorry and apologize for the inconvenience.

2alsuvi
I do apologize to you. It is not correct what you have written about my motives. On one hand I have to learn English in order to immigrate to Canada; on the other hand, it is NOT my main motivation for learning the language. By the same token, I do like to get points for my shared content (my own lessons in the Russian library as well as Librivox recordings), but it is not my main motivation for sharing those lessons. Again, Alsuvi, I hope that you will continue creating Spanish lessons for the LingQ library. I had to stop learning Spanish (because I have to study for IELTS), but I find content created by members quite valuable (for example, Berta’s content). It is a bad thing that you wrote that stuff without personally knowing me. Should I visit Spain next year, I’ll be glad to meet you.
I will write to Mark later today and ask him to exclude me from the Sharing Points Award. I value people above everything else. And, again, Alsuvi, I value what you have done/do (even though I do not know who you are and only base my opinion on your activities at LingQ), and I hope you will continue. I do not want any fights over points, and you made an important point.

I hope Mark and Steve will find a way to resolve this issue and Vera made some valuable comments.

@tora3: No question that some of the books are interesting. That is not the reason for the complaint.

@mikebond and @hape: If someone uploads 1, 2, 10 or 20 books, and does this carefully (improving the audio, checking or even creating the script and maybe a translation), it is something else. I uploaded one novel from Librivox on my own. We are speaking about someone who uploaded more than 6,000 lessons in 2 months! Nobody can tell me that this is done “carefully”.

Please do not make the mistake of distracting from the main problem here: a mess in the library, and discouraging members who work really hard on creating lessons on their own and transcribing by hand. I’ve done all these kinds of things, and please believe me that it is a huge difference even if you do the import carefully.

@SanneT: I think students want to learn proper spelling. It can be very embarrassing if you have to write something and you make a lot of spelling errors. Don’t forget that students learn German because they need it for school or for their job. If someone is a ‘linguist’ in an academic way, he or she will know that there are differences and will look for an original version. We are not university professors studying old-fashioned language at LingQ, and correcting the spelling does not hurt the sense or the beauty of a novel.

Mikola,

We do not want to discourage people from providing content. What you have done is of great worth to our community. We just have to find a way to balance everyone’s interests and concerns. That is our job at LingQ. You have not done anything wrong.

As I said on your wall, I would appreciate if you adjusted the level of this content, though!

@mikola: Thank you for commenting here. We wrote at the same time. It is great that you create your own content. This is the best way to help LingQ and to make LingQ attractive.

I agree with Sanne. I think that most learners do not much notice these spelling differences. Certainly I don’t. I have old German books and new German books and have yet to notice any difference except for the funny looking backward “b” which really is double “s” and does not show up any more, and I kind of got used to it, whether it is there or not.

No big deal. All learners need to be able to deal with variations, imperfections and whatever else the language, even from native speakers, can throw at us as learners, IMHO. It is like walking in the forest. It is better to have the odd root to trip over, and twig that slaps you in the face. It makes you more alert.

I also agree with Susanne about spelling. All the audiobooks I have uploaded and am uploading feature some Italian forms that are no longer used or usable today, but I wouldn’t dare replace them with contemporary forms.