Native Podcast Transcripts Project

I would like to propose a project to get native-language podcasts transcribed and put onto LingQ.

The outline of the proposal is to gather a group of interested learners (for each target language) who pool a weekly fund of around $2-5 each. Each week, or other agreed time period, one group member selects a native podcast of their choice to be transcribed, with group members taking turns.

The transcription work would be advertised on a site like elance/upwork, where regular transcribers can bid for the weekly podcast transcription work (after a while, a core group of transcribers could be used for each language). Typically the cost is $1/min of podcast for a quality transcription.

So the weekly cost is around $30 for a 30 minute podcast. If this is split amongst 6 or more people, the cost is around $5 a week each, or less with more members (or shorter podcasts).
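To make the arithmetic easy to play with, here is a minimal sketch in Python. The function name `weekly_share` is just something made up for illustration, and the default rate of $1/min is the typical figure quoted above, not a guaranteed price.

```python
def weekly_share(podcast_minutes, members, rate_per_minute=1.0):
    """Return each member's share (in dollars) of one week's transcription cost.

    rate_per_minute defaults to the roughly $1/min quoted above; actual
    bids will vary, so treat it as an assumption.
    """
    if members <= 0:
        raise ValueError("members must be positive")
    total_cost = podcast_minutes * rate_per_minute
    return total_cost / members


# Compare a few group sizes for a 30 minute podcast.
for members in (6, 8, 10):
    print(f"{members} members: ${weekly_share(30, members):.2f} each per week")
```

A bigger group or a shorter podcast brings the weekly share down, which is why 6 members is suggested as a minimum rather than a target.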

In addition, the original podcast provider can be asked for permission to upload the podcast onto LingQ. This would then serve the wider community.

The advantage for serious learners participating (and paying) in the podcast selection group would be (relatively) cheap podcast transcripts. They would be able to access all of the transcripts (even the ones where permission to make them available on LingQ is not given), and would get a diverse and interesting range of current podcasts and transcripts in their target language.

As a first step, I am looking for interest from potential group participants. If you are interested, just post below that you would like to join, and which language you would like native podcast transcripts for.

From there, various language groups can form and organise amongst themselves.

Some basic rules/process would be:

  1. Post below, notifying your interest, and your target language.
  2. When 6 or more people have signed up for a target language, commence a transcript group amongst yourselves.
  3. Have the first person nominate a preferred podcast (minimum 20 minutes in length). Then take turns every week, or other agreed time period. A good way to find podcasts is to go to itunes, change your country location to one where your target language is spoken, click on podcasts, and then look at the new and popular native language podcasts.
  4. Advertise the podcast transcript work on elance (or similar site - even interested transcribers from lingq could be used). A sample advert is as follows:

"LanguageX - Transcribe Podcast

I’m learning LanguageX and I’d like you to transcribe one episode of an itunes podcast for me - the episode is 30 minutes long.

The episode link is here: xxxx. [click on “get link” in itunes for the podcast link]

I need the entire audio transcribed, in LanguageX, using Microsoft Word. No translation is required.

Please submit your proposed price for the complete job."

  5. Each group member advertises, and pays for, their own transcript and distributes it to the group when completed (typically it takes a day or two for bids to be received, and another day for a podcast to be transcribed). Where permission to upload onto lingq is received, the podcast and transcript can be uploaded.

  6. If you want longer podcasts to be transcribed, one idea would be to break them into two parts. For example, for a 50 minute long podcast - advertise the first 25 mins, and then when your turn next arises, advertise the second 25 minutes.

  7. There is an element of trust required with this process. Namely, that each group member will honor their turn, and the process. If a dispute arises, or a group member drops out - then the worst that will happen is that you have paid for a transcript and shared it with someone who has not returned the favour. Not the end of the world, but I’m guessing things like this, or similar, could happen. However, there is a pretty strong community here, and I hope there would not be too much of a problem.
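For anyone who wants to plan a group's schedule, the turn-taking and the idea of splitting a longer podcast across turns could be sketched roughly like this in Python. The names `rotation` and `split_into_parts` are made up for illustration, and the 30 minute cap per part is only an assumption based on the example lengths above.

```python
from itertools import cycle


def rotation(members, weeks):
    """Return (week_number, member) pairs, cycling through members in order."""
    turn = cycle(members)
    return [(week, next(turn)) for week in range(1, weeks + 1)]


def split_into_parts(total_minutes, max_part_minutes=30):
    """Split a podcast into roughly equal parts of at most max_part_minutes.

    max_part_minutes=30 is an assumed cap, not a rule of the proposal.
    """
    parts = -(-total_minutes // max_part_minutes)  # ceiling division
    base, extra = divmod(total_minutes, parts)
    # The first `extra` parts take one additional minute each.
    return [base + 1 if i < extra else base for i in range(parts)]


# Hypothetical three-person group over four weeks.
for week, member in rotation(["A", "B", "C"], 4):
    print(f"Week {week}: {member} chooses and advertises a podcast")

# The 50 minute example from above becomes two 25 minute jobs.
print(split_into_parts(50))
```

Each part of a split podcast would then be advertised on one of that member's own turns, so the weekly cost for the group stays about the same.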

A bigger issue may well be whether enough people will be interested in doing this.

I hope to hear from you, if you are interested.

EDIT UPDATE 2015: - these courses are an example of an end result of this process -

Iain, thanks for this. I fully support the idea and want to think more about what I can do to help. Meanwhile, just commenting here keeps the idea prominent on the forum.

@iaing
I sometimes transcribe Portuguese podcasts and share them as public in the Portuguese Library here at Lingq. I don’t do it more often because it is too time consuming and there is no reward.
I’m not sure I completely understood your proposal, but how different is it from using the Exchange and requesting a transcription?

Hello mfr, the key additional difference is probably around the reward and financing structure.

Groups could form together to pool money/points/resources to finance having transcripts done, thus minimizing the individual cost. So, rather than having one person requesting and paying for a transcript to be done, groups could form to reduce/share the costs.

Additionally, rather than having one person providing transcripts without much in return, the transcriber would be paid for their efforts. There are a number of people who generously give recordings and transcripts to lingq, but this model is limited to the few “saints” that do this.

Another possible difference is that, by forming into groups, people would get transcripts on a broad range of interests. For example, Group member A may be interested in a podcast on pop culture, whilst Group member B may be interested in a comedy or science podcast. The group structure may also create a “serious learner group” effect whereby group members can encourage each other and share further within the language.

A further possible difference is that this could be applied to any language. Currently, some languages are well supported with content, whereas the content in other languages is quite lacking. This isn’t an issue specific to lingq. It is quite hard to obtain a wide variety of native podcast transcripts for many languages, in general. Native speakers have no need for them, and it is mostly serious language learners that understand the value of having transcripts for everyday-spoken type material.

This is just a proposal to bridge that content gap, in a way that minimizes the cost imposed on a single learner, whilst providing additional reward to the transcript provider.

Like Fernanda, I’ve done transcriptions of podcasts, and not only a few, I’ve done a lot of them. My personal experience is that they are not taken as often as I expected. I absolutely agree with her that there is no reward in doing it.

So I like this idea of sharing the costs for a transcript among interested users :) Then I would know that there is real interest in a topic, and that it is worth spending the time working on a transcript.

It is definitely not easy to implement, and you have to think a lot about it. Two points from my side:

  1. At the moment the rate for transcriptions that LingQ suggests on the Exchange is very low considering the time you have to put in. For all the other requests the LingQ suggestion is fine, but for transcriptions it is definitely much too low. That is why I’ve added a note on my profile: “For transcriptions I expect 250 points per minute, because the LingQ suggestion is too low.” Transcriptions are very time consuming, especially if it is a podcast or a discussion with lots of text. From my experience I need 7 minutes for one minute of German audio. Then I listen to it again and make corrections before I upload the text.

  2. Don’t forget to consider that these lessons have to stay ‘private’ if the source is copyrighted material.

@iaing
Thanks for clarifying your project.

It would be interesting if we could find the two groups, I mean learners interested in podcasts and content providers here among Lingq members.

Let’s wait and see if more people find your project feasible.

By the way, I have 37 podcast transcriptions. Does that make a “saint” out of me? :)

@Fernanda: I don’t know. I’ve done more than 500 podcast transcripts ;)

@Vera
In that case you are a super saint!
:))

Iain, I would like to support this initiative. If you can get permission from a podcaster, and if the sound quality is good, and if the content is interesting, we will pay for the transcript here at LingQ, and upload the lessons ourselves, otherwise I fear this initiative will die.

It suddenly dawned on me that I have already paid for transcripts for Korean and Romanian podcasts, which you can find in the library, so why not for other languages.

There are some issues. Typically, most learners don’t take advanced lessons, so the number of uses for these advanced podcasts is usually small, as Vera points out. It would be great if we can find intermediate level content as well as advanced content.

Another problem is our library, and how to make it easier to find things there. Currently our programmers are working on other issues, but I hope we can get to the library soon. Trying to come up with a system that makes things easy to find is complex. We have beginners, intermediate and advanced users; people who want to be told what to study and people who have specific interests. We have too much content in some languages, including poor quality content. I am reluctant to cull lessons that people have put a lot of effort into creating. I welcome any suggestions on what to do in the library, although we will not be doing anything in the immediate future.

For those, like Fernanda, and especially Vera, who have already put a great deal of effort into obtaining permission from podcasters and then transcribing them, it might seem unfair that we would now start paying for transcriptions. I can only say that LingQ has many more members now, and that means that circumstances have changed from even one or two years ago.

I would kill to have current Russian Podcasts with transcripts that run in the 30-60 minute range and I would definitely pay more for that. The majority of listening I do is with Russian Radio (KP, Vesti, Svoboda, etc). There is something about listening to current topics on the radio that is far more interesting to me than listening to lessons. My Russian ability probably suffers because I spend too much time just listening to the radio over directed studying.

There is some stuff on the radio that I find relatively easy and some stuff I find ridiculously hard - especially news read blazing fast. Either way, I am constantly thinking, “I wish I had a transcript for that so I could really work through and understand what they said.” I just went through the 23 LingQ Voice of America lessons and loved them to the point that I went to VoA directly to see if they had any podcasts with transcripts, which they didn’t.

I find the content for Russian on LingQ to be generally awesome and there is always material to be found. But there is something about current radio podcasts that truly appeals to me.

My 2 cents :)

David, do you use the material from Echo Moskvi? There is quite a variety of interviews there, all with audio and text, representing different points of view. For example, this morning you have an interview with a man in Moscow who is involved in organizing Russian volunteer soldiers to go to Eastern Ukraine,
http://www.echo.msk.ru/programs/razvorot/1337058-echo/

and another of a Russian journalist who follows the Ukrainian army in Eastern Ukraine.
http://www.echo.msk.ru/programs/razvorot/1337056-echo/

We are regularly uploading interviews from Echo, since we have permission to use them. If you see interviews that interest you, let us know and we will try to upload them.

Hi Steve,

Wow, I haven’t seen those. I will definitely take a look. I think you alluded somewhere that it is difficult to search the library (especially in Russian where there is so much content). I was shocked when I found the VoA stuff and saw that it was loaded in 2009 since I have never seen them before in Lingq.

I am also surprised that I haven’t come across the Echo Moskvi stuff on the net before because I have searched often on the net for Russian Podcasts with transcripts.

Thanks for the info!!

@iaing

Wow! Did you ask Chinese native members to do this on LingQ Chinese forum?

@Steve

If I ask on the LingQ forum whether anyone would like transcripts of my favorite podcast, and get enough people, is it possible for LingQ to consider making the content and putting it in its library?

You have to ask the provider of the podcast if they give you permission even if you (or someone else) did the transcript.

You can add any content to our library at LingQ or we can do it for you. The only thing is that you need the permission of the person who made the podcast, as Vera says. However, you can have these podcasts transcribed, import them and use them for your own use, privately, without limit. You just can’t put them into the library without permission.

Maybe learners at LingQ are willing to do transcripts as a common project, but in any case permission from the podcaster is necessary before putting anything in the library. Cheers.

Well, I can record podcasts in Brazilian Portuguese for free if someone gives me a text to do it. The only problem is my poor microphone, which changes my voice, but it is not a big problem anyway.

The idea with this thread is to identify people wishing to form groups in target languages - to take native, natural podcasts and create transcripts, for the users in each group.

It wasn’t to take non-natural transcripts and create a non-natural podcast. I’d suggest, perhaps, that should be for another thread discussion.

Where permission isn’t received, the group can still use and share the created transcripts, it just wouldn’t go back through LingQ in any way, or, at least, not through the LingQ publicly available library.

One of my real goals with this process is to just create groups of people interested in creating and sharing transcripts for cool podcasts in their target languages…and then have this spin off into chats and discussions about learning the target language, what podcasts are cool, what difficulties are being seen by members in the group, and also, have the group users support each other around all this etc.

As an aside, another goal is to put the focus back on quality content in the library. Which is LingQ’s biggest achilles heel, imo.

That is to say, LingQ is too much focused on the “system”, and not focused enough on getting quality content into the libraries, in my opinion.

If people wish to be involved in this and create something similar to this example course here: Login - LingQ , then just post in this thread that you are interested in forming a group and what your target language is - when 6 or more people are available just form a group and “just do it” (as the ad says).

"@iaing

Wow! Did you ask Chinese native members to do this on LingQ Chinese forum?"

@yukiko

No, I didn’t. My thinking was that the most cost effective way to produce transcripts is to use good transcribers (who have software and hardware to enable quick transcribing - slowing down audio, an easy means to pause and replay without taking hands off the keyboard etc). Most LingQ members aren’t professional transcribers, and very few, if any, exist in LingQ’s Chinese user base who would be prepared to do this cost effectively.

As mentioned, transcribing could be done with LingQ users, and through the exchange, but issues with the exchange etc have been hashed out elsewhere, and LingQ just isn’t listening carefully to reasonable arguments here, imo.

Yes, I was aware of the permission issue from podcasters.

My question was:

If you can get permission from a podcaster, and if the sound quality is good, and if the content is interesting, we will pay for the transcript here at LingQ,

How many LingQ members would you need for LingQ to pay for the transcripts?

As far as I know LingQ will not pay for the content transcripts. But if YOU want to share it publicly you need the permission. The one who shared the lesson can earn points.

But I warn you. The amount of points that you earn is not very high, and it varies because it is related to the number of points that were not used within 3 months. At the moment the rate is usually between 0.5 and 0.7 points per taken lesson.

In my experience most users at higher levels prefer importing their own content (privately), and don’t use the lessons for advanced or high intermediate learners. So the effort to do this never pays off. That is why I stopped doing this: why should I spend hours and hours if fewer than 50 or 100 people are interested in these lessons?