As I said, I’m not even sure about point 3 myself. The idea is definitely welcome, since the more the better. The 1-minute format is surely very beginner level; 3 minutes would be interesting too.
Great that you’ve already fixed the text/audio match.
Now, point 1. Let’s see if I can explain myself better.
If you are a photographer, “noise” means something specific, right? It’s mostly the grain that comes from the sensor, or from shooting at night, and so on. And photographers use a denoise tool to reduce it in post-production.
With audio recording, “background noise” means something specific too. I don’t mean sounds like cars, breathing, etc., but the sound of the noise itself. I’m not sure if it’s called “white noise”.
If you look online, maybe in your own language, for what background noise means in audio, you’ll find more precise explanations than what I can give you. And also solutions.
That noise comes from the equipment you’re using, from the place where you are recording, or from both. From listening to it, it could be your equipment. Maybe you’re using a camera with a bad pre-amp, or it could be the wrong microphone, or you might need a better amplifier, or you could just fix it with software. It could even be a silly thing that you can fix for 20€, but it’s worth it.
By “sound breaks” or gaps I mean this. You can really hear them in “Orange sind dumm”; that is where they are most evident.
Listen to your recording in between the sentences.
For example, in between “… Orange sein” BREAK “Ich gehe in…”
In this recording it’s more evident because there is this GAP between almost every sentence. Basically it’s exactly the opposite of what I said before. You can clearly hear that there is no “white noise” background between when you stop talking and when you start talking again. There is nothing, emptiness, the vacuum of space.
This is probably due to how your equipment is recording, some setting or feature, but I’m not an expert on this stuff so I can’t point you in the right direction.
But the background “white noise” (reduced) needs to stay the same throughout all your talking, without those distracting empty gaps. In “Orange sind dumm” it’s more evident. You haven’t recorded them all in the same way; probably in this one you recorded the sentences separately from each other and then put them together via software. No idea.
But I hope you get the point, or at least the general idea.