edit to clarify a misconception in the comments: this is an Instagram post, so “caption” refers to the description under the image or video

as an example, this text i am typing now is also a “caption”

just saying because someone started a debate misunderstanding this to be about subtitles (aka “closed captions”) and that’s just not the case 👍

  • oplkill@lemmy.world · 5 days ago

    Nope, they're still not good. I use YouTube's auto-generated subs and they 100% need an LLM to fix mistakes.

    • AnarchoEngineer@lemmy.dbzer0.com · 5 days ago

      Large language models are designed to generate text based on previous text. Converting audio to text can also be done with a neural net, but that net isn't a large language model.

      Now, you could combine the two to, say, reduce errors on mumbled words by having a generative model predict which words would fit better in the unclear sentence. However, you could likely get away with a much smaller and faster net than an LLM; in fact, you might be able to get away with plain-Jane Markov chains, no machine learning necessary.
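      As a minimal sketch of that Markov-chain idea (the toy corpus, function names, and candidate words below are all illustrative, not from any real transcription system): count word bigrams from prior text, then pick whichever acoustic guess occurs most often after the preceding word.

      ```python
      from collections import Counter, defaultdict

      def train_bigrams(corpus):
          """Count word bigrams from a list of sentences (whitespace-tokenized)."""
          counts = defaultdict(Counter)
          for sentence in corpus:
              words = sentence.lower().split()
              for prev, nxt in zip(words, words[1:]):
                  counts[prev][nxt] += 1
          return counts

      def best_fit(counts, prev_word, candidates):
          """Pick the candidate word seen most often after prev_word."""
          return max(candidates, key=lambda w: counts[prev_word][w])

      # Tiny toy corpus standing in for earlier transcript text.
      corpus = [
          "turn the volume up",
          "turn the page over",
          "turn the volume down",
      ]
      counts = train_bigrams(corpus)

      # Suppose the recognizer is unsure about the word after "the" and its
      # top acoustic guesses are "vacuum" and "volume":
      print(best_fit(counts, "the", ["vacuum", "volume"]))  # -> volume
      ```

      A real system would use longer context and smoothed probabilities, but the lookup itself is just a table, which is the sense in which no machine learning is necessary.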

      Point is that there is a difference between LLMs and other neural nets that produce text.

      In the case of audio-to-text transcription, using an LLM would be very inefficient and slow (possibly to the point that it couldn't keep up with the audio at all), while a very basic text-generation net, or even just a probabilistic algorithm, would likely do the job just fine.

    • Ziglin (it/they)@lemmy.world · 5 days ago

      How would an LLM fix a mistake equivalent to something being misheard? I feel like you're misunderstanding something, and could probably also use some help with your English.