
Conversation


@mayeaux (Contributor) commented Apr 24, 2023

Doesn't seem to work for me, I tried it with a movie that starts with music and it still fell into a bad hallucination/failure loop 🤷‍♂️

@Purfview (Contributor) commented Sep 1, 2023

Works well, and users have reported a positive experience with this PR.

@realtechspecs

> Doesn't seem to work for me, I tried it with a movie that starts with music and it still fell into a bad hallucination/failure loop 🤷‍♂️

Can you share the movie you tried it with?

@doko-desuka commented Jun 27, 2025

A different style for that block but with the same behavior:

                max_repetitions = 1
                stripped_text = text.strip()
                # Keep this segment's tokens only if the same text appears fewer than
                # `max_repetitions` times among the last `max_repetitions` accumulated texts.
                recent_prompt_text = all_prompt_text[-max_repetitions:]
                if recent_prompt_text.count(stripped_text) < max_repetitions:
                    all_tokens.extend(tokens)
                    all_prompt_text.append(stripped_text)

...and max_repetitions could be made into a new transcription setting, for example.
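To illustrate, here is a self-contained sketch of that check with max_repetitions as a parameter (the accumulate_segment name and the standalone function shape are hypothetical, not part of this PR; in the actual transcription loop it would simply replace the accumulation block above):

    from typing import List

    def accumulate_segment(
        tokens: List[int],
        text: str,
        all_tokens: List[int],
        all_prompt_text: List[str],
        max_repetitions: int = 1,
    ) -> bool:
        # Keep the segment only if its text appears fewer than `max_repetitions`
        # times among the last `max_repetitions` accumulated segment texts.
        stripped_text = text.strip()
        recent_prompt_text = all_prompt_text[-max_repetitions:]
        if recent_prompt_text.count(stripped_text) < max_repetitions:
            all_tokens.extend(tokens)
            all_prompt_text.append(stripped_text)
            return True
        return False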

But if someone only needs to check the latest text (i.e. max_repetitions=1), it can be reduced to this:

        last_decoded_text = None

        (...)

                # Don't accumulate tokens if they form the same text as the previous segment.
                stripped_text = text.strip()
                if last_decoded_text != stripped_text:
                    all_tokens.extend(tokens)
                    last_decoded_text = stripped_text

At first I thought what you wanted was already addressed by the repetition_penalty setting. But according to this great comment, that setting works on all tokens at once: it discourages the reuse of tokens even if the sentence is completely different. The (CPU) implementation is this, btw.
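For contrast, a token-level repetition penalty typically looks roughly like this (a simplified sketch, not the actual CTranslate2 code): every token id that has already been generated gets penalized, no matter which sentence it would be used in next.

    import numpy as np

    def apply_repetition_penalty(logits: np.ndarray, previous_token_ids: set, penalty: float) -> np.ndarray:
        # Every token id already generated gets its logit pushed down: positive
        # logits are divided by `penalty`, negative ones multiplied, so the token
        # becomes less likely regardless of the sentence it would appear in.
        penalized = logits.copy()
        for token_id in previous_token_ids:
            if penalized[token_id] > 0:
                penalized[token_id] /= penalty
            else:
                penalized[token_id] *= penalty
        return penalized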

So what you're doing is another type of penalty: discouraging repeated sentences/segment texts (formed by many tokens) rather than penalizing individual tokens. Very useful.

Edit: I guess it's kind of like a dynamic no_repeat_ngram_size: instead of avoiding ngrams of a specific size, you avoid anything that repeats.
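A toy run of the segment-level check, just to make that concrete (the segment texts are made up, not real model output):

    # Made-up decoded segment texts, e.g. a model stuck on background music.
    segments = ["la la la", "la la la", "la la la", "Thanks for watching!"]

    last_decoded_text = None
    kept = []
    for text in segments:
        stripped_text = text.strip()
        # Skip any segment whose text is identical to the previously kept one.
        if last_decoded_text != stripped_text:
            kept.append(stripped_text)
            last_decoded_text = stripped_text

    print(kept)  # ['la la la', 'Thanks for watching!']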
