How Does ChatGPT Perform With Long Translations?

davideroccato · August 8, 2023, 8:25pm

I’m curious to know whether you have tried translating long articles, long paragraphs, or parts of books with ChatGPT or similar AI-based software, and then taken the time to read through the results and check the details.

How do they perform? Are they reliable? Would you publish those translations?
What are the main problems you have faced?

Would you trust it to translate an entire email, for example, into a language you don’t know, and then send it? Let’s say into one of the most common languages that these tools can handle.

Any feedback is welcome, thanks.

jonas_fkall · August 8, 2023, 11:48pm

GPT is a pre-trained transformer model, which means it has learned from a large amount of data and can usually respond with good accuracy. However, the more niche the language you ask it to translate, the greater the chance that ChatGPT generates an inaccurate or made-up answer (what we call a hallucination). I’d first try to train the AI (we can do that) with reliable translated text in the language you want; if you have more than one such text, even better, because that gives ChatGPT an extra layer of data to work with. Also, I’d double-check the translation with Bard (Google’s AI chatbot). I hope that helps.
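
If full training is more than you need, one lightweight approximation is to put reliable translated pairs directly into the chat messages as examples. The sketch below assumes the chat message format with system, user, and assistant roles; `build_few_shot_messages` and its parameters are hypothetical names, and this is in-context prompting rather than actual training.

```python
from typing import Dict, List, Tuple


def build_few_shot_messages(
    text: str,
    source_language: str,
    target_language: str,
    examples: List[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Build a chat message list that shows example translations before the real request."""
    messages = [
        {
            "role": "system",
            "content": (
                f"You translate {source_language} into {target_language}. "
                "Follow the style of the example translations."
            ),
        }
    ]
    # Each trusted (source, translation) pair becomes a user/assistant exchange.
    for source, translation in examples:
        messages.append({"role": "user", "content": source})
        messages.append({"role": "assistant", "content": translation})
    # The text to translate goes last, as a new user message.
    messages.append({"role": "user", "content": text})
    return messages
```

The returned list could be passed as the `messages` argument of a chat completion call. How much the examples help will depend on the language pair and on how close the examples are to the text being translated.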

roosterburton · August 9, 2023, 12:50am

GPT-4 does all my translations for shared lessons these days. It’s also very good at defining individual words, sometimes even better than what you would find on the Wiki or in dictionaries.

What you can do to verify a result is translate the text to your native language and then translate it back to the original language using the same process.
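
A rough sketch of that round-trip check, assuming you already have some `translate(text, from_language, to_language)` function to plug in (the function name, its argument order, and `round_trip_score` are hypothetical). It compares the original with the back-translation using Python's standard `difflib`. A low score is a hint that something was lost; a high score does not prove the translation is good.

```python
from difflib import SequenceMatcher
from typing import Callable


def round_trip_score(
    text: str,
    translate: Callable[[str, str, str], str],
    source_language: str,
    target_language: str,
) -> float:
    """Translate to the target language and back, then return a 0..1 similarity score."""
    forward = translate(text, source_language, target_language)
    back = translate(forward, target_language, source_language)
    # Character-level similarity between the original and the back-translation.
    return SequenceMatcher(None, text.lower(), back.lower()).ratio()
```

Different wording can still carry the same meaning, so any threshold on this score is a judgment call and a human check remains useful for anything you publish.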

As part of the generation process, you can check and refine the responses so that hallucinations are almost always caught.

Here is a stripped-down script that retries the request if GPT returns English when another language was requested.

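import time  # used for the pauses between retries

import openai  # GPT (this script uses the pre-1.0 openai package interface)
from langdetect import detect  # language detection module

openai.api_key = "yourapikey"  # get from their website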

# Function to translate a text
def translate_text(text, input_language, target_language, translation_type, model='gpt-3.5-turbo-16k', temperature=0.5, max_tokens=8000):
    # Forming the translation prompt
    if translation_type == 'Literal':
        prompt = f'Translate the following {input_language} text to {target_language} literally: "{text}"'
    else: # 'derived'
        prompt = f'Translate the following {input_language} text to {target_language} idiomatically: "{text}"'

    # Using OpenAI's API to generate a response
    retries = 5
    for i in range(retries):
        try:
            response = openai.ChatCompletion.create(
                model=model,
                messages=[
                    {"role": "system", "content": prompt},
                ],
                max_tokens=max_tokens,
                temperature=temperature,
            )
            message = response['choices'][0]['message']['content']

            # Check if message is empty or None
            if not message:
                print('The message is empty or None')
                return 'Default Message'  # or some other default action

            # Check if the target language is not English and the translated text is in English
            if target_language.lower() != 'english' and detect(message) == 'en':
                if i < retries - 1:  # i is zero indexed
                    time.sleep(5)  # wait for 5 seconds before retrying the translation
                    continue
                else:
                    print('Maximum retries reached. The translated text is still in English.')
                    return 'Default Message'
            else:
                return message
        except openai.error.OpenAIError as e:
            if i < retries - 1:  # i is zero indexed
                time.sleep(5)  # wait for 5 seconds before retrying the request
                continue
            else:
                print('Maximum retries reached due to OpenAI Error.')
                return 'Default Message'


# How to call
translate_text(text, input_language, target_language, translation_type)

This code needs optimizing but would work as a good base. If someone wanted to adopt it, they would likely want to modify the prompt slightly when English responses come back, and add some custom method to verify the output (other translators, a native speaker looking at it, etc.).
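
As one possible way to modify the prompt on a retry, here is a minimal sketch. `build_prompt` and its `attempt` parameter are hypothetical and not part of the script above; the extra instruction wording is just an assumption about what might help.

```python
def build_prompt(
    text: str,
    input_language: str,
    target_language: str,
    translation_type: str,
    attempt: int = 0,
) -> str:
    """Build the translation prompt, adding a stricter instruction on retries."""
    style = "literally" if translation_type == "Literal" else "idiomatically"
    prompt = (
        f"Translate the following {input_language} text "
        f'to {target_language} {style}: "{text}"'
    )
    if attempt > 0:
        # On a retry, spell out that only the target language is acceptable.
        prompt += (
            f"\nReply only with the {target_language} translation, "
            "with no commentary in any other language."
        )
    return prompt
```

Inside the retry loop you would call something like `build_prompt(text, input_language, target_language, translation_type, attempt=i)` so that the first attempt uses the plain prompt and later attempts use the stricter one.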

davideroccato · August 9, 2023, 1:01pm

@jonas_fkall: how would you easily do that? I also think it would be great to train it on one’s specific writing style.

davideroccato · August 9, 2023, 1:06pm

@roosterburton so, you are saying that GPT-4 is very good at it.
I don’t have any coding experience and I’m not interested in dedicating time to it. What about simply using the GPT-4 platform, or possibly GPT-3.5?
Would you get the same reliable results?
I know, for example, that platforms like Poe allow you to create your own bot inside their platform, based on GPT, Claude, or others. But I don’t know how one could actually train it very effectively.