Principles of Hard Sub Translation

Let's look at a real case of using GhostCut to translate hardcoded subtitles:

This function uses OCR to recognize the text in the original video and generate translated subtitles.

The principle of translating video text usually involves the following steps:

  1. OCR recognition of text content: Analyze the video and use OCR to recognize every text area and its content.
  2. Recognize subtitle style and position: Record the position of each word and of the original text as a whole, and extract the text style.
  3. Translate subtitles: Translate the text word by word and sentence by sentence, in the context of the original video content, using engines such as ChatGPT and DeepL, while trying to avoid mistranslations and omissions.
  4. Original subtitle position inpainting: Inpaint the area occupied by the original subtitles to remove their content.
  5. Embed subtitles and synthesize video: Embed the edited subtitle file into the video at the original subtitle position, try to restore the original subtitle style, and calculate the arrangement of multi-line text. Then make appropriate adjustments and optimizations to keep the subtitles clear and the overall result good.
  6. After the above processing, the video contains embedded, translated subtitles.
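
The steps above can be sketched as a small data flow. This is only an illustrative sketch, not GhostCut's actual implementation: the `TextRegion` record, the `translate_regions` and `build_render_plan` functions, and the shape of the render plan are all hypothetical names chosen for this example. The OCR, translation engine, and inpainting model are assumed to exist elsewhere, so the translator is passed in as a plain function.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass(frozen=True)
class TextRegion:
    """One piece of on-screen text, as steps 1 and 2 might record it (hypothetical)."""
    text: str
    box: Box
    color: Tuple[int, int, int]  # RGB
    font_size: int
    start: float                 # seconds
    end: float                   # seconds


def translate_regions(regions: List[TextRegion],
                      translate: Callable[[str], str]) -> List[TextRegion]:
    """Step 3: swap in the translated text, keeping position, style and timing."""
    return [replace(region, text=translate(region.text)) for region in regions]


def build_render_plan(originals: List[TextRegion],
                      translated: List[TextRegion]) -> List[Dict]:
    """Steps 4 and 5: for each region, erase the original box, then draw the translation."""
    plan: List[Dict] = []
    for src, dst in zip(originals, translated):
        plan.append({"op": "inpaint", "box": src.box,
                     "start": src.start, "end": src.end})
        plan.append({"op": "draw", "text": dst.text, "box": dst.box,
                     "color": dst.color, "font_size": dst.font_size,
                     "start": dst.start, "end": dst.end})
    return plan


if __name__ == "__main__":
    # Stand-in for OCR output; a real system would detect this from the frames.
    detected = [TextRegion("你好", (100, 620, 240, 48), (255, 255, 255), 40, 1.0, 3.5)]
    # Stand-in for a translation engine.
    glossary = {"你好": "Hello"}
    translated = translate_regions(detected, lambda s: glossary.get(s, s))
    for step in build_render_plan(detected, translated):
        print(step)
```

Keeping the translated text in the same record as the original box, color, and timing is what lets the final step put the new text back where the old text was.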

GhostCut Hard Subtitle Translation uses OCR to recognize the text in the original video and generates translated text. It then uses AI to remove the original text and places the translated text back in the original position, retaining the original size, color, and layout as far as possible. Compared with combining several editing and translation tools, GhostCut's distinguishing features are: fully automatic processing, one-click removal of original text, and retention of the original video text layout.
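
Translated text is often longer or shorter than the original, so keeping the original layout means fitting the new text into the old text box. One possible approach, shown here as a rough sketch and not as GhostCut's method, is to wrap the text and shrink the font until it fits. The `fit_text` function is hypothetical, and the two ratio constants are assumed averages; a real renderer would measure glyphs with the actual font.

```python
import textwrap
from typing import List, Tuple

CHAR_WIDTH_RATIO = 0.6   # assumed average glyph width divided by font size
LINE_HEIGHT_RATIO = 1.2  # assumed line height divided by font size


def fit_text(text: str, box_width: int, box_height: int,
             start_size: int, min_size: int = 8) -> Tuple[int, List[str]]:
    """Return (font_size, lines) that fit the box, shrinking the font as needed."""
    for size in range(start_size, min_size - 1, -1):
        # How many average-width characters fit on one line at this size.
        chars_per_line = max(1, int(box_width / (size * CHAR_WIDTH_RATIO)))
        lines = textwrap.wrap(text, width=chars_per_line)
        widest = max((len(line) for line in lines), default=0)
        fits_width = widest * size * CHAR_WIDTH_RATIO <= box_width
        fits_height = len(lines) * size * LINE_HEIGHT_RATIO <= box_height
        if fits_width and fits_height:
            return size, lines
    # Nothing fits: fall back to the smallest size and accept some overflow.
    chars_per_line = max(1, int(box_width / (min_size * CHAR_WIDTH_RATIO)))
    return min_size, textwrap.wrap(text, width=chars_per_line)


if __name__ == "__main__":
    size, lines = fit_text("Welcome to the channel", 240, 60, start_size=40)
    print(size, lines)
```

Starting from the original font size and only shrinking when necessary keeps short translations visually identical to the source text, while long translations degrade gradually instead of spilling out of the box.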

Note:

The original hardcoded text and translated text can be modified in GhostCut, and this tutorial provides instructions for doing so.

If the original video only has audio narration and no subtitles, please use the "Auto Video Translation and Dubbing" function.

What factors affect the effectiveness of video text translation?

Video translation relies on AI at every stage: original text extraction, style extraction, subtitle translation, error correction, and alignment calculations for position, style, and size. The following factors can affect accuracy:

  1. Background color interference: if the original video has fixed text over a changing background, the background color may be extracted as part of the text style. Text on a single-color background can be extracted more accurately.