Video translation has always been tedious work. Unlike text or images, a video is multimodal: it combines speech, pictures, music, stickers, and subtitles, all of which change over time and across the frame. Translating a video is roughly tens of times harder than translating an image and can cost hundreds of times more than translating text. Below we outline the video translation process and take a closer look at why it is both difficult and expensive.
Translating a video involves a lot of work.
Let's take the example of translating an English video into Japanese and see which steps are required.
Step | Tool | What needs to be done | Costs money | Difficulties |
---|---|---|---|---|
1 | Watermark-free video downloader | Download the video | yes | |
2 | Voice and background music separation | Extract the video commentary; separate the background music | yes | Background sound is not separated cleanly |
3 | ASR (speech recognition) | Convert speech to text | yes | Speech-to-text errors |
4 | Translation | Translate the text into the target language | yes | Translation errors can occur; errors in less common languages are difficult to correct |
5 | TTS (speech synthesis) | Synthesize speech in the target language | yes | Voices do not sound good |
6 | Video editing | Remove the original sound; remove the original subtitles; align the new sound and subtitles to the picture; composite the video | yes | Speech is too long or too short; subtitles are too long or too short; alignment is time-consuming; professional editing skills are required |
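As a rough illustration of the audio handling in steps 2 and 6, the sketch below builds two `ffmpeg` command lines from Python: one pulls the soundtrack out of the video so it can be passed to ASR, and one swaps a dubbed track back in without re-encoding the picture. It assumes `ffmpeg` is installed and on the `PATH`; the function names and file names are hypothetical, and the sketch does not cover separating voice from music or erasing subtitles.

```python
import subprocess


def extract_audio_cmd(video_in: str, wav_out: str) -> list[str]:
    # -vn drops the video stream; 16 kHz mono PCM is an assumed,
    # commonly used input format for speech recognition.
    return ["ffmpeg", "-y", "-i", video_in, "-vn",
            "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", wav_out]


def replace_audio_cmd(video_in: str, dub_in: str, video_out: str) -> list[str]:
    # Take the picture from the first input and the sound from the second.
    # -c:v copy avoids re-encoding the video; -shortest stops at the
    # end of the shorter stream.
    return ["ffmpeg", "-y", "-i", video_in, "-i", dub_in,
            "-map", "0:v:0", "-map", "1:a:0",
            "-c:v", "copy", "-c:a", "aac", "-shortest", video_out]


def run(cmd: list[str]) -> None:
    # Raises subprocess.CalledProcessError if ffmpeg exits with an error.
    subprocess.run(cmd, check=True)
```

A call such as `run(extract_audio_cmd("input.mp4", "narration.wav"))` would write the soundtrack to a WAV file, and `run(replace_audio_cmd("input.mp4", "dub_ja.wav", "output_ja.mp4"))` would attach the Japanese dub to the original picture.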
The final composition involves many details: aligning the picture, sound, and subtitle files, and processing the source material.
In the original video, the sound, subtitles, and picture are basically aligned: when a scene is mentioned, the narration and subtitles appear exactly during that scene. Because the same sentence has a different written length and a different spoken duration in each language, careful proofreading and adjustment is required to keep the new sound and subtitles aligned with the original picture.
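The alignment problem described above can be made concrete with a small sketch. For each narration segment it compares the length of the dubbed speech with the time slot the original narration occupied, and computes how much the dubbed audio would have to be sped up or slowed down to fit. The `Segment` type, the `fit_speed` function, and the rate limits of 0.8 and 1.25 are assumptions chosen for illustration, not values used by any particular product.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float            # seconds, where the original narration begins
    end: float              # seconds, where the original narration ends
    dubbed_duration: float  # seconds, length of the synthesized speech


def fit_speed(seg: Segment, min_rate: float = 0.8, max_rate: float = 1.25):
    """Return (rate, leftover) for fitting dubbed speech into the original slot.

    rate > 1 speeds the dubbed speech up, rate < 1 slows it down.
    leftover is the number of seconds the speech still overruns the slot
    after the rate has been clamped to [min_rate, max_rate].
    """
    slot = seg.end - seg.start
    if slot <= 0 or seg.dubbed_duration <= 0:
        raise ValueError("segment and dubbed audio must have positive length")
    ideal = seg.dubbed_duration / slot
    # Clamp so the voice never becomes unnaturally fast or slow.
    rate = min(max(ideal, min_rate), max_rate)
    leftover = max(0.0, seg.dubbed_duration / rate - slot)
    return rate, leftover
```

A positive `leftover` marks a segment where changing the speed alone is not enough, so the translation would need to be shortened by hand or the cut adjusted.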
The original subtitles may be burned into the video (hard subtitles), so to display the new subtitles you first need to erase the original ones. This also requires editing skills, and hard subtitles are very difficult to remove with traditional methods.
In other words, translating even a single video by hand means a very heavy workload, plus a range of software to buy and learn, all of which costs more money and time.
Is there software that can translate a video directly, handling extraction, translation, correction, cropping, alignment, subtitle erasure, and so on, while still letting a person fine-tune the subtitles and speech? I am happy to recommend GhostCut, which has served more than 1,000,000 customers and is well received. It uses AI technology to handle voice extraction, translation, error correction, dubbing, and alignment of videos.
At GhostCut, we offer two types of AI video translation products: one translates from the original video's audio (ASR), and the other from the original video's text (OCR). We highly recommend understanding the differences between the two before choosing, so that you can select the one that best suits your translation needs.