
Can AI repair sound and picture, but also memory? | Leifeng Network

by admin




Author丨He Sisi

Editor丨Lin Juemin

There is a kind of nostalgia called Beyond. When Huang Jiaju starts to sing, whose DNA is not stirred?

At 19:00 on July 3, “Beyond Live 1991 Life Contact Concert Selection and Commemorative Concert Selection Ultra-HD Restoration Version” was re-screened on Douyin, Xigua Video, Toutiao and other platforms. After 31 years, Beyond once again entered the public eye.

The lyric “Today there are only the remains of the body to welcome the glorious years and embrace freedom in the wind and rain” instantly took people back to the “Beyond Live 1991 Life Contact Concert” that Beyond held at the Hung Hom Stadium in Hong Kong, China in September 1991, one of Beyond’s most-played concerts.

From “Glorious Years” and “Broad Oceans and Sky” to “No Hesitation”, the restored concert played one classic after another and attracted 140 million online views. When it ended, many viewers left comments such as “I haven’t seen enough, replay it again” and “Where can I download the HD version of the video?”, paying tribute to the classics and to Beyond.

Beyond the feelings and memories, viewers could tell that the picture clarity, the color saturation and the noise-reduced sound came much closer to the experience of watching the concert in person. Volcano Engine took part in the restoration of this concert and, with the help of its algorithms, presented the audience with a nostalgic audio-visual feast.

For Douyin and Volcano Engine, the mission is not merely to repair a concert and improve the clarity of the picture, but to evoke, through restoration, the memories shared by generations of people and so resonate with them. That is the value of restoration.

After 31 years, the young Beyond is back

On the major short-video platforms, users often come across re-screenings of restored films. These have become one of the most popular ways to pay tribute to the classics, and they usually rely on AI technology.

Using AI to restore video is not new. As the technology has iterated, restoring concerts with AI has recently become a trend. Although film restoration and concert restoration both fall under video restoration, the two are quite different.

Shu Xiaofeng, a researcher on the Volcano Engine Audio Technology Team, said that films and concerts differ considerably in content and in how they are produced. A concert is built around singing, and the live environment is more complicated than that of a film: ambient sound is mixed with the singing, and the pickup equipment is also some distance away. This makes concert restoration much more complicated than film restoration.

It’s worth noting that visuals are often one of the most important parts of a concert restoration, as they determine the look and feel of the whole concert.

Zhao Shijie, a researcher at the Volcano Engine Multimedia Laboratory, told Leifeng.com that restoring the image quality of a concert is not as simple as people think. Take the Beyond Live 1991 Life Contact Concert as an example: the overall environment was dark, the details were not rich enough, the scenes switched quickly, and the lighting and sound environment were complicated. All of this posed serious challenges for the algorithms.

According to Zhao Shijie, early video recording equipment had poor resolution, so the footage is low in resolution and definition, which often leads to blurred pictures and missing texture. These problems badly affect the aesthetics and integrity of the picture.

Color and brightness are another major difficulty. Because of the early shooting equipment, the footage contains many overexposed and crushed-black scenes as well as heavy background noise. How to avoid amplifying that noise while adjusting brightness, and how to reduce the discomfort caused by overexposure, are the hard parts for the algorithm.


In fact, the most difficult part of image quality restoration is the face, which is also what the audience cares about most. Faces in a concert appear from many angles: frontal, in profile, looking down, looking up, and sometimes partly blocked by musical instruments. The algorithm has to adapt to these different scenes, and portraits in different poses respond to restoration differently from the background. This is a severe test for the algorithm.

Zhao Shijie explained to Leifeng.com how Volcano Engine's enhancement algorithms deal with these problems:

For sharpness, Volcano Engine uses deep learning algorithms trained on a large amount of data. Through sharpness enhancement and defect repair, they raise the video from low definition to ultra-high definition and generate richer textures in areas that lack detail.

At the same time, for interlaced video, the Volcano Engine Multimedia Lab designed a neural-network de-interlacing method that takes multiple frames as input.

Zhao Shijie emphasized that this is necessary because most early videos were processed, encoded and displayed with interlaced scanning, which causes serious flickering when they are played directly on modern equipment. In an interlaced signal, only one of every two lines carries image data and the other is all black, so the video has to be de-interlaced to convert the interlaced signal into a progressive one.

Traditional de-interlacing methods generally take only a single interlaced frame as input and have a weak sense of how content changes over time, so they handle the combing artifacts in motion scenes poorly.
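To make the single-frame limitation concrete, here is a minimal Python sketch of a traditional-style de-interlacer that sees only one field and fills the black lines by averaging the lines above and below. It is not Volcano Engine's neural-network method, whose details the article does not give. The names `extract_field` and `deinterlace_single_field` and the small grayscale frame are hypothetical and chosen only for illustration.

```python
from typing import List

Frame = List[List[float]]  # rows of grayscale pixel values (hypothetical format)


def extract_field(frame: Frame, parity: int) -> Frame:
    """Simulate one interlaced field: keep rows of the given parity, black out the rest."""
    return [row[:] if y % 2 == parity else [0.0] * len(row)
            for y, row in enumerate(frame)]


def deinterlace_single_field(field: Frame, parity: int) -> Frame:
    """Fill the missing rows from the same field only (assumes at least two rows)."""
    height = len(field)
    out = [row[:] for row in field]
    for y in range(height):
        if y % 2 == parity:
            continue  # this row carries real image data
        above = field[y - 1] if y - 1 >= 0 else None
        below = field[y + 1] if y + 1 < height else None
        if above is not None and below is not None:
            # Interior row: average the valid rows above and below.
            out[y] = [(a + b) / 2 for a, b in zip(above, below)]
        else:
            # Border row: copy the only valid neighbor.
            src = above if above is not None else below
            out[y] = src[:]
    return out


if __name__ == "__main__":
    frame = [[0.0, 0.0], [10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
    field = extract_field(frame, parity=0)
    print(deinterlace_single_field(field, parity=0))
```

Because the filter never looks at neighboring frames, it can only guess the missing lines from what is in the same field, which is the weakness in the time domain described above.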

Compared with traditional de-interlacing methods, the multi-frame neural-network de-interlacing method designed by Volcano Engine achieves results that the traditional methods cannot. Thanks to its generalization ability, it restores more detail in the Beyond concert and removes the combing artifacts produced in motion scenes.

The before-and-after comparison shows that the musician's hand originally had horizontal stripes and the picture was rather blurred. After restoration, both the piano and the hand regain the realism and clarity of the original scene.


For brightness and color, Volcano Engine applies an adaptive, region-by-region color and brightness enhancement algorithm based on aesthetic scores to deal with the fading, abnormal color, overexposure and underexposure of the old Beyond concert footage.

Working from the specific source footage and guided by aesthetic scores, the algorithm enhances color, brightness, contrast and saturation by region, treating the portrait ROI and the background separately. It also adapts the brightness enhancement to the color statistics of each frame, so that both bright and dark areas look their best, and it strikes a balance between repairing the fading of old video and retaining its retro feel.
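The idea of adapting brightness to per-region statistics can be illustrated with a small Python sketch. It is only an assumption-laden toy: it replaces the aesthetic scores mentioned above with a plain mean-luminance statistic and uses a gamma curve, neither of which the article attributes to Volcano Engine. The function names, the 0 to 1 pixel range and the `target_mean` default are all hypothetical.

```python
import math
from typing import List


def mean_luma(region: List[float]) -> float:
    """Average luminance of a region whose pixels lie in [0, 1]."""
    return sum(region) / len(region)


def adaptive_gamma(region: List[float], target_mean: float = 0.5) -> float:
    """Choose gamma so that the region's mean luminance value maps to target_mean.

    Dark regions get gamma < 1 (brightened), bright regions get gamma > 1 (darkened).
    """
    # Clamp the mean away from 0 and 1 so the logarithms stay finite and non-zero.
    mean = min(max(mean_luma(region), 1e-3), 1.0 - 1e-3)
    return math.log(target_mean) / math.log(mean)


def enhance_region(region: List[float], target_mean: float = 0.5) -> List[float]:
    """Apply the region's own gamma curve to every pixel."""
    gamma = adaptive_gamma(region, target_mean)
    return [p ** gamma for p in region]


if __name__ == "__main__":
    dark_background = [0.05, 0.10, 0.15, 0.10]
    bright_face = [0.85, 0.90, 0.95, 0.90]
    print(enhance_region(dark_background))
    print(enhance_region(bright_face))
```

Handling each region with its own curve is what lets dark and bright areas be corrected in opposite directions within one frame. A production system would also have to avoid amplifying noise in the dark regions, which this sketch ignores.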


For face restoration, the faces in many segments of the concert suffer from heavy color noise and compression damage. In addition, most face restoration in the industry reportedly works from a single photo, and video-based face restoration is hard to find, because the angle of the face differs from frame to frame and the algorithm has to handle faces in all kinds of poses.

Volcano Engine's key breakthrough was to use an adaptive portrait enhancement algorithm, built on a deep learning model, in the restoration of the Beyond concert. Using prior features, it removes blur and compression damage from the face while reconstructing facial features, hair and other details and textures, so that faces in different scenes and poses come out clearer.

Judging from the subjective experience of the concert as a whole, the restored picture is clearer and more realistic.

Some users commented that the restored concert feels like watching it live: the faces of the singers and musicians and the instruments on stage are clearly visible, as if one were at the Beyond Live 1991 Life Contact Concert itself.

After a lapse of 31 years, Beyond’s singing continues to be “live”

Most people believe that high-definition sound improves the completeness and polish of an entire concert and greatly enhances the audience's experience.

This is especially true for a classic concert like Beyond's, which stirs people's feelings. If the restoration is done well, it may be highly praised; if it is done badly, it may be a disaster.

During the restoration of Beyond’s concert, the sound quality also posed difficulties and challenges of varying degrees.

Shu Xiaofeng introduced the difficulties of sound quality repair to Leifeng.com:

  • First, noise from the recording equipment and the environment degrades the sound quality, and a noise floor from the surroundings is recorded along with it;

  • Second, the device's compression algorithm leads to a low cut-off frequency, which leaves the audio with insufficient bandwidth;

  • Third, excessive reverberation degrades the sound quality, and uneven loudness makes listening uncomfortable.

To this end, the Volcano Engine audio and video technology team has done targeted research and given corresponding solutions.

First, on noise interference, Shu Xiaofeng said that most of the industry currently uses traditional noise reduction methods, which are designed mainly for human voices. Music is damaged to varying degrees when it passes through such algorithms, and a concert mixes vocals, music and live ambient sound, so traditional noise reduction is clearly unsuitable for restoring the sound of a concert.

With this in mind, Volcano Engine developed an audio noise reduction algorithm for this multi-element scene. Unlike traditional noise reduction schemes, it is compatible with both music and vocal scenes, and it suppresses other noise while retaining the human voice and the music.


Second, bandwidth is also a key factor in the sound quality of a concert. Shu Xiaofeng told Leifeng.com that pickup equipment, recording hardware or compression can damage high-frequency information and hurt the listening experience. To address this, Volcano Engine extends the frequency band of the vocal part with an audio super-resolution algorithm, enriching the high-frequency information without harming the sound quality and turning a dull sound into a clearer one.

The spectrogram shows that the high-frequency part of the original audio has been extended and enhanced after processing by the super-resolution module.
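One way to check a claim like this numerically is to measure how much of a signal's energy lies above a cut-off frequency before and after processing. The Python sketch below does that with a naive discrete Fourier transform on a short synthetic signal. It only measures bandwidth and does not perform any bandwidth extension, and the function names, the 64 Hz sample rate and the 16 Hz cut-off are hypothetical values chosen to keep the example small.

```python
import cmath
import math
from typing import List


def dft_magnitudes(samples: List[float]) -> List[float]:
    """Magnitudes of the one-sided DFT (bins 0 .. n/2), computed naively in O(n^2)."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def high_band_energy_ratio(samples: List[float], sample_rate: float,
                           cutoff_hz: float) -> float:
    """Fraction of spectral energy at or above cutoff_hz."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    total = sum(m * m for m in mags)
    high = sum(m * m for k, m in enumerate(mags) if k * sample_rate / n >= cutoff_hz)
    return high / total if total > 0 else 0.0


if __name__ == "__main__":
    rate, n = 64, 64
    # "Dull" signal: a single low tone, nothing above the cut-off.
    dull = [math.sin(2 * math.pi * 4 * t / rate) for t in range(n)]
    # "Extended" signal: the same tone plus an equally strong high tone.
    extended = [d + math.sin(2 * math.pi * 20 * t / rate) for t, d in enumerate(dull)]
    print(high_band_energy_ratio(dull, rate, cutoff_hz=16))
    print(high_band_energy_ratio(extended, rate, cutoff_hz=16))
```

A band-limited recording shows a ratio near zero, and the ratio rises once high-frequency content is present, which is what a spectrogram comparison shows visually.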


Third, concerts are full of different musical instruments, cheers from the audience and other voices on the scene. Using a loudness algorithm, Volcano Engine extracts the singing separately, adjusts it, and then mixes it back so that the volume ratio between the vocals and the other sounds is more comfortable, which improves the listening experience.

It is worth noting that this restored program consists of two parts: the 1991 Life Contact concert and the commemorative concert. Shu Xiaofeng said that this greatly increased the difficulty of the repair. To avoid a jump in volume between the two parts, the loudness was adjusted where the two concerts are spliced together. The volume of different singers in the earlier and later parts was also adjusted, which greatly improved the listening experience.

The re-screening of the Beyond concert won high praise from many viewers. Many said that not only was the picture clear, but the sound also felt like listening at a live concert.
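The two loudness steps described above, rebalancing vocals against the rest of the mix and evening out the level across a splice, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Volcano Engine's algorithm: it uses plain RMS level as a stand-in for loudness, whereas broadcast-style loudness measures such as ITU-R BS.1770 add perceptual weighting and gating. It also assumes the vocals have already been separated, and all function names are hypothetical.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_db(samples: List[float]) -> float:
    """RMS level in decibels relative to full scale (1.0)."""
    return 20 * math.log10(max(rms(samples), 1e-12))


def match_loudness(segment: List[float], reference: List[float]) -> List[float]:
    """Scale `segment` so that its RMS level equals that of `reference`."""
    gain = rms(reference) / max(rms(segment), 1e-12)
    return [s * gain for s in segment]


def remix(vocals: List[float], accompaniment: List[float],
          vocal_gain: float) -> List[float]:
    """Mix separately adjusted vocals back with the other sounds."""
    return [v * vocal_gain + a for v, a in zip(vocals, accompaniment)]


if __name__ == "__main__":
    part_one = [0.5, -0.5, 0.5, -0.5]   # louder recording
    part_two = [0.1, -0.1, 0.1, -0.1]   # quieter recording after the splice
    evened = match_loudness(part_two, reference=part_one)
    print(rms_db(part_one), rms_db(part_two), rms_db(evened))
```

Matching the level of the second part to the first removes the jump at the splice, and the same gain idea applied to a separated vocal track changes the balance between the singer and everything else.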

Old video restoration: who comes after Beyond?

Over the past two years, AI video restoration has become something many companies find meaningful and are willing to spend time and energy on, and it has gradually become a new source of growth for many platforms.

Volcano Engine has in fact been working on video restoration since last year. In October last year, Xigua Video and Volcano Engine jointly launched the “Classic Video 4K Restoration Plan”. In less than a year, the plan has restored 100 classic cartoons with AI technology, 71 of them in 4K, and the childhood favorite “Hulu Brothers” has been played 3 million times.

Volcano Engine can complete restoration work of this quality mainly because of its technical accumulation and successful practice in video cloud. Douyin, Xigua Video, Toutiao and other products also serve as its testing grounds.

According to reports, this accumulated technology lets the Volcano Engine video cloud take care of four aspects of the user experience: interaction, playback, picture quality and performance. On the strength of that user experience, Volcano Engine's audio and video technology has spread into industries such as games, e-commerce, education and finance.

The restoration of old videos has only just begun. Moving from classic cartoons to concerts, and on to more classic footage, will still take continuous technical iteration.

The original article of Leifeng.com is prohibited from reprinting without authorization. For details, see the instructions for reprinting.
