Free book giveaway! AI-colorized old photos are all over the internet. Here is a detailed tutorial!

2022-06-25 15:42:00 3D vision workshop

One of the most exciting applications of deep learning is intelligent photo enhancement, such as colorizing black-and-white images, repairing damaged pictures, and removing blur.

Take black-and-white image colorization as an example. By combining AI with photo colorization, you can color black-and-white photos with one click, even if you don't know how to use Photoshop or other image editing tools.

How does this work? Let's walk through it.

1

Color space

When we load an image, we get a 3-dimensional array (height, width, color channel). The color channel data represents colors in the RGB color space: every pixel has 3 numbers, giving its red, green, and blue values.

In Figure 1, the original image is on the far left, and to its right are the red, green, and blue channels. To colorize a picture, we have to determine the RGB values of the pixel at every position from the given black-and-white picture. Each value ranges from 0 to 255, so every pixel is a prediction problem with 256³ (about 16.7 million) possible answers.
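To make the array layout concrete, here is a minimal sketch in plain Python. The 2x2 pixel values and the helper name `split_channels` are made up for illustration and do not come from the original tutorial.

```python
# A tiny 2x2 RGB "image" stored as height x width x channel nested lists.
# The pixel values are invented for illustration.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [128, 128, 128]],
]

height = len(image)
width = len(image[0])
channels = len(image[0][0])
shape = (height, width, channels)


def split_channels(img):
    """Return the red, green, and blue planes as three height x width lists."""
    return [[[pixel[c] for pixel in row] for row in img] for c in range(3)]


red, green, blue = split_channels(image)

# Each channel takes one of 256 values, so one pixel has 256**3 possible colors.
choices_per_pixel = 256 ** 3

print(shape)              # (2, 2, 3)
print(red)                # [[255, 0], [0, 128]]
print(choices_per_pixel)  # 16777216
```

A real image loader would return the same layout, just with far more rows and columns.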

Figure 1. The original image (left) and its red, green, and blue channels.

CIE 1976 L*a*b* is a uniform color space recommended by the International Commission on Illumination (CIE) in 1976. It is a three-dimensional rectangular coordinate system in which the lightness L* and the chromaticity coordinates a* and b* give the position of a color. L* indicates the lightness of the color, and this channel on its own displays as a black-and-white image. Positive a* values are reddish and negative values are greenish; positive b* values are yellowish and negative values are bluish. Figure 2 shows each channel of the L*a*b* color space.

Figure 2. The channels of the L*a*b* color space.

When we color photos in the L*a*b* color space, the colorization model takes the L* channel as input and predicts the other two channels (a*, b*). That leaves about 65,000 choices per pixel, far fewer than in the RGB color space, so we can use L*a*b* data as the training data for the photo colorization model.
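As a rough illustration of how an RGB pixel is split into a model input (L*) and a prediction target (a*, b*), here is a sketch of the standard sRGB to L*a*b* conversion in plain Python. The function name `srgb_to_lab` is hypothetical, the constants are the usual sRGB and D65 white point values, and the tutorial itself does not say which conversion routine was used. The count of about 65,000 is reproduced under the assumption that a* and b* are each quantized to 256 levels.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 white point)."""
    def linearize(c):
        # Undo the sRGB gamma curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white before applying f.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    lightness = 116.0 * fy - 16.0
    a_star = 500.0 * (fx - fy)
    b_star = 200.0 * (fy - fz)
    return lightness, a_star, b_star


# The colorization model would see only L* and have to predict (a*, b*).
lightness, a_star, b_star = srgb_to_lab(255, 0, 0)  # pure red
model_input = lightness
model_target = (a_star, b_star)  # a* is positive here, i.e. reddish

# Assumption: a* and b* are each quantized to 256 levels.
ab_choices = 256 ** 2
print(ab_choices)  # 65536
```

White maps to L* close to 100 with a* and b* close to 0, and black maps to L* close to 0, which matches the description of L* as the black-and-white channel.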

2

Generative adversarial networks

A GAN (Generative Adversarial Network) is a type of generative model.

The GAN network structure contains two models: a "generator" and a "discriminator". The generator produces data, and the discriminator judges whether data is real or fake.
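The two roles can be written as two loss functions. The sketch below uses the common binary cross-entropy form of the GAN objective, with the non-saturating generator loss; the tutorial does not state which loss its model uses, so treat this as an assumption. The function names and the example scores are hypothetical. Each score is the discriminator's estimated probability that a sample is real.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, generated samples 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss: the generator wants its samples scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Early in training (made-up scores): the discriminator spots fakes easily.
early_d = discriminator_loss([0.90, 0.95], [0.05, 0.10])
early_g = generator_loss([0.05, 0.10])

# When the discriminator can no longer tell real from fake, it outputs 0.5.
balanced_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
balanced_g = generator_loss([0.5, 0.5])

print(early_d < balanced_d)  # True: the discriminator is winning early on
print(early_g > balanced_g)  # True: the generator is losing early on
```

In training, the two losses are minimized in alternation: one step updates the discriminator with the generator frozen, the next updates the generator with the discriminator frozen.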

In GAN training, if the generator is a painter who forges famous paintings, then the discriminator is an appraiser of famous paintings.

At first the generator's craft is poor, and the discriminator easily identifies its forgeries as fakes. The generator improves its forging skills based on what gave the forgeries away. After a period of practice, the generator hands new forgeries to the discriminator, which can no longer tell real from fake, so the discriminator learns more sophisticated appraisal skills until it can identify the forgeries again.

Next, the generator and the discriminator repeat the process above and begin a new round of learning.

The generator and the discriminator compete, learn, and improve together in this adversarial game, until, under the specified conditions, the discriminator cannot tell whether the data produced by the generator is real or fake.

The structure of a GAN model for photo colorization is shown in Figure 3.

Figure 3. Structure of the GAN model for photo colorization.

The model is trained on 8,000 images from the COCO dataset. Each epoch takes about 4 minutes, and the results after 100 epochs are shown in Figure 4.

Figure 4. Colorization results after 100 epochs of training.

The model can color some of the most common objects in an image reasonably well, such as the sky and trees, but it cannot color rare objects. There is also some color spill and there are circular color blotches, so the results are not ideal.

So we need to change strategy!

3

Self-attention generative adversarial network

Before introducing the new solution, let's distinguish two concepts: colorization and restoration.

Colorization, strictly speaking, means turning a monochrome photo into one with believable colors. It is an "unconstrained" problem: many things (clothes, for example) have no single correct color. Colorization is therefore a process of artistic creation, and it is hard for a neural network to produce a satisfying result.

Restoration means repairing the damage and loss in a picture so that it looks as good as new. When restoration has to fix fading without an original for reference, it is equivalent to colorization: both are unconstrained artistic creation.

In summary, when evaluating colorization and restoration results, the work is considered successful if people who see the generated image cannot tell that it has been processed and find it pleasing to look at.

So what is the new colorization strategy?

The generator is a U-shaped neural network with a structure similar to U-Net, as shown in Figure 5.

Figure 5. The U-shaped generator network, similar to U-Net.

The generator takes a grayscale image as input. The left side of the network extracts image features to recognize the content, and the right side reconstructs the image from the recognized content and colors it.
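The U shape can be sketched by tracing feature-map sizes through the network. This is only an illustration: the helper `unet_shapes`, the depth, the channel counts, and the two-channel (a*, b*) output are assumptions borrowed from the usual U-Net layout, not the configuration of the model described here.

```python
def unet_shapes(height, width, depth=3, base_channels=64):
    """Trace (name, channels, height, width) through a U-shaped network."""
    trace = [("input", 1, height, width)]  # grayscale input
    skips = []
    channels, h, w = base_channels, height, width

    # Left side: extract features, halving the resolution at each level.
    for level in range(depth):
        trace.append((f"down{level}", channels, h, w))
        skips.append((channels, h, w))
        channels, h, w = channels * 2, h // 2, w // 2

    trace.append(("bottleneck", channels, h, w))

    # Right side: upsample, then concatenate the matching left-side features.
    for level in reversed(range(depth)):
        skip_channels, h, w = skips[level]
        merged = channels // 2 + skip_channels  # upsampled + skip connection
        trace.append((f"up{level}", merged, h, w))
        channels = skip_channels                # reduced again by convolution

    trace.append(("output", 2, h, w))  # assumed: predicted a* and b* channels
    return trace


for name, channels, h, w in unet_shapes(256, 256):
    print(f"{name:10s} {channels:4d} x {h:3d} x {w:3d}")
```

The skip connections are what let the right side recover fine spatial detail that was lost while the left side was shrinking the image.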

The discriminator is a critic convolutional neural network. Its output layer is convolutional rather than linear, and it is large (wide) but simple. It takes an image as input and outputs a score that indicates how real the image looks.

The most important part of the new solution is the use of Self-Attention GAN (self-attention generative adversarial network), which adds the attention mechanism to both the generator and the discriminator.

Images generated with a basic GAN have poorly controlled details. The main reason is that image generation with convolutional neural networks relies on local receptive fields: it focuses on local neighborhoods and lacks global information, so high-resolution details are generated only from spatially local points in lower-resolution feature maps.

As Figure 6 shows, the flowers are colored unevenly, and there are wrong colors in other places too.

Figure 6. Uneven coloring of flowers, and wrong colors elsewhere, with a basic GAN.

The self-attention mechanism strikes a better balance between the ability to model long-range dependencies and computational and statistical efficiency. It computes the response at a position as the weighted sum of the features at all positions, and the weights (or attention vectors) are computed at only a small computational cost.

Self-Attention GAN introduces the self-attention mechanism into convolutional GANs, so it handles long-range, multi-level dependencies well. When generating an image, the details at each position are carefully coordinated with the details in distant parts of the image, and the discriminator can also enforce complicated geometric constraints on the global image structure more accurately.
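The "weighted sum of the features at all positions" can be sketched in a few lines of plain Python. This is a simplified illustration rather than the model's actual layer: the learned projections that Self-Attention GAN applies before comparing positions are replaced by the identity, and the function name and toy feature vectors are hypothetical. The `gamma` argument stands for the learned scale on the attention output, which starts at 0 in Self-Attention GAN so that the network relies on local features first.

```python
import math


def self_attention(features, gamma=1.0):
    """features: list of N position vectors. Returns (outputs, weights)."""
    n = len(features)
    dim = len(features[0])
    outputs, weights = [], []
    for j in range(n):
        # Similarity of position j to every position i (dot product).
        scores = [
            sum(a * b for a, b in zip(features[j], features[i]))
            for i in range(n)
        ]
        # Softmax turns the scores into weights that sum to 1.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        # Response at j: weighted sum of the features at all positions.
        attended = [
            sum(row[i] * features[i][d] for i in range(n)) for d in range(dim)
        ]
        # Residual connection: scaled attention output plus the original feature.
        outputs.append([gamma * attended[d] + features[j][d] for d in range(dim)])
    return outputs, weights


# Positions 0 and 2 hold the same feature, position 1 a different one.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
out, attn = self_attention(feats, gamma=1.0)
print(attn[0])  # position 0 attends to position 2 as strongly as to itself
```

Position 0 weights position 2 as heavily as itself even though they are not neighbors, which is the long-range dependency that a small convolution kernel cannot capture in one step.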

Here are some generated examples.

1) Audrey Hepburn

Figure 7

2) Cyclists on the road in winter

Figure 8

3) Flowers

Figure 9

4) A dog on the grass

Figure 10

5) A Jiangnan water town in China

Figure 11

Some generated images still have flaws, such as the color of the skin behind Audrey Hepburn's ears, but the overall results are very good. The attention layers' contribution to color consistency and overall quality is a pleasant surprise.

Besides automatic colorization, image super-resolution and deblurring are also important application areas for GANs.

Image super-resolution generates a high-resolution image from a low-resolution one by upsampling. In image deblurring, the generator is used to produce a sharp image. Below are some GAN-based deblurring examples.

Figure 12

Figure 13

Recently, 《Application and practice of computer vision》 was released. It was co-written by Han Shaoyun, founder and chairman of Tarena Education, Zhengzheng, vice president of technology R&D and president of the AI Research Institute at Tarena Education Group, and others. One of its chapters explains the techniques above in detail.

This is the first book in Tarena Education's "Artificial intelligence application and practice" textbook series. It aims to help readers quickly master practical computer vision skills and strengthen their case for high-paying jobs.

《Application and practice of computer vision》 focuses on computer vision in agriculture, medicine, industry, and other fields, with cases such as plant pest detection, fundus blood vessel image segmentation, and mask-wearing detection. It combines theory with practice and uses many illustrations and examples to help readers quickly understand the basic principles and key techniques of several computer vision models and algorithms.

In addition, the key and difficult parts of the theory and practice in 《Application and practice of computer vision》 are explained in short videos, which lowers the learning cost for readers and helps them grasp the core points.

According to the 《Artificial intelligence employment data atlas》 report, the artificial intelligence industry is still a blue ocean in terms of competition for jobs. Among the top 100 most in-demand jobs, the ratio of technical to non-technical posts is 6:4, so job seekers from outside AI still have opportunities and room to enter the industry.

So far, Tarena Education has helped more than 1 million students take up positions at well-known IT and internet companies. As a professional study and reference book, 《Application and practice of computer vision》 is a way for beginners to gain a general understanding of artificial intelligence and to master practical skills effectively!


Free book giveaway!

Leave a comment at the bottom of the article to take part.
The five readers whose comments receive the most likes will each get one copy of
《Application and practice of computer vision》,
mailed to their home.
Winners announced: June 26, 8:00 p.m.



This article is for academic sharing only. If there is any infringement, please contact us and it will be removed.
Original site

Copyright notice
This article was created by [3D vision workshop]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/176/202206251459573387.html