Book giveaway! AI-colorized old photos are all over the Internet. Here is a detailed tutorial!
2022-06-25 15:42:00 【3D vision workshop】
0
One of the most exciting applications of deep learning is intelligent photo enhancement, such as colorizing black-and-white images, repairing damaged pictures, and removing blur.
Take black-and-white image colorization as an example. By combining AI with photo colorization, you can color a black-and-white photo with one click, even if you cannot use Photoshop or other image editing tools.
How does this work? Let me explain!
1
Color space
When we load an image, we get a 3-dimensional array (height, width, color channel). The color-channel data represents a color in the RGB color space: every pixel has 3 numbers, giving its red, green, and blue values.
In Figure 1, the original image is on the far left, followed on the right by its red, green, and blue channels. To colorize a picture, we must determine the RGB values of the pixel at every position from the given black-and-white picture. Each value ranges from 0 to 255, so every pixel poses a prediction problem with 256³ (about 16.7 million) possible outcomes.
Figure 1
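As a small illustration of the array layout and the size of the prediction problem, here is a sketch in plain Python. The 2 × 2 toy image and the helper name `split_channels` are assumptions made for this example; a real image would normally be loaded with an imaging library.

```python
# A tiny 2 x 2 RGB "image" stored as nested lists: image[row][col] = [R, G, B].
# (Hypothetical toy data; real photos are loaded from a file.)
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [128, 128, 128]],
]

height = len(image)            # number of rows
width = len(image[0])          # number of columns
channels = len(image[0][0])    # 3 color channels: red, green, blue


def split_channels(img):
    """Return the red, green and blue planes as three height x width grids."""
    return [[[pixel[c] for pixel in row] for row in img] for c in range(3)]


red, green, blue = split_channels(image)

# Each channel takes one of 256 values, so a single pixel
# has 256 ** 3 possible RGB colors.
colors_per_pixel = 256 ** 3
```

Here `red`, `green`, and `blue` correspond to the three single-channel pictures on the right of Figure 1.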
The CIE 1976 L*a*b* color space is a uniform color space recommended by the International Commission on Illumination (CIE) in 1976. It is a three-dimensional rectangular coordinate system in which the lightness L* and the chromaticity coordinates a* and b* give the position of a color. L* indicates the lightness of the color, and this channel on its own displays as a black-and-white image. A positive a* value indicates reddish and a negative value greenish; a positive b* value indicates yellowish and a negative value bluish. Figure 2 shows each channel of the L*a*b* color space.
Figure 2
When the L*a*b* color space is used to colorize photos, the colorization model takes the L* channel as input and outputs predictions for the other two channels (a*, b*). That leaves about 65,000 choices per pixel (256 × 256 = 65,536 if each channel is quantized to 256 levels), far fewer than in the RGB color space. We can therefore use L*a*b* data as the training data for the photo colorization model.
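To make the split between model input (L*) and prediction targets (a*, b*) concrete, the following sketch converts a single 8-bit sRGB pixel to L*a*b* using the standard sRGB and CIE formulas with a D65 white point. The function name `srgb_to_lab` is an assumption for this example, and the article does not say which conversion routine was actually used; in practice an image library would convert a whole image at once.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (D65 reference white)."""

    def linearize(c):
        # Undo the sRGB gamma curve to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white used for normalization.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root with a linear segment near zero, as defined by CIE.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16
    a = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return lightness, a, b_star


# The model input is L*; the prediction targets are (a*, b*).
l_input, a_target, b_target = srgb_to_lab(255, 0, 0)  # a pure red pixel
```

A gray pixel such as (128, 128, 128) gives a* and b* close to zero, which is why a black-and-white photo carries information only in the L* channel.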
2
Generative adversarial networks
A GAN (Generative Adversarial Network) is a kind of generative model.
The GAN architecture contains two models: a “generator” and a “discriminator”. The generator produces data, and the discriminator judges whether data is real or fake.
In GAN training, if we think of the generator as a painter who forges famous paintings, then the discriminator is an appraiser of famous paintings.
At first the generator's craftsmanship is poor, and the discriminator easily identifies its forgeries as fakes. The generator then improves its forging skill based on how it was caught. After a period of “practice”, the generator hands its forgeries to the discriminator, which can no longer tell real from fake, so the discriminator learns more sophisticated appraisal skills until it can identify the forgeries again.
Next, the generator and the discriminator repeat this process and begin a new round of learning.
The generator and the discriminator compete with, learn from, and grow with each other in this adversarial state, until, under the specified conditions, the discriminator can no longer tell whether the data produced by the generator is real.
The structure of the model that uses a GAN to colorize photos is shown in Figure 3.
Figure 3
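The adversarial game described above is commonly expressed as two loss functions, one for each player. The sketch below uses the standard binary cross-entropy form with the non-saturating generator loss; the article does not state which loss its colorization model uses, so this is an illustration of the general idea only. The function names and the probability values are assumptions for this example.

```python
import math

EPS = 1e-12  # avoids log(0) when a probability is exactly 0 or 1


def discriminator_loss(d_real, d_fake):
    """Discriminator objective: score real samples near 1 and fakes near 0.

    d_real: discriminator outputs (probabilities) for real samples.
    d_fake: discriminator outputs (probabilities) for generated samples.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Generator objective: make the discriminator score fakes near 1."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# Early in training: forgeries are easy to spot, so the generator loss is high.
early_g = generator_loss([0.05, 0.10])
# Later: the discriminator is unsure (outputs near 0.5) and cannot tell them apart.
late_g = generator_loss([0.5, 0.5])
late_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

Training alternates between lowering `discriminator_loss` (the appraiser improves) and lowering `generator_loss` (the forger improves), which is the repeated cycle in the painting analogy.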
The model was trained on 8,000 images from the COCO dataset. Each epoch took about 4 minutes, and the results after 100 epochs are shown in Figure 4.
Figure 4
The model can colorize some of the most common objects in images, such as the sky and trees, but it cannot colorize rare objects. There is also some color bleeding and there are circular color blobs, so the colorization is not ideal.
So we need to change our strategy!
3
Self-attention generative adversarial network
Before introducing the new solution, let's distinguish between two concepts: colorization and restoration.
Colorization, strictly speaking, means turning a monochrome photo into one with believable colors. It is an under-constrained problem, because many things (clothes, for example) have no single correct color. Colorization is therefore a process of artistic creation, and it is hard for a neural network to do it satisfactorily.
Restoration means repairing the damaged and missing parts of a picture so that it looks as good as new. Restoring faded color without the original for reference is equivalent to colorization; both are unconstrained acts of artistic creation.
In summary, when evaluating colorization and restoration, the work is considered done if people who see the generated image cannot tell that it has been processed and enjoy looking at it.
So what is the new colorization strategy?
The generator is a U-shaped neural network with a structure similar to U-Net, as shown in Figure 5.
Figure 5
The generator takes a grayscale image as input. The left side extracts image features to recognize the content; the right side restores the image from the recognized content and colors it.
The discriminator is a convolutional neural network that acts as a critic. Its output layer is a convolution rather than a linear layer, and the network is large (wide) but simple. It takes an image as input and outputs a score indicating how real the image looks.
The most important part of the new solution is the use of Self-Attention GAN (self-attention generative adversarial network), which adds an attention mechanism to both the generator and the discriminator.
A basic GAN does not control the details of the generated image well. The main reason is that image generation with convolutional neural networks relies on local receptive fields: it focuses on local information and lacks global context, so high-resolution details are generated as a function of only spatially local points in lower-resolution feature maps.
As shown in Figure 6, the flowers are colored unevenly, and there are wrong colors in other places as well.
Figure 6
The self-attention mechanism strikes a better balance between the ability to model long-range dependencies and computational and statistical efficiency. It computes the response at a position as a weighted sum of the features at all positions, and the weights (the attention vectors) are computed at only a small computational cost.
Self-Attention GAN introduces the self-attention mechanism into convolutional GANs and handles long-range, multi-level dependencies well. When it generates an image, the details at each position are well coordinated with the details in distant parts of the image, and the discriminator can more accurately enforce complicated geometric constraints on the global image structure.
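The weighted-sum idea can be sketched in a few lines of plain Python. This is a simplified illustration, not the Self-Attention GAN layer itself: it uses raw dot products between position features as attention scores, whereas Self-Attention GAN applies learned 1×1 convolutions to form queries, keys, and values and adds the attended result back to the input through a learned scale. The function name `self_attention` and the toy feature values are assumptions made for this example.

```python
import math


def self_attention(features):
    """Simplified self-attention over a flattened feature map.

    features: list of N positions, each a list of C floats.
    Returns (output, weights), where output[i] is the weighted sum of the
    features at ALL positions and weights[i][j] says how much position i
    attends to position j.
    """
    n = len(features)
    c = len(features[0])

    # Similarity score between every pair of positions (dot product).
    scores = [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]

    # Softmax over each row so the weights for one position sum to 1.
    weights = []
    for row in scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        weights.append([e / total for e in exps])

    # Response at position i = weighted sum of features at all positions.
    output = [
        [sum(weights[i][j] * features[j][k] for j in range(n)) for k in range(c)]
        for i in range(n)
    ]
    return output, weights


# Positions 0 and 2 are far apart in the list but have the same features,
# so position 0 attends to position 2 as strongly as to itself.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
output, weights = self_attention(features)
```

Because every position looks at every other position, two distant regions with similar content (for example, two parts of the same flower) can influence each other directly, which a stack of small convolution kernels can only do after many layers.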
Here are some generated examples.
1) Audrey Hepburn
Figure 7
2) Cyclists on the road in winter
Figure 8
3) Flowers
Figure 9
4) A dog on the grass
Figure 10
5) A water town in Jiangnan, China
Figure 11
Although some generated images still contain anomalies, such as the color of the skin behind Audrey Hepburn's ear, the overall result is already very good. The attention layers' performance in color consistency and overall quality is a pleasant surprise.
Besides automatic colorization, image super-resolution and deblurring are also important application areas of GANs.
Image super-resolution generates a high-resolution image from a low-resolution one by upsampling. In image deblurring, the generator is used to produce a sharp image. Below are some GAN-based deblurring examples.
Figure 12
Figure 13
Recently, 《Application and Practice of Computer Vision》 was published. It was written jointly by Han Shaoyun, founder and chairman of Tarena Education, and Zhengzheng, vice president of technology R&D at Tarena Education Group and president of its AI Research Institute, among others. One of its chapters explains the techniques above in detail.
This is the first book in Tarena Education's “Artificial Intelligence Application and Practice” textbook series. It aims to help readers quickly master practical computer vision skills and improve their chances of landing a well-paid job.
《Application and Practice of Computer Vision》 focuses on applications of computer vision in agriculture, medicine, industry, and other fields, such as plant pest detection, fundus blood vessel image segmentation, and mask-wearing detection. It combines theory with practice and uses many illustrations and examples to help readers quickly understand the basic principles and key techniques of several computer vision models and algorithms.
In addition, the key and difficult points of the theory and practice in 《Application and Practice of Computer Vision》 are explained in short videos, which lowers the learning cost for readers and helps them grasp the core points.
According to the 《Artificial Intelligence Employment Data Atlas》 report, the AI industry is still a blue ocean in terms of competition for jobs. Among the top 100 most in-demand positions, the ratio of technical to non-technical posts is 6:4, so job seekers from non-AI backgrounds still have opportunities and room to enter the AI industry.
So far, Tarena Education has helped more than 1 million students take up jobs at well-known IT and Internet companies. As a professional study and reference book, 《Application and Practice of Computer Vision》 is a way for beginners to learn the fundamentals of artificial intelligence and effectively master practical skills.
Book giveaway!
Leave a comment at the bottom of the article to take part.
The five readers whose comments receive the most likes will each receive one copy of 《Application and Practice of Computer Vision》, mailed to their home.
Winners announced: June 26, 8:00 p.m.
This article is for academic sharing only. If there is any infringement, please contact us to have it removed.