Game Performance Optimization (11) - Zhihu
2020-11-08 08:54:00 【osc_eoqljui5】
After the VS (vertex shader) comes the rasterization stage. This is a fixed-function (non-programmable) stage that is usually assumed to execute very efficiently, so it is often overlooked.
In my experience, however, it is not uncommon for this stage to become the bottleneck. This is what happened, for example, during the development of Genshin Impact (《原神》).
Here is what happened in Genshin Impact. When a character climbs a tree, the canopy is made to look semi-transparent so that it does not hide the character. Ordinary alpha-blended transparency is a well-known performance killer, so the developers instead used the stencil to cut out a subset of pixels, a technique known as dithering. If you are not familiar with it, think of a photograph printed in a newspaper: the whole picture is made up of dots.
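To make the idea of dithering concrete, here is a minimal Python sketch that builds a cut-out mask from a 4x4 Bayer matrix. The article does not say which dither pattern the game uses, so the Bayer matrix, the `dither_mask` and `kept_fraction` names, and the `opacity` parameter are all assumptions chosen for illustration.

```python
# 4x4 Bayer matrix: a common ordered-dither pattern. Using it here is an
# assumption; the article does not name the pattern the game actually uses.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_mask(width, height, opacity):
    """Return rows of booleans; True means the pixel is kept (stencil passes)."""
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            # Each pixel gets a fixed threshold in (0, 1) from the tiled matrix.
            threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16.0
            row.append(opacity > threshold)
        mask.append(row)
    return mask


def kept_fraction(mask):
    """Fraction of pixels that survive the cut-out."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


if __name__ == "__main__":
    # Print a small mask: '#' is a kept pixel, '.' is a cut-out pixel.
    for row in dither_mask(8, 4, 0.5):
        print("".join("#" if keep else "." for keep in row))
```

With `opacity` set to 0.5, half of the pixels are kept and the kept pixels are interleaved with the holes at single-pixel granularity, which is what gives the newspaper-dot look.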
In principle, cutting out pixels reduces the number of pixels that must be shaded, that is, the PS (pixel shader) workload. But the development team found that the result was the opposite: rendering time went up. Even more puzzling, comparing GPU trace captures with the effect switched on and off showed that the PS workload had indeed dropped, yet rendering time was unchanged or even slightly longer.
The cause lies in the rasterizer. The triangles output by the VS are rasterized by the rasterization unit to produce the PS workload. Before rasterization, several kinds of culling happen at the triangle level: back-face/front-face culling, frustum culling and clipping, and zero-area/small-triangle culling. Stencil-test rejection, however, does not happen at the triangle level; it happens at the fragment level, after rasterization. In other words, although dithering reduces the number of fragments that reach the PS stage, it does not reduce the rasterizer's work.
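The triangle-level tests mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triangles are already in 2D screen space, counter-clockwise winding in a y-up coordinate system counts as front-facing, and the helper names `signed_area` and `survives_triangle_culling` are hypothetical. Note that nothing in it consults the stencil buffer, which is the point: a triangle covered entirely by dither holes still goes on to be rasterized.

```python
def signed_area(tri):
    """Signed area of a 2D triangle; positive for counter-clockwise winding (y up)."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return 0.5 * ((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))


def survives_triangle_culling(tri, cull_back_faces=True):
    """Return True if the triangle is passed on to the rasterizer.

    Only per-triangle tests are applied here. The stencil buffer is never
    looked at, because stencil rejection is a per-fragment test that runs
    after rasterization.
    """
    area = signed_area(tri)
    if area == 0:
        return False  # zero-area (degenerate) triangle
    if cull_back_faces and area < 0:
        return False  # back-facing under the assumed winding convention
    return True
```

For example, `survives_triangle_culling(((0, 0), (4, 0), (0, 4)))` keeps the triangle, while the same vertices in the opposite order are culled as back-facing.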
If that were the whole story, though, enabling dithering should still be faster: the rasterization workload is the same and the PS workload is smaller. Yet the measurements show it is slower. Why?
The reason is that contemporary desktop GPUs use tile-based rasterization. Note that this is not the TB(D)R of mobile platforms, because it is confined to the rasterization stage.
Specifically, the GPU does not expand a triangle into fragments in a single pass. It first rasterizes at a lower resolution, for example 1/8 of the target resolution. If the frame is 1920x1080, the GPU first rasterizes at 240x135, and then rasterizes each coarse result (an 8x8-pixel block) more finely. (The exact method and tile size may differ significantly between GPU models.)
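The two-level scheme can be sketched in Python as follows. With 8x8-pixel tiles, a 1920x1080 target gives 1920/8 = 240 by 1080/8 = 135 coarse cells. The coarse pass below is a simple bounding-box test and the fine pass samples pixel centers with edge functions; both are assumptions made for illustration, since real hardware uses its own (and vendor-specific) coverage tests. All function names are hypothetical.

```python
import math

TILE = 8  # assumed tile edge in pixels; real GPUs differ


def coarse_grid(width, height, tile=TILE):
    """Size of the low-resolution grid used by the coarse pass."""
    return math.ceil(width / tile), math.ceil(height / tile)


def edge(a, b, p):
    """Edge function: sign tells which side of the line a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def covers(tri, p):
    """True if point p is inside the triangle (either winding)."""
    a, b, c = tri
    w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
    return (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0)


def rasterize_two_level(tri, width, height, tile=TILE):
    """Return (tiles_visited, covered_pixels) for one triangle."""
    cols, rows = coarse_grid(width, height, tile)
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    # Coarse pass: only tiles overlapping the triangle's bounding box.
    tx0 = max(0, int(min(xs)) // tile)
    tx1 = min(cols - 1, int(max(xs)) // tile)
    ty0 = max(0, int(min(ys)) // tile)
    ty1 = min(rows - 1, int(max(ys)) // tile)
    tiles_visited = 0
    pixels = []
    for ty in range(ty0, ty1 + 1):
        for tx in range(tx0, tx1 + 1):
            tiles_visited += 1
            # Fine pass: test every pixel center inside this tile.
            for y in range(ty * tile, min((ty + 1) * tile, height)):
                for x in range(tx * tile, min((tx + 1) * tile, width)):
                    if covers(tri, (x + 0.5, y + 0.5)):
                        pixels.append((x, y))
    return tiles_visited, pixels
```

The fine loop is where the per-pixel cost sits, so any tile that can be discarded during the coarse pass saves up to 64 fine-level tests in this sketch.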
One advantage of this approach is that it greatly improves the efficiency of the pre-Z and pre-stencil tests. If an entire low-resolution unit (a tile) is rejected by the pre-Z or pre-stencil test, there is no need to rasterize it any further.
In our case, the stencil used for the cut-out has a hole pattern that is not aligned with these tiles. When the pre-stencil test is performed per tile, a tile can therefore never be rejected, because the stencil values inside it differ: some pixels pass and some fail. Compared with dithering disabled, the GPU performs an extra stencil test for nothing: the rasterization workload is not reduced at all, and the rasterizer gains an additional stencil-lookup step. As a result, rasterization becomes less efficient.
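The effect of tile alignment can be checked with a small Python simulation. It classifies each 8x8 tile of a stencil mask as fully rejected, fully accepted, or mixed, and compares a per-pixel checkerboard (standing in for a dither pattern) with a checkerboard whose cells coincide with the tiles. Both masks pass exactly half of the pixels. The 8x8 tile size, the two patterns, and the function names are assumptions for illustration and are not taken from any particular GPU.

```python
def classify_tiles(mask, tile=8):
    """Count tiles where the stencil test fails everywhere, passes everywhere, or is mixed."""
    height, width = len(mask), len(mask[0])
    counts = {"rejected": 0, "accepted": 0, "mixed": 0}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            values = [
                mask[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            ]
            if not any(values):
                counts["rejected"] += 1  # whole tile can be skipped early
            elif all(values):
                counts["accepted"] += 1
            else:
                counts["mixed"] += 1  # must be rasterized finely and tested per pixel
    return counts


def pixel_checkerboard(width, height):
    """Dither-like mask: pass/fail alternates every pixel."""
    return [[(x + y) % 2 == 0 for x in range(width)] for y in range(height)]


def tile_checkerboard(width, height, tile=8):
    """Mask whose pass/fail regions line up exactly with the tiles."""
    return [
        [((x // tile) + (y // tile)) % 2 == 0 for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    print("per-pixel pattern:", classify_tiles(pixel_checkerboard(64, 64)))
    print("tile-aligned pattern:", classify_tiles(tile_checkerboard(64, 64)))
```

On a 64x64 region the per-pixel pattern leaves all 64 tiles mixed, so no tile is skipped, whereas the tile-aligned pattern lets half of the tiles be rejected before any fine rasterization. This matches the article's explanation of why the dithered canopy saved PS work without saving rasterizer work.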
Copyright notice
This article was written by [osc_eoqljui5]. Please include a link to the original when reposting. Thank you.