Practicing torch.fx: a PyTorch-based model optimization and quantization powerhouse
2022-06-27 23:19:00 [Zhiyuan Community]
Today let's talk about the fairly important torch.fx, and take this opportunity to organize my earlier torch.fx notes. The notes are roughly split into three parts, each corresponding to one article:
- What is fx
- Quantization based on fx
- Deploying fx-quantized models to TensorRT
This article is the first of the three and mainly introduces torch.fx and its basic usage. Without further ado, let's get started!

What is Torch.FX?
torch.fx is a toolkit (a library) introduced in PyTorch 1.8 for python-to-python code transformation. The general idea is that it lets you rewrite the Python forward code of a PyTorch model into whatever you want. The official introduction reads:
We apply this principle in torch.fx, a program capture and transformation library for PyTorch written entirely in Python and optimized for high developer productivity by ML practitioners
The quote above is from the FX paper; those interested can read TORCH.FX: PRACTICAL PROGRAM CAPTURE AND TRANSFORMATION FOR DEEP LEARNING IN PYTHON[1]. There is also a good reading-notes article on Zhihu[2], so I won't repeat its contents here. That said, this article also covers the paper's material, just from a more practical point of view.
The keywords are program capture and transformation library; these two concepts are very important.
So how does FX do it? Let's have a look. First we define a torch.nn.Module:

```python
class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.param = torch.nn.Parameter(torch.rand(3, 4))
        self.linear = torch.nn.Linear(4, 5)

    def forward(self, x):
        return self.linear(x + self.param).clamp(min=0.0, max=1.0)
```

Very simple: it inherits from torch.nn.Module (anyone familiar with PyTorch knows this), and its forward function records the module's concrete operation logic.
If we want to replace the clamp part of self.linear(x + self.param).clamp(min=0.0, max=1.0) in this Module's forward with sigmoid, what should we do?
Of course you could edit the code directly, but if there are many such operations, or you have written many modules, or you want to run lots of experiments (changing some modules but not others), doing it by hand gets cumbersome.
This is where FX comes in. Instead of modifying the forward implementation manually, you just set up a rule, pass the model instance through torch.fx, and run the code. The forward part of your MyModule then becomes self.linear(x + self.param).sigmoid():
```python
module = MyModule()

from torch.fx import symbolic_trace
# Symbolic tracing frontend - captures the semantics of the module
symbolic_traced: torch.fx.GraphModule = symbolic_trace(module)

# High-level intermediate representation (IR) - Graph representation
# Print the FX IR
print(symbolic_traced.graph)
"""
graph():
    %x : [#users=1] = placeholder[target=x]
    %param : [#users=1] = get_attr[target=param]
    %add : [#users=1] = call_function[target=operator.add](args = (%x, %param), kwargs = {})
    %linear : [#users=1] = call_module[target=linear](args = (%add,), kwargs = {})
    %clamp : [#users=1] = call_method[target=clamp](args = (%linear,), kwargs = {min: 0.0, max: 1.0})
    return clamp
"""

# Code generation - valid Python code
# The code generated by FX, which can be regarded as the module's forward code
print(symbolic_traced.code)
"""
def forward(self, x):
    param = self.param
    add = x + param;  x = param = None
    linear = self.linear(add);  add = None
    clamp = linear.clamp(min = 0.0, max = 1.0);  linear = None
    return clamp
"""
```

In this way, FX captures the forward code you wrote and can then transform it, modifying the operations; the modified model can still be used exactly as before.
Of course, that is just one simple function of fx. Through fx we can also:
- fuse two ops, e.g. conv and bn
- remove certain ops
- replace certain ops
- insert ops or other operations after certain ops
and so on.
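As a concrete instance of the "remove certain ops" case, here is a minimal sketch (the `Net` module and `remove_dropout` helper are made up for illustration) that erases Dropout modules from a traced graph:

```python
import torch
import torch.fx
from torch.fx import symbolic_trace

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(4, 4)
        self.drop = torch.nn.Dropout(0.5)

    def forward(self, x):
        return self.drop(self.fc(x))

def remove_dropout(m: torch.nn.Module) -> torch.fx.GraphModule:
    traced = symbolic_trace(m)
    for node in list(traced.graph.nodes):
        if node.op == "call_module" and isinstance(
            traced.get_submodule(node.target), torch.nn.Dropout
        ):
            # reroute every consumer of the dropout output to its input...
            node.replace_all_uses_with(node.args[0])
            # ...then delete the now-unused node from the graph
            traced.graph.erase_node(node)
    traced.recompile()
    return traced

clean = remove_dropout(Net())
print(clean.code)   # forward is now just the linear layer
```

The same replace-uses-then-erase pattern also underlies op replacement and fusion passes.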
You may notice that these operations look a lot like the passes in an AI compiler, and the object they operate on is likewise the neural network's DAG (directed acyclic graph). In fact, FX can be understood as a kind of compiler, except that where a compiler ultimately produces an executable, FX is python->python: the final product is still ordinary Python code based on PyTorch. That is why FX describes itself as a Python-to-Python (or Module-to-Module) transformation toolkit rather than a compiler.
Most of the FX API is stable by now (it was officially released in torch-1.10), so using it carries little historical baggage.
Official introduction to fx:
- https://pytorch.org/docs/stable/fx.html
The relationship between torch.fx and quantization
FX's first selling point is as a PyTorch-native quantization tool, which is one of the reasons I'm introducing it. With FX it is easy to perform quantization operations on PyTorch models; SenseTime also released MQBench[3], a quantization tool built on fx.
For quantization, whether PTQ (which needs to insert observer ops to collect each layer's activation and weight distributions) or QAT (which needs to insert fake-quantization nodes to simulate quantization), both depend on fx's functionality. So if you want to do quantization within the PyTorch framework, I suggest starting directly with torch.fx.
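For intuition about what "inserting observer ops" means mechanically, here is a toy sketch (the `observe` function and `stats` dict are made up for illustration; real FX quantization uses the observer classes shipped with PyTorch's quantization toolkit) that inserts a min/max-recording call after every submodule's output:

```python
import torch
import torch.fx
from torch.fx import symbolic_trace

stats = {}  # node name -> (min, max): the range info PTQ calibration collects

def observe(x, name):
    # pass-through "observer": record the activation range, return x unchanged
    stats[name] = (float(x.min()), float(x.max()))
    return x

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(4, 8)
        self.fc2 = torch.nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

traced = symbolic_trace(Net())
for node in list(traced.graph.nodes):
    if node.op == "call_module":            # observe every submodule's output
        with traced.graph.inserting_after(node):
            obs = traced.graph.call_function(observe, args=(node, node.name))
        node.replace_all_uses_with(obs)     # consumers now read the observer
        obs.args = (node, node.name)        # restore obs's own input (clobbered above)
traced.recompile()

traced(torch.rand(2, 4))                    # one "calibration" forward pass
print(stats)                                # ranges recorded for fc1 and fc2
```

Swap the recording function for a fake-quantize function and the same insertion mechanism gives you the QAT side of the story.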
fx is already in a stable state in pytorch-1.10, with most of its API settled. I will also use torch.fx to quantize several models and finally run them on TensorRT, covering operators such as convolution, BN, deconvolution, add, concat and so on. The versions used are Pytorch-1.10 and TensorRT-8.2.
Along the way I modified some of fx's source code and added a few ops. I simply took part of the fx module out of the latest PyTorch release, then pip install torch-1.10.0+cu113-cp38-cp38-linux_x86_64.whl, and used the two together.
Differences from TorchScript
Actually, when torch.fx first appeared I also wondered how the two differ. Both first analyze the model, then generate an IR, then do some optimization based on that IR, and finally produce an optimized model; is one simply the Python version and the other the C++ version? It is certainly not that simple. Once you have used FX enough, you will find that FX and torchscript are positioned differently: FX focuses more on changing the model's functionality (e.g. batch modifications, changing an operation, adding statistics collection, quantization), while torchscript focuses more on optimizing the performance of the current model, and it can leave Python and run purely in a C++ environment.
To borrow a reply from one of the PyTorch developers:
torch.fx is different from TorchScript in that it is a platform for Python-to-Python transformations of PyTorch code. TorchScript, on the other hand, is more targeted at moving PyTorch programs outside of Python for deployment purposes. In this sense, FX and TorchScript are orthogonal to each other, and can even be composed with each other (e.g. transform PyTorch programs with FX, then subsequently export to TorchScript for deployment).
The gist is that FX only does Python-to-Python transformations; unlike TorchScript, it is not a conversion aimed at deployment (leaving the Python environment and running in C++). The two do not conflict: a model transformed with FX can still be converted with torchscript afterwards; they are orthogonal.
Python to Python?
But note this: FX's code generation is Python to Python. In other words, the code FX generates is no different from a network we build with nn.Module as usual, and it can be run directly in PyTorch's eager mode. It is not like torchscript, which is a separate runtime (when we run torchscript, what actually gets invoked is a VM, i.e. a virtual machine, and the exported torchscript model runs in C++ through that VM).
Therefore the model type after an fx transformation is exactly the same as nn.Module, so anything you can do with an nn.Module you can also do with the transformed model, and we can nest transforms like Russian dolls:
- hand-written Module -> Module after fx -> further fx transforms -> final fx model
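A quick sketch of that nesting, using only symbolic_trace: a GraphModule runs in plain eager mode like any nn.Module, and can itself be traced again:

```python
import torch
from torch.fx import symbolic_trace

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x).relu()

gm1 = symbolic_trace(MyModule())   # trace once -> a GraphModule
out = gm1(torch.rand(2, 4))        # runs in ordinary eager mode, no separate runtime
gm2 = symbolic_trace(gm1)          # a GraphModule is still an nn.Module: trace it again
```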
FX's IR vs. JIT's IR
The two IRs are different. Compared with the JIT IR, FX's IR has two advantages:
- FX is tightly integrated into the Python runtime, because FX can capture program representations more accurately, unlike tracing, which sometimes goes wrong.
- FX's Graph is not that different from an nn.Module; its IR is not so low-level, so it is easier to use and more productive.
Here is a brief list of FX's IR. It is quite simple: there are only six node kinds, roughly covering calling functions, fetching attrs, and getting inputs and outputs:
- `placeholder` represents a function input. The `name` attribute specifies the name this value will take on. `target` is similarly the name of the argument. `args` holds either: 1) nothing, or 2) a single argument denoting the default parameter of the function input. `kwargs` is don't-care. Placeholders correspond to the function parameters (e.g. x) in the graph printout.
- `get_attr` retrieves a parameter from the module hierarchy. `name` is similarly the name the result of the fetch is assigned to. `target` is the fully-qualified name of the parameter's position in the module hierarchy. `args` and `kwargs` are don't-care.
- `call_function` applies a free function to some values. `name` is similarly the name of the value to assign to. `target` is the function to be applied. `args` and `kwargs` represent the arguments to the function, following the Python calling convention.
- `call_module` applies a module in the module hierarchy's forward() method to given arguments. `name` is as previous. `target` is the fully-qualified name of the module in the module hierarchy to call. `args` and `kwargs` represent the arguments to invoke the module on, excluding the self argument.
- `call_method` calls a method on a value. `name` is as similar. `target` is the string name of the method to apply to the self argument. `args` and `kwargs` represent the arguments to invoke the method on, including the self argument.
- `output` contains the output of the traced function in its `args[0]` attribute. This corresponds to the "return" statement in the Graph printout.
Compared with torchscript's IR, FX's is much simpler and easy for us to understand and use.
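To see all six node kinds at once, we can trace the earlier MyModule and walk its graph; every node exposes op, target, and name:

```python
import torch
from torch.fx import symbolic_trace

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.param = torch.nn.Parameter(torch.rand(3, 4))
        self.linear = torch.nn.Linear(4, 5)

    def forward(self, x):
        return self.linear(x + self.param).clamp(min=0.0, max=1.0)

traced = symbolic_trace(MyModule())
for node in traced.graph.nodes:
    # one line per IR node: its kind, its target, and its SSA-style name
    print(f"{node.op:15} target={str(node.target):25} name={node.name}")
```

This one small forward conveniently exercises every node kind: placeholder (x), get_attr (param), call_function (the +), call_module (linear), call_method (clamp), and output.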