PyTorch neural network
2022-06-26 08:54:00 【Thick Cub with thorns】
PyTorch deep learning
RNN: recurrent neural networks in PyTorch
RNN
Each step of the network builds on the output of the previous step, so information from earlier steps contributes to later ones.
This lets the network accept time-series inputs with a wide range of sequential structure.
LSTM RNN
LSTM stands for long short-term memory.
An ordinary RNN tends to lose information from the beginning of a sequence: during backpropagation, the contribution of the early steps shrinks at every step.
This makes the gradient vanish, which is also called gradient dispersion.
The repeated multiplications can also blow the gradient up towards infinity, which is called gradient explosion.
Therefore an ordinary RNN cannot solve problems that require remembering key information from far back in the sequence.
An LSTM adds an input gate, an output gate, and a forget gate as controllers.
Based on how important each piece of information is, these gates decide what is written into, read out of, and kept in the recurrent state.
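As a minimal standalone sketch of the tensor shapes that nn.LSTM expects and returns (the sizes below are arbitrary and chosen only for illustration; the full MNIST example follows in the next section):

import torch
from torch import nn

lstm = nn.LSTM(input_size=28, hidden_size=64, num_layers=1, batch_first=True)
x = torch.randn(4, 10, 28)       # (batch, time_step, input_size)
r_out, (h_n, c_n) = lstm(x)      # no initial state is given, so it defaults to zeros
print(r_out.shape)               # torch.Size([4, 10, 64]): the output at every time step
print(h_n.shape, c_n.shape)      # torch.Size([1, 4, 64]) each: the final hidden and cell state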
PyTorch implementation
Classification problem
import torch
from torch import nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
# torch.manual_seed(1) # reproducible
# Hyper Parameters
EPOCH = 1 # train the training data n times, to save time, we just train 1 epoch
BATCH_SIZE = 64
TIME_STEP = 28 # rnn time step / image height
INPUT_SIZE = 28 # rnn input size / image width
LR = 0.01 # learning rate
DOWNLOAD_MNIST = True # set to True if you haven't downloaded the data yet
# Mnist digital dataset
train_data = dsets.MNIST(
    root='./mnist/',
    train=True,                         # this is training data
    transform=transforms.ToTensor(),    # converts a PIL.Image or numpy.ndarray to a torch.FloatTensor
                                        # of shape (C x H x W), normalized to the range [0.0, 1.0]
    download=DOWNLOAD_MNIST,            # download it if you don't have it
)
# plot one example
print(train_data.data.size())      # (60000, 28, 28)
print(train_data.targets.size())   # (60000)
plt.imshow(train_data.data[0].numpy(), cmap='gray')
plt.title('%i' % train_data.targets[0])
plt.show()
# Data Loader for easy mini-batch return in training
train_loader = torch.utils.data.DataLoader(dataset=train_data, batch_size=BATCH_SIZE, shuffle=True)
# pick 2000 samples from the test data to speed up testing
test_data = dsets.MNIST(root='./mnist/', train=False, transform=transforms.ToTensor())
test_x = test_data.data.type(torch.FloatTensor)[:2000]/255.   # shape (2000, 28, 28), values in range [0, 1]
test_y = test_data.targets.numpy()[:2000]                     # convert to numpy array
class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(         # if nn.RNN() is used instead, it hardly learns
            input_size=INPUT_SIZE,
            hidden_size=64,         # rnn hidden units
            num_layers=1,           # number of rnn layers
            batch_first=True,       # input & output tensors have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(64, 10)

    def forward(self, x):
        # x shape (batch, time_step, input_size)
        # r_out shape (batch, time_step, hidden_size)
        # h_n shape (n_layers, batch, hidden_size)
        # h_c shape (n_layers, batch, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)   # None represents a zero initial hidden state
        # take r_out at the last time step
        out = self.out(r_out[:, -1, :])
        return out
rnn = RNN()
print(rnn)
Output:
RNN(
(rnn): LSTM(28, 64, batch_first=True)
(out): Linear(in_features=64, out_features=10, bias=True)
)
Optimization and training
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)   # optimize all rnn parameters
loss_func = nn.CrossEntropyLoss() # the target label is not one-hotted
# training and testing
for epoch in range(EPOCH):
    for step, (b_x, b_y) in enumerate(train_loader):    # gives batch data
        b_x = b_x.view(-1, 28, 28)                      # reshape x to (batch, time_step, input_size)
        output = rnn(b_x)                               # rnn output
        loss = loss_func(output, b_y)                   # cross entropy loss
        optimizer.zero_grad()                           # clear gradients for this training step
        loss.backward()                                 # backpropagation, compute gradients
        optimizer.step()                                # apply gradients
        if step % 50 == 0:
            test_output = rnn(test_x)                   # (samples, time_step, input_size)
            pred_y = torch.max(test_output, 1)[1].data.numpy()
            accuracy = float((pred_y == test_y).astype(int).sum()) / float(test_y.size)
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)
# print 10 predictions from test data
test_output = rnn(test_x[:10].view(-1, 28, 28))
pred_y = torch.max(test_output, 1)[1].data.numpy()
print(pred_y, 'prediction number')
print(test_y[:10], 'real number')
Output:
Epoch: 0 | train loss: 2.2896 | test accuracy: 0.12
Epoch: 0 | train loss: 0.8098 | test accuracy: 0.60
Epoch: 0 | train loss: 0.6983 | test accuracy: 0.73
Epoch: 0 | train loss: 0.5486 | test accuracy: 0.81
Epoch: 0 | train loss: 0.7209 | test accuracy: 0.85
Epoch: 0 | train loss: 0.2399 | test accuracy: 0.87
Epoch: 0 | train loss: 0.4179 | test accuracy: 0.90
Epoch: 0 | train loss: 0.5278 | test accuracy: 0.88
Epoch: 0 | train loss: 0.3201 | test accuracy: 0.90
Epoch: 0 | train loss: 0.1950 | test accuracy: 0.92
Epoch: 0 | train loss: 0.2301 | test accuracy: 0.92
Epoch: 0 | train loss: 0.1683 | test accuracy: 0.94
Epoch: 0 | train loss: 0.1188 | test accuracy: 0.93
Epoch: 0 | train loss: 0.0566 | test accuracy: 0.95
Epoch: 0 | train loss: 0.0941 | test accuracy: 0.94
Epoch: 0 | train loss: 0.3501 | test accuracy: 0.95
Epoch: 0 | train loss: 0.0342 | test accuracy: 0.93
Epoch: 0 | train loss: 0.0753 | test accuracy: 0.96
Epoch: 0 | train loss: 0.1507 | test accuracy: 0.96
[7 2 1 0 4 1 4 9 6 9] prediction number
[7 2 1 0 4 1 4 9 5 9] real number
Regression problem
import torch
from torch import nn
import numpy as np
import matplotlib.pyplot as plt
# torch.manual_seed(1) # reproducible
# Hyper Parameters
TIME_STEP = 10 # rnn time step
INPUT_SIZE = 1 # rnn input size
LR = 0.02 # learning rate
# show data
steps = np.linspace(0, np.pi * 2, 100, dtype=np.float32)   # float32 for converting to torch FloatTensor
x_np = np.sin(steps)
y_np = np.cos(steps)
plt.plot(steps, y_np, 'r-', label='target (cos)')
plt.plot(steps, x_np, 'b-', label='input (sin)')
plt.legend(loc='best')
plt.show()
class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.RNN(
            input_size=INPUT_SIZE,
            hidden_size=32,      # rnn hidden units
            num_layers=1,        # number of rnn layers
            batch_first=True,    # input & output tensors have batch size as the first dimension, e.g. (batch, time_step, input_size)
        )
        self.out = nn.Linear(32, 1)

    def forward(self, x, h_state):
        # x (batch, time_step, input_size)
        # h_state (n_layers, batch, hidden_size)
        # r_out (batch, time_step, hidden_size)
        r_out, h_state = self.rnn(x, h_state)
        outs = []                                   # save all predictions
        for time_step in range(r_out.size(1)):      # calculate the output for each time step
            outs.append(self.out(r_out[:, time_step, :]))
        return torch.stack(outs, dim=1), h_state

        # instead, for simplicity, you can replace the code above with the following
        # r_out = r_out.view(-1, 32)
        # outs = self.out(r_out)
        # outs = outs.view(-1, TIME_STEP, 1)
        # return outs, h_state

        # or even simpler, since nn.Linear accepts inputs of any dimension
        # and returns outputs with the same shape except for the last dimension
        # outs = self.out(r_out)
        # return outs, h_state
rnn = RNN()
print(rnn)
optimizer = torch.optim.Adam(rnn.parameters(), lr=LR)   # optimize all rnn parameters
loss_func = nn.MSELoss()
h_state = None # for initial hidden state
plt.figure(1, figsize=(12, 5))
plt.ion() # continuously plot
for step in range(100):
    start, end = step * np.pi, (step + 1) * np.pi    # time range
    # use sin to predict cos
    steps = np.linspace(start, end, TIME_STEP, dtype=np.float32,
                        endpoint=False)              # float32 for converting to torch FloatTensor
    x_np = np.sin(steps)
    y_np = np.cos(steps)
    x = torch.from_numpy(x_np[np.newaxis, :, np.newaxis])    # shape (batch, time_step, input_size)
    y = torch.from_numpy(y_np[np.newaxis, :, np.newaxis])
    prediction, h_state = rnn(x, h_state)            # rnn output
    # !! the next step is important !!
    h_state = h_state.data       # repack the hidden state, break the connection to the last iteration's graph
    loss = loss_func(prediction, y)                  # calculate loss
    optimizer.zero_grad()                            # clear gradients for this training step
    loss.backward()                                  # backpropagation, compute gradients
    optimizer.step()                                 # apply gradients
    # plotting
    plt.plot(steps, y_np.flatten(), 'r-')
    plt.plot(steps, prediction.data.numpy().flatten(), 'b-')
    plt.draw()
    plt.pause(0.05)

plt.ioff()
plt.show()
Output:
RNN(
(rnn): RNN(1, 32, batch_first=True)
(out): Linear(in_features=32, out_features=1, bias=True)
)
AutoEncoder (autoencoder)
The original data is first compressed (encoded) and then decompressed (decoded) to produce the output.
The network is trained by backpropagating the reconstruction error between the output and the original input.
This is a form of unsupervised learning and is more expressive than PCA.
After training, the encoder captures the essential features of the original data in its compressed representation.
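A minimal sketch of such an autoencoder (the layer sizes and the flattened 28*28 input are assumptions made here for illustration, not something the notes above prescribe):

import torch
from torch import nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super(AutoEncoder, self).__init__()
        self.encoder = nn.Sequential(       # compress the 784-dim input into a small code
            nn.Linear(28 * 28, 64),
            nn.Tanh(),
            nn.Linear(64, 3),               # 3-dim compressed representation (assumed size)
        )
        self.decoder = nn.Sequential(       # decompress the code back to 784 dims
            nn.Linear(3, 64),
            nn.Tanh(),
            nn.Linear(64, 28 * 28),
            nn.Sigmoid(),                   # output in [0, 1], matching normalized pixels
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return encoded, decoded

autoencoder = AutoEncoder()
x = torch.rand(8, 28 * 28)                  # a dummy batch of flattened images
encoded, decoded = autoencoder(x)
loss = nn.MSELoss()(decoded, x)             # the reconstruction error drives the training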
Reinforcement learning
- Deep Q Network(DQN)
- GAN (generative adversarial network): it turns meaningless random numbers into data, and its two parts improve each other (see the sketch below)
- the generator produces the data, and the discriminator judges whether it looks real
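A minimal sketch of this generator/discriminator pairing (the layer sizes, the 100-dim noise vector, and the 1-dim data being generated are all assumptions made for illustration):

import torch
from torch import nn

G = nn.Sequential(                  # generator: random noise -> fake data
    nn.Linear(100, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
)
D = nn.Sequential(                  # discriminator: data -> probability of being real
    nn.Linear(1, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
    nn.Sigmoid(),
)

noise = torch.randn(64, 100)        # meaningless random numbers as input
fake = G(noise)                     # the generator produces data
prob_fake = D(fake)                 # the discriminator judges it
# During training, D is pushed to tell real data from fake data, while G is pushed
# to fool D, so the two networks improve each other.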
PyTorch builds its computation graph dynamically, so the network structure can change between forward passes.
Training can be accelerated with a GPU.
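A minimal sketch of moving a model and a batch of data to the GPU (the tiny net and x here are placeholders rather than models from this article; the code falls back to the CPU when no GPU is available):

import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
net = nn.Linear(10, 2).to(device)     # any nn.Module can be moved to the device this way
x = torch.randn(64, 10).to(device)    # inputs must live on the same device as the model
output = net(x)                       # the forward pass runs on the GPU when one is available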
Mitigating overfitting (over fitting)
Add a Dropout layer to the network:
net_dropped = torch.nn.Sequential(
    torch.nn.Linear(1, N_HIDDEN),
    torch.nn.Dropout(0.5),    # randomly drop 50% of the units to mitigate overfitting
    torch.nn.ReLU(),
)
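Note that dropout should only be active during training; a short sketch of switching it on and off (assuming the net_dropped defined above and a hypothetical test batch test_x):

net_dropped.train()        # enable dropout for training
# ... training loop ...
net_dropped.eval()         # disable dropout for evaluation / prediction
# test_prediction = net_dropped(test_x)   # test_x is a hypothetical test batch
net_dropped.train()        # switch back before continuing training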
Batch normalization (Batch Normalization)
Activation functions are insensitive to inputs with large magnitudes (they saturate), so the data needs to be normalized.
This applies not only to the input layer but also to the hidden layers.
A batch normalization layer sits between a fully connected layer and the activation function that follows it.
Internally it consists of a normalization step and a de-normalization (scale and shift) step.
class Net(nn.Module):
    def __init__(self, batch_normalization=False):
        super(Net, self).__init__()
        self.do_bn = batch_normalization
        self.fcs = []
        self.bns = []
        self.bn_input = nn.BatchNorm1d(1, momentum=0.5)      # BN for the input data

        for i in range(N_HIDDEN):                            # build hidden layers and their BN layers
            input_size = 1 if i == 0 else 10
            fc = nn.Linear(input_size, 10)
            setattr(self, 'fc%i' % i, fc)                    # important: register the layer on the Module
            self._set_init(fc)                               # parameter initialization (helper defined elsewhere)
            self.fcs.append(fc)
            if self.do_bn:
                bn = nn.BatchNorm1d(10, momentum=0.5)
                setattr(self, 'bn%i' % i, bn)                # important: register the BN layer on the Module
                self.bns.append(bn)

        self.predict = nn.Linear(10, 1)                      # output layer
        self._set_init(self.predict)                         # parameter initialization
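The __init__ above only builds the layers. A sketch of the matching forward method, meant to sit inside the same Net class (N_HIDDEN and ACTIVATION, e.g. ACTIVATION = torch.tanh, are assumed module-level constants):

    def forward(self, x):
        if self.do_bn:
            x = self.bn_input(x)        # normalize the raw input
        for i in range(N_HIDDEN):
            x = self.fcs[i](x)          # fully connected layer
            if self.do_bn:
                x = self.bns[i](x)      # batch normalization between the linear layer and the activation
            x = ACTIVATION(x)           # activation function
        return self.predict(x)          # output layer (no BN / activation)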