PyTorch builds a CNN-LSTM hybrid model for multivariate, multi-step time series forecasting (load forecasting)
2022-06-26 06:56:00 【Cyril_KI】
I. Preface
For the specific principles of LSTM, see the Artificial Intelligence Tutorial. Besides LSTM, that site also gives detailed explanations of most other machine learning and deep learning models, with vivid illustrations that are simple and easy to understand.
I have written many articles on time series forecasting:
- In-depth understanding of the input and output of LSTM in PyTorch (from input to Linear output)
- Building an LSTM in PyTorch for time series forecasting (load forecasting)
- Building an LSTM in PyTorch for multivariate time series forecasting (load forecasting)
- Building a bidirectional LSTM in PyTorch for time series forecasting (load forecasting)
- Building an LSTM in PyTorch for multivariate, multi-step time series forecasting (1): direct multi-output
- Building an LSTM in PyTorch for multivariate, multi-step time series forecasting (2): single-step rolling prediction
- Building an LSTM in PyTorch for multivariate, multi-step time series forecasting (3): multi-model single-step prediction
- Building an LSTM in PyTorch for multivariate, multi-step time series forecasting (4): multi-model rolling prediction
- Building an LSTM in PyTorch for multivariate, multi-step time series forecasting (5): seq2seq
- A summary of several methods for multi-step time series forecasting with LSTM in PyTorch (load forecasting)
- How PyTorch-LSTM predicts true future values in time series forecasting
- Building an LSTM in PyTorch for multivariate-input, multivariate-output time series forecasting (multi-task learning)
- Building an ANN in PyTorch for time series forecasting (wind speed prediction)
- Building a CNN in PyTorch for time series forecasting (wind speed prediction)
- Building a CNN-LSTM hybrid model in PyTorch for multivariate, multi-step time series forecasting (load forecasting)
The articles above cover three kinds of models for time series forecasting: LSTM, ANN, and CNN. As is well known, CNNs are very good at extracting features, so many recent papers combine CNN and LSTM for time series forecasting. This article uses PyTorch to build a simple CNN-LSTM hybrid model for load forecasting.
II. CNN-LSTM
The CNN-LSTM model is defined as follows:
class CNN_LSTM(nn.Module):
    def __init__(self, args):
        super(CNN_LSTM, self).__init__()
        self.args = args
        self.relu = nn.ReLU(inplace=True)
        # (batch_size=30, seq_len=24, input_size=7) ---> permute(0, 2, 1)
        # (30, 7, 24)
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels=args.in_channels, out_channels=args.out_channels, kernel_size=3),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=1)
        )
        # (batch_size=30, out_channels=32, seq_len-4=20) ---> permute(0, 2, 1)
        # (30, 20, 32)
        self.lstm = nn.LSTM(input_size=args.out_channels, hidden_size=args.hidden_size,
                            num_layers=args.num_layers, batch_first=True)
        self.fc = nn.Linear(args.hidden_size, args.output_size)

    def forward(self, x):
        x = x.permute(0, 2, 1)
        x = self.conv(x)
        x = x.permute(0, 2, 1)
        x, _ = self.lstm(x)
        x = self.fc(x)
        x = x[:, -1, :]
        return x
As you can see, this CNN-LSTM consists of a one-dimensional convolution followed by an LSTM.
From "Building a CNN in PyTorch for time series forecasting (wind speed prediction)" we know that the one-dimensional convolution layer has the signature:
nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
The one-dimensional convolution in this model is defined as:
nn.Conv1d(in_channels=args.in_channels, out_channels=args.out_channels, kernel_size=3)
The in_channels here plays the same role as the embedding dimension in natural language processing, so the number of input channels is 7, representing the load plus 6 other environmental variables. out_channels can be set freely and is set to 32 in this article; kernel_size is set to 3.
In PyTorch, the input shape expected by a one-dimensional convolution is:
input(batch_size, input_size, seq_len)=(30, 7, 24)
while the data produced by our preprocessing has shape:
input(batch_size, seq_len, input_size)=(30, 24, 7)
Therefore, we need to swap the last two dimensions:
x = x.permute(0, 2, 1)
After the swap, the data matches the input format the CNN expects.
The convolution in Conv1d slides along the seq_len dimension, i.e. the last dimension of (30, 7, 24). Therefore, after passing through:
nn.Conv1d(in_channels=args.in_channels, out_channels=args.out_channels, kernel_size=3)
the data shape becomes:
(30, 32, 24-3+1)=(30, 32, 22)
The first dimension, batch_size, is unchanged; the second dimension changes from in_channels=7 to out_channels=32; and the third dimension is reduced by the convolution from 24 to 22.
After the max-pooling layer (kernel_size=3, stride=1), the shape becomes:
(30, 32, 22-3+1)=(30, 32, 20)
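These shape calculations are easy to verify with a dummy tensor; a minimal sketch, assuming the same in_channels=7 and out_channels=32 as above:

import torch
import torch.nn as nn

x = torch.randn(30, 7, 24)  # (batch_size, in_channels, seq_len)
conv = nn.Conv1d(in_channels=7, out_channels=32, kernel_size=3)
pool = nn.MaxPool1d(kernel_size=3, stride=1)
print(conv(x).shape)        # torch.Size([30, 32, 22]), since 24 - 3 + 1 = 22
print(pool(conv(x)).shape)  # torch.Size([30, 32, 20]), since 22 - 3 + 1 = 20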
This (30, 32, 20) tensor then serves as the input to the LSTM. Since we set batch_first=True, the LSTM expects input of shape:
input(batch_size, seq_len, input_size)
but the shape after convolution and pooling is:
input(batch_size=30, input_size=32, seq_len=20)
so another dimension swap is required:
x = x.permute(0, 2, 1)
What follows is the standard LSTM input/output handling, which I won't repeat here.
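Still, as a quick reference, here is a minimal sketch of the LSTM step in isolation, assuming hidden_size=64 (in the model this value comes from args.hidden_size):

import torch
import torch.nn as nn

x = torch.randn(30, 20, 32)  # (batch_size, seq_len, input_size) after the permute
lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=1, batch_first=True)
out, (h, c) = lstm(x)
print(out.shape)  # torch.Size([30, 20, 64]): one hidden state per time step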
Putting everything together, the complete forward function looks like this:
def forward(self, x):
    x = x.permute(0, 2, 1)
    x = self.conv(x)
    x = x.permute(0, 2, 1)
    x, _ = self.lstm(x)
    x = self.fc(x)
    x = x[:, -1, :]
    return x
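As a final sanity check, we can push a dummy batch through the whole model. The args object below is a hypothetical stand-in for the script's actual argument parser; the field names are the ones read in the class definition:

from types import SimpleNamespace

import torch

args = SimpleNamespace(in_channels=7, out_channels=32, hidden_size=64,
                       num_layers=1, output_size=4)
model = CNN_LSTM(args)
x = torch.randn(30, 24, 7)  # (batch_size, seq_len, input_size)
print(model(x).shape)       # torch.Size([30, 4]): a 4-step forecast per sample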
III. Code implementation
3.1 Data processing
We use the load of the previous 24 time steps, together with the environmental variables at those times, to predict the load of the next 4 time steps. The direct multi-output strategy is adopted here: with seq_len=24 and output_size=4, sample i uses rows i to i+23 as input and the loads at i+24 to i+27 as labels. Adjusting output_size changes the number of output steps.
Code implementation:
def nn_seq(args):
    seq_len, B, num = args.seq_len, args.batch_size, args.output_size
    print('data processing...')
    dataset = load_data()
    # split 60/20/20 into train/validation/test
    train = dataset[:int(len(dataset) * 0.6)]
    val = dataset[int(len(dataset) * 0.6):int(len(dataset) * 0.8)]
    test = dataset[int(len(dataset) * 0.8):len(dataset)]
    # normalization constants taken from the training set only
    m, n = np.max(train[train.columns[1]]), np.min(train[train.columns[1]])

    def process(data, batch_size, step_size):
        load = data[data.columns[1]]
        data = data.values.tolist()
        load = (load - n) / (m - n)
        load = load.tolist()
        seq = []
        for i in range(0, len(data) - seq_len - num, step_size):
            train_seq = []
            train_label = []
            for j in range(i, i + seq_len):
                # each time step: normalized load plus 6 environmental variables
                x = [load[j]]
                for c in range(2, 8):
                    x.append(data[j][c])
                train_seq.append(x)
            for j in range(i + seq_len, i + seq_len + num):
                train_label.append(load[j])
            train_seq = torch.FloatTensor(train_seq)
            train_label = torch.FloatTensor(train_label).view(-1)
            seq.append((train_seq, train_label))
        seq = MyDataset(seq)
        seq = DataLoader(dataset=seq, batch_size=batch_size, shuffle=False, num_workers=0, drop_last=False)
        return seq

    Dtr = process(train, B, step_size=1)
    Val = process(val, B, step_size=1)
    Dte = process(test, B, step_size=num)
    return Dtr, Val, Dte, m, n
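The MyDataset class is not shown in this article. A minimal sketch of what such a wrapper typically looks like (an assumption; the author's actual class may differ):

from torch.utils.data import Dataset

class MyDataset(Dataset):
    # Assumed: a thin wrapper around the list of (seq, label) tensor pairs.
    def __init__(self, data):
        self.data = data

    def __getitem__(self, item):
        return self.data[item]

    def __len__(self):
        return len(self.data)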
3.2 Model training/testing
These are the same as in the previous articles:
def train(args, Dtr, Val, path):
    model = CNN_LSTM(args).to(args.device)
    loss_function = nn.MSELoss().to(args.device)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    print('training...')
    epochs = 50
    min_epochs = 10
    best_model = None
    min_val_loss = 5
    for epoch in range(epochs):
        train_loss = []
        for batch_idx, (seq, target) in enumerate(Dtr, 0):
            seq, target = seq.to(args.device), target.to(args.device)
            optimizer.zero_grad()
            y_pred = model(seq)
            loss = loss_function(y_pred, target)
            train_loss.append(loss.item())
            loss.backward()
            optimizer.step()
        # validation
        val_loss = get_val_loss(args, model, Val)
        if epoch + 1 >= min_epochs and val_loss < min_val_loss:
            min_val_loss = val_loss
            best_model = copy.deepcopy(model)
        print('epoch {:03d} train_loss {:.8f} val_loss {:.8f}'.format(epoch, np.mean(train_loss), val_loss))
        model.train()
    state = {'model': best_model.state_dict(), 'optimizer': optimizer.state_dict()}
    torch.save(state, path)
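The get_val_loss helper is not shown here; a minimal sketch of the validation loop it might contain, inferred from how it is called above (an assumption, not the author's exact code):

def get_val_loss(args, model, Val):
    # Assumed: average MSE over the validation set. Note it puts the model
    # in eval mode, which is why train() calls model.train() each epoch.
    model.eval()
    loss_function = nn.MSELoss().to(args.device)
    val_loss = []
    with torch.no_grad():
        for seq, target in Val:
            seq, target = seq.to(args.device), target.to(args.device)
            y_pred = model(seq)
            val_loss.append(loss_function(y_pred, target).item())
    return np.mean(val_loss)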
The test function:

def test(args, Dte, path, m, n):
    print('loading model...')
    model = CNN_LSTM(args).to(args.device)
    model.load_state_dict(torch.load(path)['model'])
    model.eval()
    pred = []
    y = []
    for batch_idx, (seq, target) in enumerate(Dte, 0):
        seq = seq.to(args.device)
        with torch.no_grad():
            target = list(chain.from_iterable(target.tolist()))
            y.extend(target)
            y_pred = model(seq)
            y_pred = list(chain.from_iterable(y_pred.data.tolist()))
            pred.extend(y_pred)
    y, pred = np.array(y), np.array(pred)
    # de-normalize back to the original load scale
    y = (m - n) * y + n
    pred = (m - n) * pred + n
    print('mape:', get_mape(y, pred))
    # plot 150 points of the true and predicted curves
    x = [i for i in range(1, 151)]
    x_smooth = np.linspace(np.min(x), np.max(x), 900)
    y_smooth = make_interp_spline(x, y[150:300])(x_smooth)
    plt.plot(x_smooth, y_smooth, c='green', marker='*', ms=1, alpha=0.75, label='true')
    y_smooth = make_interp_spline(x, pred[150:300])(x_smooth)
    plt.plot(x_smooth, y_smooth, c='red', marker='o', ms=1, alpha=0.75, label='pred')
    plt.grid(axis='y')
    plt.legend()
    plt.show()
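The get_mape helper is likewise a small utility; a minimal sketch (an assumption; the author's version may handle zero values differently):

def get_mape(y, pred):
    # Mean absolute percentage error over all predicted points.
    return np.mean(np.abs((y - pred) / y))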
3.3 Experimental results
Predicting the next 4 time steps from the previous 24, the model achieves a MAPE of 7.41%. The plotting code above draws the true and predicted curves for comparison.
IV. Source code and data
I may make the source code and data public in the future.