5. CNN-LSTMΒΆ


이전 4μž₯μ—μ„œλŠ” LSTM을 ν™œμš©ν•˜μ—¬ λŒ€ν•œλ―Όκ΅­ μ½”λ‘œλ‚˜19 ν™•μ§„μž 수λ₯Ό μ˜ˆμΈ‘ν•΄λ³΄μ•˜μŠ΅λ‹ˆλ‹€. LSTM은 Hochreiter & Schmidhuber (1997)에 μ˜ν•΄ 처음 μ†Œκ°œλ˜μ—ˆκ³ , 이후 지속적인 연ꡬ와 ν•¨κ»˜ λ°œμ „ν•΄μ˜€κ³  μžˆμŠ΅λ‹ˆλ‹€.

이번 μž₯μ—μ„œλŠ” λͺ¨λΈ μ„±λŠ₯ ν–₯상을 μœ„ν•œ λ‹€λ₯Έ 방법을 μ‹€ν—˜ν•΄λ³΄λ„λ‘ ν•˜κ² μŠ΅λ‹ˆλ‹€. λͺ¨λΈ μ„±λŠ₯ ν–₯상을 μœ„ν•΄μ„œλŠ” 배치 (batch) μ‚¬μ΄μ¦ˆμ™€ 에폭 (epoch) 수 λ³€κ²½, 데이터셋 νλ ˆμ΄νŒ…(Dataset Curating), 데이터셋 λΉ„μœ¨ μ‘°μ •, 손싀 ν•¨μˆ˜(Loss Function) λ³€κ²½, λͺ¨λΈ λ³€κ²½ λ“±μ˜ 방법이 μžˆμ§€λ§Œ, 이번 μ‹€μŠ΅μ—μ„œλŠ” λͺ¨λΈ ꡬ쑰 변경을 톡해 μ„±λŠ₯을 ν–₯상을 μ§„ν–‰ν•΄λ³΄κ² μŠ΅λ‹ˆλ‹€. CNN-LSTM λͺ¨λΈμ„ μ‚¬μš©ν•˜μ—¬, λŒ€ν•œλ―Όκ΅­ μ½”λ‘œλ‚˜19 ν™•μ§„μž 수 μ˜ˆμΈ‘μ— μžˆμ–΄μ„œ 더 λ‚˜μ€ μ„±λŠ₯을 보일 수 μžˆλŠ”μ§€ μ‚΄νŽ΄λ³΄λ„λ‘ ν•˜κ² μŠ΅λ‹ˆλ‹€.

First of all, let's import the libraries needed for this chapter. We will use the basics, torch, numpy, and pandas, together with tqdm for displaying progress and the visualization libraries pylab and matplotlib.

import torch
import os
import numpy as np
import pandas as pd
from tqdm import tqdm
import seaborn as sns
from pylab import rcParams
import matplotlib.pyplot as plt
from matplotlib import rc
from sklearn.preprocessing import MinMaxScaler
from pandas.plotting import register_matplotlib_converters
from torch import nn, optim

%matplotlib inline
%config InlineBackend.figure_format='retina'

sns.set(style='whitegrid', palette='muted', font_scale=1.2)
rcParams['figure.figsize'] = 14, 10
register_matplotlib_converters()
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
torch.manual_seed(RANDOM_SEED)
<torch._C.Generator at 0x7f782b5c3b88>

5.1 Dataset Download and PreprocessingΒΆ

λͺ¨λΈλ§ μ‹€μŠ΅μ„ μœ„ν•΄ λŒ€ν•œλ―Όκ΅­ μ½”λ‘œλ‚˜ λˆ„μ  ν™•μ§„μž 데이터λ₯Ό λΆˆλŸ¬μ˜€κ² μŠ΅λ‹ˆλ‹€. 2.1μ ˆμ— λ‚˜μ˜¨ μ½”λ“œλ₯Ό ν™œμš©ν•˜κ² μŠ΅λ‹ˆλ‹€.

!git clone https://github.com/Pseudo-Lab/Tutorial-Book-Utils
!python Tutorial-Book-Utils/PL_data_loader.py --data COVIDTimeSeries
!unzip -q COVIDTimeSeries.zip
Cloning into 'Tutorial-Book-Utils'...
remote: Enumerating objects: 24, done.
remote: Counting objects: 100% (24/24), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 24 (delta 6), reused 14 (delta 3), pack-reused 0
Unpacking objects: 100% (24/24), done.
COVIDTimeSeries.zip is done!

We load the confirmed case data with the pandas library and then apply the preprocessing practiced in Chapter 3. The dataset covers the period from January 22, 2020 to December 18, 2020.

confirmed = pd.read_csv('time_series_covid19_confirmed_global.csv')
confirmed[confirmed['Country/Region']=='Korea, South']
korea = confirmed[confirmed['Country/Region']=='Korea, South'].iloc[:,4:].T
korea.index = pd.to_datetime(korea.index)
daily_cases = korea.diff().fillna(korea.iloc[0]).astype('int')
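
As a quick sanity check (a sketch, not in the original notebook), the preprocessed series should hold one row per day from 2020-01-22 through 2020-12-18:

print(daily_cases.shape)                                 # expected: (332, 1)
print(daily_cases.index.min(), daily_cases.index.max())  # 2020-01-22 ... 2020-12-18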


def create_sequences(data, seq_length):
    xs = []
    ys = []
    for i in range(len(data)-seq_length):
        x = data.iloc[i:(i+seq_length)]
        y = data.iloc[i+seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

seq_length = 5
X, y = create_sequences(daily_cases, seq_length)

# Split into training, validation, and test sets
# (332 daily observations - seq_length of 5 = 327 sequences in total)
train_size = int(327 * 0.8)
X_train, y_train = X[:train_size], y[:train_size]
X_val, y_val = X[train_size:train_size+33], y[train_size:train_size+33]
X_test, y_test = X[train_size+33:], y[train_size+33:]
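
As a quick shape check (a sketch, not in the original notebook): the 332-day series yields 332 - 5 = 327 windows in total, which the code above splits 261 / 33 / 33.

print(X.shape, y.shape)                        # expected: (327, 5, 1) (327, 1)
print(len(X_train), len(X_val), len(X_test))   # expected: 261 33 33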

MIN = X_train.min()
MAX = X_train.max()

def MinMaxScale(array, min, max):
    return (array - min) / (max - min)

# MinMax scaling
X_train = MinMaxScale(X_train, MIN, MAX)
y_train = MinMaxScale(y_train, MIN, MAX)
X_val = MinMaxScale(X_val, MIN, MAX)
y_val = MinMaxScale(y_val, MIN, MAX)
X_test = MinMaxScale(X_test, MIN, MAX)
y_test = MinMaxScale(y_test, MIN, MAX)
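
Since MinMaxScale computes (x - MIN) / (MAX - MIN), the inverse transform is x * (MAX - MIN) + MIN. Below is a minimal sketch of such a helper (hypothetical; not used in the original code), assuming MIN is 0 for this series, which is why later cells simply multiply predictions by MAX.

def MinMaxInverse(array, min, max):
    # with min == 0 this reduces to array * max
    return array * (max - min) + min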

# Convert to tensor form
def make_Tensor(array):
    return torch.from_numpy(array).float()

X_train = make_Tensor(X_train)
y_train = make_Tensor(y_train)
X_val = make_Tensor(X_val)
y_val = make_Tensor(y_val)
X_test = make_Tensor(X_test)
y_test = make_Tensor(y_test)
plt.plot(daily_cases.values)
[<matplotlib.lines.Line2D at 0x7f77cc4d1438>]
[Figure: line plot of daily confirmed cases]

5.2 Defining the CNN-LSTM ModelΒΆ

5.2.1 1D CNN (1-Dimensional Convolutional Neural Network) / Conv1DΒΆ

4μž₯μ—μ„œλŠ” LSTM λͺ¨λΈμ„ μ΄μš©ν•˜μ—¬ ν™•μ§„μž 수 μ˜ˆμΈ‘μ„ ν•˜μ˜€μŠ΅λ‹ˆλ‹€. 이번 μž₯μ—μ„œλŠ” LSTM에 CNN λ ˆμ΄μ–΄λ₯Ό μΆ”κ°€ν•˜μ—¬ μ˜ˆμΈ‘μ„ μ§„ν–‰ν•΄λ³΄κ³ μž ν•©λ‹ˆλ‹€.

CNN λͺ¨λΈμ€ 1D, 2D, 3D둜 λ‚˜λ‰˜λŠ”λ°, 일반적인 CNN은 보톡 이미지 λΆ„λ₯˜μ— μ‚¬μš©λ˜λŠ” 2Dλ₯Ό ν†΅μΉ­ν•©λ‹ˆλ‹€. μ—¬κΈ°μ„œ DλŠ” 차원을 λœ»ν•˜λŠ” dimensional의 μ•½μžλ‘œ, 인풋 데이터 ν˜•νƒœμ— 따라 1D, 2D, 3D ν˜•νƒœμ˜ CNN λͺ¨λΈμ΄ μ‚¬μš©λ©λ‹ˆλ‹€.


Figure 5-1

  • Figure 5-1: Visualization of time-series data (Source: Understanding 1D and 3D Convolution Neural Network | Keras)

Figure 5-1 visualizes how the kernel moves in a 1D CNN: as time flows, the kernel shifts to the right. A 1D CNN is well suited for handling time-series data, and using one allows the model to extract local features from the variables.

5.2.2 Testing the 1D CNNΒΆ

Figure 5-2 Figure 5-3

  • Figures 5-2 & 5-3: 1D CNN visualization

Figures 5-2 and 5-3 visualize the structure of a one-dimensional CNN. As the progression from Figure 5-2 to Figure 5-3 shows, with a stride of 1 the kernel moves one step at a time. Let's now look at the 1D CNN through a short piece of code.

First, we define a 1D CNN layer and store it in c. As in Figures 5-2 & 5-3, we set in_channels=1, out_channels=1, kernel_size=2, and stride=1. We then define an input variable to serve as the input and feed it into c to produce the output.

c = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=2, stride=1)
input = torch.Tensor([[[1,2,3,4,5]]])
output = c(input)
output
tensor([[[-0.3875, -0.8842, -1.3808, -1.8774]]], grad_fn=<SqueezeBackward1>)

Passing the five example inputs through the 1D CNN with kernel_size of 2 produced four values. Let's find out how these values were computed. First, we will inspect the weight and bias values stored in c.
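
For reference, with no padding and a dilation of 1, the output length of a 1D convolution is

\[L_{out} = \left\lfloor \frac{L_{in} - \text{kernel\_size}}{\text{stride}} \right\rfloor + 1,\]

which here gives \((5 - 2)/1 + 1 = 4\).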

for param in c.parameters():
    print(param)
Parameter containing:
tensor([[[-0.1021, -0.3946]]], requires_grad=True)
Parameter containing:
tensor([0.5037], requires_grad=True)

The first value printed is the weight; since kernel_size is 2, there are two weight values in total. The next value is the bias; there is one bias value per 1D CNN layer. We will now store these values in the variables w1, w2, and b.

w_list = []
for param in c.parameters():
    w_list.append(param)

w = w_list[0]
b = w_list[1]

w1 = w[0][0][0]
w2 = w[0][0][1]

print(w1)
print(w2)
print(b)
tensor(-0.1021, grad_fn=<SelectBackward>)
tensor(-0.3946, grad_fn=<SelectBackward>)
Parameter containing:
tensor([0.5037], requires_grad=True)

Through indexing, we stored the weight values in the variables w1, w2, and b. By applying the formula used to compute \(y_1\) and \(y_2\) in Figures 5-2 and 5-3, we can calculate the output values produced by the 1D CNN. The value produced as the 1D CNN filter passes over 3 and 4 is computed as follows.

w1 * 3 + w2 * 4 + b
tensor([-1.3808], grad_fn=<AddBackward0>)

μ΄λŠ” output의 3번째 κ°’κ³Ό κ°™λ‹€λŠ” 것을 μ•Œ 수 있으며, λ‚˜λ¨Έμ§€ 값듀도 이런 λ°©μ‹μœΌλ‘œ κ³„μ‚°λ˜μ—ˆμŒμ„ μ•Œ 수 μžˆμŠ΅λ‹ˆλ‹€.

output
tensor([[[-0.3875, -0.8842, -1.3808, -1.8774]]], grad_fn=<SqueezeBackward1>)
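
As a cross-check (a minimal sketch reusing the w1, w2, and b extracted above), sliding the kernel across the input by hand reproduces all four values:

x = [1, 2, 3, 4, 5]
manual = torch.stack([w1 * x[i] + w2 * x[i + 1] + b for i in range(len(x) - 1)])
print(manual.flatten())  # matches the four values in output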

5.3 Building the CNN-LSTM ModelΒΆ

Now let's build the CNN-LSTM model. The biggest difference from the LSTM model built in Chapter 4 is the addition of a 1D CNN layer. In the code below, you can see that a 1D CNN layer has been added inside the CovidPredictor class via nn.Conv1d.

class CovidPredictor(nn.Module):
    def __init__(self, n_features, n_hidden, seq_len, n_layers):
        super(CovidPredictor, self).__init__()
        self.n_hidden = n_hidden
        self.seq_len = seq_len
        self.n_layers = n_layers
        self.c1 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size = 2, stride = 1) # added 1D CNN layer; kernel_size=2 shortens each sequence from seq_len to seq_len-1
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=n_hidden,
            num_layers=n_layers
        )
        self.linear = nn.Linear(in_features=n_hidden, out_features=1)
    def reset_hidden_state(self):
        # the conv layer leaves seq_len-1 steps, hence the seq_len-1 in the hidden state shape
        self.hidden = (
            torch.zeros(self.n_layers, self.seq_len-1, self.n_hidden),
            torch.zeros(self.n_layers, self.seq_len-1, self.n_hidden)
        )
    def forward(self, sequences):
        sequences = self.c1(sequences.view(len(sequences), 1, -1)) # (batch, 1, seq_len) -> (batch, 1, seq_len-1)
        lstm_out, self.hidden = self.lstm(
            sequences.view(len(sequences), self.seq_len-1, -1),
            self.hidden
        )
        last_time_step = lstm_out.view(self.seq_len-1, len(sequences), self.n_hidden)[-1]
        y_pred = self.linear(last_time_step)
        return y_pred
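
As a quick shape walk-through (a sketch, not part of the training pipeline): a batch holding one 5-step sequence is reshaped to (1, 1, 5) for the Conv1d layer, shortened to 4 time steps, passed through the LSTM, and mapped to a single predicted value.

demo = CovidPredictor(n_features=1, n_hidden=4, seq_len=seq_length, n_layers=1)
demo.reset_hidden_state()               # the hidden state must be initialized before calling forward
sample = torch.randn(1, seq_length, 1)  # (batch, seq_len, n_features)
print(demo(sample).shape)               # torch.Size([1, 1]): one prediction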

5.4 λͺ¨λΈ ν•™μŠ΅ΒΆ

4μž₯μ—μ„œ κ΅¬μΆ•ν•œ train_model ν•¨μˆ˜λ₯Ό ν™œμš©ν•΄ λͺ¨λΈ ν•™μŠ΅μ„ ν•΄λ³΄κ² μŠ΅λ‹ˆλ‹€. μ˜΅ν‹°λ§ˆμ΄μ €λ‘œλŠ” Adam을 μ„ νƒν•˜μ˜€μŠ΅λ‹ˆλ‹€. ν•™μŠ΅λΉ„μœ¨μ€ 0.001둜 μ„€μ •ν•˜μ˜€μŠ΅λ‹ˆλ‹€. 손싀 ν•¨μˆ˜(Loss Function)λ‘œλŠ” MAE (Mean Absolute Error)λ₯Ό μ„ νƒν–ˆμŠ΅λ‹ˆλ‹€.

def train_model(model, train_data, train_labels, val_data=None, val_labels=None, num_epochs=100, verbose = 10, patience = 10):
    loss_fn = torch.nn.L1Loss() # L1Loss computes the mean absolute error
    optimiser = torch.optim.Adam(model.parameters(), lr=0.001)
    train_hist = []
    val_hist = []
    for t in range(num_epochs):

        epoch_loss = 0

        for idx, seq in enumerate(train_data): # the hidden state must be reset for every sample

            model.reset_hidden_state()

            # train loss
            seq = torch.unsqueeze(seq, 0)
            y_pred = model(seq)
            loss = loss_fn(y_pred[0].float(), train_labels[idx]) # loss for a single step

            # update weights
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()

            epoch_loss += loss.item()

        train_hist.append(epoch_loss / len(train_data))

        if val_data is not None:

            with torch.no_grad():

                val_loss = 0

                for val_idx, val_seq in enumerate(val_data):

                    model.reset_hidden_state() # reset the hidden state for each sequence

                    val_seq = torch.unsqueeze(val_seq, 0)
                    y_val_pred = model(val_seq)
                    val_step_loss = loss_fn(y_val_pred[0].float(), val_labels[val_idx])

                    val_loss += val_step_loss
                
            val_hist.append(val_loss / len(val_data)) # append to the validation history

            ## print the loss every verbose-th epoch
            if t % verbose == 0:
                print(f'Epoch {t} train loss: {epoch_loss / len(train_data)} val loss: {val_loss / len(val_data)}')

            ## check for early stopping every patience-th epoch
            if (t % patience == 0) & (t != 0):
                
                ## early stop if the validation loss has increased
                if val_hist[t - patience] < val_hist[t] :

                    print('\n Early Stopping')

                    break

        elif t % verbose == 0:
            print(f'Epoch {t} train loss: {epoch_loss / len(train_data)}')

            
    return model, train_hist, val_hist

model = CovidPredictor(
    n_features=1,
    n_hidden=4,
    seq_len=seq_length,
    n_layers=1
)

예츑 λͺ¨λΈμ„ κ°„λž΅νžˆ μ‚΄νŽ΄λ³΄λ„λ‘ ν•˜κ² μŠ΅λ‹ˆλ‹€.

print(model)
CovidPredictor(
  (c1): Conv1d(1, 1, kernel_size=(2,), stride=(1,))
  (lstm): LSTM(1, 4)
  (linear): Linear(in_features=4, out_features=1, bias=True)
)
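
As a side note (a sketch, not in the original notebook), we can count the trainable parameters; the Conv1d layer contributes just 3 (two kernel weights plus one bias), while the LSTM and Linear layers account for the rest:

print(sum(p.numel() for p in model.parameters()))  # 120 = 3 (Conv1d) + 112 (LSTM) + 5 (Linear)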

이제 λͺ¨λΈ ν•™μŠ΅μ„ μ§„ν–‰ν•΄λ³΄κ² μŠ΅λ‹ˆλ‹€

model, train_hist, val_hist = train_model(
    model,
    X_train,
    y_train,
    X_val,
    y_val,
    num_epochs=100,
    verbose=10,
    patience=50
)
Epoch 0 train loss: 0.08868540743530025 val loss: 0.04381682723760605
Epoch 10 train loss: 0.03551809817384857 val loss: 0.033296383917331696
Epoch 20 train loss: 0.033714159246412786 val loss: 0.033151865005493164
Epoch 30 train loss: 0.03314930358741047 val loss: 0.03351602330803871
Epoch 40 train loss: 0.03311298256454511 val loss: 0.03455767780542374
Epoch 50 train loss: 0.033384358255242594 val loss: 0.03596664220094681
Epoch 60 train loss: 0.03306851693218524 val loss: 0.035104189068078995
Epoch 70 train loss: 0.03264325369823853 val loss: 0.03546909987926483
Epoch 80 train loss: 0.03269847107237612 val loss: 0.035008616745471954
Epoch 90 train loss: 0.033151885962927306 val loss: 0.034998856484889984

μ‹œκ°ν™”λ₯Ό 톡해 ν›ˆλ ¨ 손싀값(Training Loss)κ³Ό μ‹œν—™ 손싀값(Test Loss)을 μ‚΄νŽ΄λ³΄κ² μŠ΅λ‹ˆλ‹€.

plt.plot(train_hist, label="Training loss")
plt.plot(val_hist, label="Val loss")
plt.legend()
<matplotlib.legend.Legend at 0x7f77c2ac9fd0>
[Figure: training and validation loss curves]

We can see that both loss values converge.

5.5 Predicting Confirmed Case CountsΒΆ

λͺ¨λΈ ν•™μŠ΅μ„ λ§ˆμ³€μœΌλ‹ˆ ν™•μ§„μž 수 μ˜ˆμΈ‘μ„ 해보도둝 ν•˜κ² μŠ΅λ‹ˆλ‹€. μ˜ˆμΈ‘ν•  λ•Œλ„ μƒˆλ‘œμš΄ μ‹œν€€μŠ€κ°€ μž…λ ₯될 λ•Œ λ§ˆλ‹€ hidden_stateλŠ” μ΄ˆκΈ°ν™”λ₯Ό ν•΄μ€˜μ•Ό 이전 μ‹œν€€μŠ€μ˜ hidden_stateκ°€ λ°˜μ˜λ˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€. torch.unsqueeze ν•¨μˆ˜λ₯Ό μ‚¬μš©ν•˜μ—¬ μž…λ ₯ λ°μ΄ν„°μ˜ 차원을 늘렀 λͺ¨λΈμ΄ μ˜ˆμƒν•˜λŠ” 3차원 ν˜•νƒœλ‘œ λ§Œλ“€μ–΄μ€λ‹ˆλ‹€. 그리고 예츑된 데이터 내에 μ‘΄μž¬ν•˜λŠ” μŠ€μΉΌλΌκ°’λ§Œ μΆ”μΆœν•˜μ—¬ preds λ¦¬μŠ€νŠΈμ— μΆ”κ°€ν•©λ‹ˆλ‹€.

pred_dataset = X_test

with torch.no_grad():
    preds = []
    for _ in range(len(pred_dataset)):
        model.reset_hidden_state()
        y_test_pred = model(torch.unsqueeze(pred_dataset[_], 0))
        pred = torch.flatten(y_test_pred).item()
        preds.append(pred)
plt.plot(np.array(y_test)*MAX, label = 'True')
plt.plot(np.array(preds)*MAX, label = 'Pred')
plt.legend()
<matplotlib.legend.Legend at 0x7f77c29aafd0>
[Figure: true vs. predicted confirmed cases on the test set]

def MAE(true, pred):
    return np.mean(np.abs(true-pred))
MAE(np.array(y_test)*MAX, np.array(preds)*MAX)
247.63305325632362

The LSTM-only model also had an MAE of about 250, so we can see there is no significant performance difference on the COVID-19 confirmed case data. This can be attributed to the losses of both the LSTM and the CNN-LSTM converging to a certain value, which in turn suggests that the input data is too simple relative to the model architectures.

μ΄μƒμœΌλ‘œ μ½”λ‘œλ‚˜19 ν™•μ§„μž 수 μ˜ˆμΈ‘μ„ λŒ€ν•œλ―Όκ΅­ 데이터셋과 CNN-LSTM λͺ¨λΈλ‘œ μ‹€μŠ΅ν•΄λ³΄μ•˜μŠ΅λ‹ˆλ‹€. 이번 νŠœν† λ¦¬μ–Όμ„ ν†΅ν•΄μ„œ 데이터셋 탐색과 데이터셋 μ „μ²˜λ¦¬λΆ€ν„° μ‹œμž‘ν•΄μ„œ LSTMλͺ¨λΈμ„ ν•™μŠ΅ν•˜κ³  μ˜ˆμΈ‘μ„ ν•΄λ³΄μ•˜κ³ , 더 λ‚˜μ•„κ°€ CNN-LSTMλͺ¨λΈλ„ μ‚¬μš©ν•΄λ³΄μ•˜μŠ΅λ‹ˆλ‹€.

It is a fact that time-series prediction loses accuracy when there is not much data. In this tutorial, we trained deep learning models using only the confirmed case counts. We encourage you to train deep learning models with a variety of other datasets as well.