Binary Classification Model Recipe for Sentence Input (Recurrent Neural Network Model)

이부일 2018. 1. 5. 15:01

# 1. Load packages

import matplotlib.pyplot as plt
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding, LSTM
# Jupyter notebook magic; drop this line when running as a plain script
%matplotlib inline

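# 2. Create the data
max_features = 20000   # vocabulary cap: only word indices below 20,000 are kept
text_max_words = 200   # every review is padded or truncated to 200 tokens

# 2.1 Load the training and test data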
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
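
A minimal sketch of what `num_words=max_features` does to the loaded sequences, assuming the `imdb.load_data` defaults `oov_char=2` and `skip_top=0`: any word index at or above the cap is replaced by the out-of-vocabulary marker. The helper `cap_vocabulary` and the sample review are hypothetical, for illustration only.

```python
def cap_vocabulary(seq, num_words, oov_char=2):
    """Replace indices >= num_words with oov_char (hypothetical helper)."""
    return [w if w < num_words else oov_char for w in seq]

# Made-up word indices standing in for one encoded review
sample_review = [1, 14, 22, 25000, 43, 19999, 20000]
capped = cap_vocabulary(sample_review, num_words=20000)
print(capped)  # [1, 14, 22, 2, 43, 19999, 2]
```

Because every surviving index is below `max_features`, the same value can later serve as the input dimension of the embedding layer.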

# 2.2 Split the training data into training and validation sets
x_val = x_train[20000:]
y_val = y_train[20000:]
x_train = x_train[:20000]
y_train = y_train[:20000]
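
The split above can be sketched on a stand-in list, assuming the IMDB training set holds 25,000 reviews: the last 5,000 become validation data and the first 20,000 remain for training. Note the order of operations: the validation slice has to be taken before `x_train` is overwritten. The names `indices`, `train_part`, and `val_part` are illustrative only.

```python
n_total = 25000                  # size of the IMDB training set
indices = list(range(n_total))   # stand-in for x_train

val_part = indices[20000:]       # last 5,000 samples -> validation
train_part = indices[:20000]     # first 20,000 samples -> training

print(len(train_part), len(val_part))  # 20000 5000
```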

# 2.3 Preprocess the data: make all sentences the same length
x_train = sequence.pad_sequences(x_train, maxlen = text_max_words)
x_val = sequence.pad_sequences(x_val, maxlen = text_max_words)
x_test = sequence.pad_sequences(x_test, maxlen = text_max_words)
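
A minimal pure-Python sketch of what `pad_sequences` does with its default settings (`padding='pre'`, `truncating='pre'`, `value=0`): short sequences get zeros at the front, long ones lose their earliest tokens. The helper `pad_pre` is hypothetical and only mimics that default behavior for a single sequence.

```python
def pad_pre(seq, maxlen, value=0):
    """Pad or truncate at the front so the result has exactly maxlen items."""
    seq = list(seq)[-maxlen:]                    # keep the last maxlen tokens
    return [value] * (maxlen - len(seq)) + seq   # left-pad with the fill value

short = pad_pre([5, 8, 13], maxlen=5)
long_ = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short)  # [0, 0, 5, 8, 13]
print(long_)  # [3, 4, 5, 6, 7]
```

With `maxlen = text_max_words`, every review ends up as a row of 200 integers, so the padded arrays have shape `(number of samples, 200)`.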

# 3. Build the model

model = Sequential()

model.add(Embedding(max_features, 128))

model.add(LSTM(128))

model.add(Dense(1, activation='sigmoid'))
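
As a rough check on the size of this three-layer model, the trainable parameter counts can be worked out by hand. This is a sketch assuming the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias); the variable names are illustrative.

```python
vocab_size, embed_dim, lstm_units = 20000, 128, 128

# Embedding: one 128-dimensional vector per word index
embedding_params = vocab_size * embed_dim

# LSTM: 4 gates, each with input weights, recurrent weights, and a bias
lstm_params = 4 * ((embed_dim + lstm_units) * lstm_units + lstm_units)

# Dense(1): one weight per LSTM unit plus one bias
dense_params = lstm_units * 1 + 1

total = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total)
# 2560000 131584 129 2691713
```

The embedding table dominates the total, which is why the choice of `max_features` has such a large effect on model size.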

# 4. Configure the training process
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
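
To make the loss concrete, here is a minimal pure-Python sketch of the sigmoid output and the binary cross-entropy it is scored with. The clipping constant `eps=1e-7` and the sample logits are assumptions for illustration; this is not Keras's own implementation.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1]                                # made-up targets
probs = [sigmoid(z) for z in (2.0, -1.0, 0.0)]    # made-up model scores
print(binary_crossentropy(labels, probs))
```

Confident correct predictions contribute a loss near zero, while a prediction of 0.5 contributes about 0.69 regardless of the label.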

# 5. Train the model
hist = model.fit(x_train, y_train, epochs = 2, batch_size = 64, validation_data = (x_val, y_val))
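
A quick sketch of how much work these `fit` settings imply, assuming the 20,000-sample training split from step 2.2: the number of weight updates per epoch follows from the batch size, and the last batch is smaller than the rest.

```python
import math

n_train, batch_size, epochs = 20000, 64, 2

steps_per_epoch = math.ceil(n_train / batch_size)   # the final batch is partial
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch_size, steps_per_epoch * epochs)
# 313 32 626
```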

# 6. Inspect the training process
fig, loss_ax = plt.subplots()
acc_ax = loss_ax.twinx()

loss_ax.plot(hist.history["loss"], "blue", label = "Train Loss")
loss_ax.plot(hist.history["val_loss"], "red", label = "Validation Loss")
loss_ax.set_ylim([-0.2, 1.2])

# Keras 2.3 and later store these series under "accuracy" and "val_accuracy"
acc_ax.plot(hist.history["acc"], "purple", label = "Train Accuracy")
acc_ax.plot(hist.history["val_acc"], "green", label = "Validation Accuracy")
acc_ax.set_ylim([-0.2, 1.2])

loss_ax.set_xlabel("epoch")
loss_ax.set_ylabel("loss")
acc_ax.set_ylabel("accuracy")

loss_ax.legend(loc = "upper left")
acc_ax.legend(loc = "lower left")

plt.show()
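
The history keys used in the plot depend on the Keras version: older releases record accuracy as `"acc"`, newer ones as `"accuracy"`. One possible way to stay compatible with both is a small lookup helper. `pick_metric` and `fake_history` are hypothetical names, and the numbers are made up rather than real training results.

```python
def pick_metric(history, *candidates):
    """Return the first series found under any of the candidate keys."""
    for key in candidates:
        if key in history:
            return history[key]
    raise KeyError(f"none of {candidates} found in history")

# Made-up dict standing in for hist.history
fake_history = {"loss": [0.6, 0.4], "accuracy": [0.70, 0.85]}
train_acc = pick_metric(fake_history, "acc", "accuracy")
print(train_acc)  # [0.7, 0.85]
```

In the plotting code this would be called as `pick_metric(hist.history, "acc", "accuracy")` and `pick_metric(hist.history, "val_acc", "val_accuracy")`.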

# 7. Evaluate the model
loss_and_metrics = model.evaluate(x_test, y_test, batch_size = 64)
print("## Evaluation Loss and Metrics ##")
print(loss_and_metrics)
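
With a single metric configured, `model.evaluate` returns a two-element list, `[loss, accuracy]`. The accuracy part can be sketched as thresholding each predicted probability at 0.5 and counting matches. The helper below is a simplified stand-in, not the Keras implementation, and the labels and probabilities are made up.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

labels = [1, 0, 1, 0]           # made-up targets
probs = [0.9, 0.2, 0.4, 0.6]    # made-up predicted probabilities
print(binary_accuracy(labels, probs))  # 0.5
```

The returned list can be unpacked directly, for example `loss, acc = loss_and_metrics`.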

[Source] 블록과 함께하는 파이썬 딥러닝 케라스 (Python Deep Learning with Keras, with Blocks), by 김태영, DigitalBooks, pp. 279-281

License: Attribution, No Derivatives