Implementing Gradient Descent from Scratch

BangPro 2024. 3. 19. 16:14

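Today, let's implement gradient descent ourselves for a linear regression problem.

Loading the Dataset

Today's dataset is scikit-learn's diabetes dataset.

It has 10 feature columns in total, but this exercise uses only bmi.

The code below loads the dataset and splits it into a training set and a test set.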

import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn import datasets

# Load the diabetes dataset.
diabetes_X, diabetes_y = datasets.load_diabetes(return_X_y=True)

# Select only one feature (BMI) and make into a 2-D array. The index of BMI feature is 2.
diabetes_X_new = diabetes_X[:, np.newaxis, 2]

# Separate training data from test data.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(diabetes_X_new, diabetes_y, test_size=0.1, random_state=0)
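
One detail worth keeping in mind: selecting the feature with `np.newaxis` gives a 2-D column of shape (n, 1), while the targets stay 1-D with shape (n,). The sketch below uses small made-up arrays (not the diabetes data) to show what NumPy broadcasting does when the two are mixed in arithmetic; the names `X_col`, `bad_residual`, and `good_residual` are only for illustration.

```python
import numpy as np

# Hypothetical toy arrays standing in for X_train (2-D column) and y_train (1-D).
X_col = np.array([[0.1], [0.2], [0.3]])   # shape (3, 1)
y = np.array([1.0, 2.0, 3.0])             # shape (3,)

# Subtracting a (3, 1) array from a (3,) array broadcasts to a (3, 3) matrix.
bad_residual = y - X_col

# Flattening the column first keeps one residual per sample.
good_residual = y - X_col.ravel()

print(bad_residual.shape)   # (3, 3)
print(good_residual.shape)  # (3,)
```

Flattening the feature column (or reshaping the targets into a column) before computing residuals is one way to avoid the silent (n, n) result.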

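Gradient Descent Formula

For reference, the gradient descent update rule is as follows, where $r_W$ is the learning rate for $W$ (the parameter $b$ is updated in the same way).
$$W^{n+1} = W^n - r_W \frac{\partial Loss}{\partial W}$$

The loss is computed as follows. With the linear model $f(x) = Wx + b$, the squared error for a single sample is
$$(y - f(x))^2$$
and summing it over all samples gives the loss:
$$Loss(W, b) = \sum (y - f(x))^2$$
The code below additionally divides this sum by the number of samples n, i.e. it uses the mean squared error.

So all we need to do is take the partial derivative of the loss with respect to W and with respect to b, and plug each one into the gradient descent formula.
Now let's implement this in code.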
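
As a worked version of that differentiation step, assuming the linear model $f(x) = Wx + b$ used in the code and the summed squared-error loss, the chain rule gives:
$$\frac{\partial Loss}{\partial W} = \sum 2\,(y - f(x)) \cdot (-x) = -2 \sum x\,(y - f(x))$$
$$\frac{\partial Loss}{\partial b} = \sum 2\,(y - f(x)) \cdot (-1) = -2 \sum (y - f(x))$$
Substituting these into the update rule (here $r_b$ is an assumed symbol for the learning rate of $b$, by analogy with $r_W$):
$$W^{n+1} = W^n + 2\, r_W \sum x\,(y - f(x))$$
$$b^{n+1} = b^n + 2\, r_b \sum (y - f(x))$$
If the loss is averaged over the n samples instead of summed, each gradient simply picks up a factor of $\frac{1}{n}$.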

Code Implementation

# Train W and b using the training data only
# Use X_train and y_train only

W = np.random.rand()        # Initialization of W
b = np.random.rand()        # Initialization of b

epochs = 25000              # number of epochs
n = float(len(X_train))    # number of training samples
lr = 0.1                   # learning rate

train_loss = []
x = X_train.ravel()         # flatten (n, 1) -> (n,) so it lines up with y_train
for k in range(epochs):
    y_pred = W * x + b
    residual = y_train - y_pred
    train_loss.append(np.sum(np.square(residual)) / n)   # mean squared error

    # Gradients of the mean squared error with respect to W and b
    dW = (-2/n) * np.sum(residual * x)
    db = (-2/n) * np.sum(residual)

    W = W - lr * dW
    b = b - lr * db

Running the code above gets through all 25000 epochs fairly quickly and yields W and b.
Let's check the values.

print("Trained parameters")
print("W:", W,", b:", b)

In my case, the following values were printed. Because W and b are initialized randomly, your values may differ slightly.

Trained parameters
W: 963.1825770214525 , b: 150.92371110031215
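
One way to sanity-check trained values like these is to compare gradient descent against a closed-form least-squares fit. The sketch below does this on synthetic data rather than the diabetes dataset; the true slope 3.0, intercept 1.0, the noise level, and the use of `np.polyfit` as the reference are all assumptions chosen for illustration.

```python
import numpy as np

# Synthetic 1-D regression data (assumed values, not the diabetes dataset).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 1.0 + rng.normal(0.0, 0.1, size=200)

# Plain gradient descent on the mean squared error.
W, b = 0.0, 0.0
lr = 0.1
n = len(x)
for _ in range(5000):
    residual = y - (W * x + b)
    W -= lr * (-2.0 / n) * np.sum(residual * x)
    b -= lr * (-2.0 / n) * np.sum(residual)

# Closed-form least-squares line; polyfit returns [slope, intercept] for deg=1.
W_ref, b_ref = np.polyfit(x, y, deg=1)

print("gradient descent:", W, b)
print("least squares   :", W_ref, b_ref)
```

If the two pairs agree closely, the update rule and the gradients are most likely implemented correctly; a large gap usually points to a shape or scaling mistake.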

Data Visualization

Now let's plot the training data points together with our linear regression function.

# Checking for training: Using training data
# use X_train and y_train

y_pred = W*X_train + b
plt.scatter(X_train, y_train,  color='black')
plt.plot(X_train,y_pred, color='blue', linewidth=3)
plt.title("Linear Regression Training Results")
plt.show()

Next, plot the same line against the test data points.

# Prediction: Using only the test data
# use X_test and y_test only

y_pred = W*X_test + b
plt.scatter(X_test, y_test,  color='black')
plt.plot(X_test,y_pred, color='blue', linewidth=3)
plt.title("Linear Regression Test")
plt.show()
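
Alongside the plot, a single number can summarize how well the line fits held-out data. The sketch below computes the mean squared error with a small helper; `mse` is a hypothetical function written for this example, and the arrays are made-up values standing in for `y_test` and `W * X_test + b`.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between targets and predictions (hypothetical helper)."""
    # ravel() guards against mixing (n,) targets with (n, 1) predictions.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up values, not results from the diabetes dataset.
y_true = np.array([1.0, 2.0, 3.0])
y_hat = np.array([[1.5], [2.0], [2.0]])   # column shape, like W * X_test + b

score = mse(y_true, y_hat)
print("test MSE:", score)
```

Comparing this value on the training and test sets gives a rough sense of whether the model generalizes.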

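Now let's visualize the train loss. Watching how it changes over the epochs tells us whether training is going well.
Ideally, the curve slopes downward and converges without oscillating.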

# Display the loss at every epoch (mean squared error on the training data)
plt.plot(np.arange(epochs), train_loss,  color='black')
plt.title("Loss vs epochs")
plt.xlabel("epochs")
plt.ylabel("Training Loss")
plt.show()
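
To get a feel for what a healthy versus an unhealthy loss curve looks like, here is a toy sketch on the one-parameter loss (w - 3)^2. This is a hypothetical example, not the regression loss above; the function `loss_history` and both learning rates are assumptions picked to make the contrast visible.

```python
def loss_history(lr, steps=20, w0=0.0):
    """Run gradient descent on Loss(w) = (w - 3)^2 and record the loss per step."""
    w = w0
    history = []
    for _ in range(steps):
        history.append((w - 3.0) ** 2)
        w = w - lr * 2.0 * (w - 3.0)   # gradient of (w - 3)^2 is 2 * (w - 3)
    return history

smooth = loss_history(lr=0.1)      # small step: loss shrinks every epoch
diverging = loss_history(lr=1.1)   # step too large: w overshoots 3 on every update

print("lr=0.1 first/last loss:", smooth[0], smooth[-1])
print("lr=1.1 first/last loss:", diverging[0], diverging[-1])
```

With the small learning rate the loss falls steadily toward zero, while with the large one each update jumps past the minimum and the loss grows. A training curve that bounces or climbs is often a hint to lower the learning rate.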
