
05. Binary Classification Example with TensorFlow

by bsion 2018. 8. 17.
05. Logistic Classification

Source: 모두를 위한 머신러닝 (Machine Learning for Everyone) (http://hunkim.github.io/ml/)


Theory
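
For reference, the listings in this post appear to implement the standard logistic regression formulation, which can be written as follows. The symbols m (number of training samples) and α (learning rate) are notation introduced here, not taken from the original post.

```latex
\begin{align}
H(X) &= \sigma(XW + b) = \frac{1}{1 + e^{-(XW + b)}} \\
\mathrm{cost}(W, b) &= -\frac{1}{m} \sum_{i=1}^{m}
  \left[ y_i \log H(x_i) + (1 - y_i) \log\bigl(1 - H(x_i)\bigr) \right] \\
W &\leftarrow W - \alpha \, \frac{\partial\, \mathrm{cost}}{\partial W},
  \qquad
  b \leftarrow b - \alpha \, \frac{\partial\, \mathrm{cost}}{\partial b}
\end{align}
```

The first line corresponds to `hypothesis`, the second to `cost`, and the third to what the gradient descent optimizer does on each training step.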




Logistic Regression


In [1]:
import tensorflow as tf

x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]

# Placeholders for the tensors that will be fed at run time
X = tf.placeholder(tf.float32, shape=[None, 2])  # None: any number of rows (n samples)
Y = tf.placeholder(tf.float32, shape=[None, 1])

# X: n x 2, Y: n x 1  -->  X * W + b = Y  -->  W is 2 x 1, b has length 1 (b always matches the number of outputs)
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# sigmoid by hand: tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

cost = -tf.reduce_mean(Y * tf.log(hypothesis) + 
                       (1 - Y) * tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# True if hypothesis > 0.5, else False (casting to a float dtype turns these into 1. or 0.)
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

# Launch
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 500 == 0:
            print(step, cost_val)
            
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    
    print("\nHypothesis: ", h, "\nPredicted: ", c, "\nAccuracy: ", a)

0 2.90642
500 0.496238
1000 0.400877
1500 0.362029
2000 0.333676
2500 0.309708
3000 0.288714
3500 0.270127
4000 0.253586
4500 0.238807
5000 0.225552
5500 0.213618
6000 0.202833
6500 0.193051
7000 0.184146
7500 0.176012
8000 0.168557
8500 0.161703
9000 0.155383
9500 0.149539
10000 0.14412

Hypothesis:  [[ 0.02851192]
 [ 0.15571363]
 [ 0.29401827]
 [ 0.78638226]
 [ 0.94268227]
 [ 0.98121649]]
Predicted:  [[ 0.]
 [ 0.]
 [ 0.]
 [ 1.]
 [ 1.]
 [ 1.]]
Accuracy:  1.0
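
As a sanity check on the cost and accuracy definitions, the printed hypothesis values can be pushed through the same formulas in plain Python. This is only a sketch: the numbers are copied by hand from the output above, and the helper name `cross_entropy` is made up for illustration.

```python
import math

# Values copied from the printed "Hypothesis" output above.
hypothesis = [0.02851192, 0.15571363, 0.29401827,
              0.78638226, 0.94268227, 0.98121649]
labels = [0, 0, 0, 1, 1, 1]  # y_data, flattened


def cross_entropy(h_values, y_values):
    """Mean binary cross-entropy, mirroring the `cost` expression in the listing."""
    total = 0.0
    for h, y in zip(h_values, y_values):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(h_values)


# Threshold at 0.5, as `predicted` does, then compare against the labels.
predicted = [1.0 if h > 0.5 else 0.0 for h in hypothesis]
accuracy = sum(1.0 if p == y else 0.0
               for p, y in zip(predicted, labels)) / len(labels)
cost_value = cross_entropy(hypothesis, labels)

print(cost_value, predicted, accuracy)
```

The recomputed cost should land close to the last logged value (0.14412). A small gap is expected, since the hypothesis was evaluated after one more training step than the last logged cost.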

Classifying diabetes

In [7]:
import numpy as np
import tensorflow as tf

# Load Data
xy = np.loadtxt('./static/data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]

# Placeholders for the tensors that will be fed at run time
X = tf.placeholder(tf.float32, shape=[None, 8])  # None: any number of rows (n samples)
Y = tf.placeholder(tf.float32, shape=[None, 1])

# X: n x 8, Y: n x 1  -->  X * W + b = Y  -->  W is 8 x 1, b has length 1 (b always matches the number of outputs)
W = tf.Variable(tf.random_normal([8, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')

# sigmoid by hand: tf.div(1., 1. + tf.exp(-(tf.matmul(X, W) + b)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)

cost = -tf.reduce_mean(Y * tf.log(hypothesis) + 
                       (1 - Y) * tf.log(1 - hypothesis))

train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)

# True if hypothesis > 0.5, else False (casting to a float dtype turns these into 1. or 0.)
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))

# Launch
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 1000 == 0:
            print(step, cost_val)
            
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    
    print("\nAccuracy: ", a)

0 1.18169
1000 0.521999
2000 0.50858
3000 0.499301
4000 0.492688
5000 0.487878
6000 0.484319
7000 0.481645
8000 0.479611
9000 0.478046
10000 0.476829

Accuracy:  0.773386
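
Both listings hand the weight update to `tf.train.GradientDescentOptimizer`. As a rough, framework-free sketch of what that update amounts to for this model, the loop below trains on the toy `x_data` from the first listing using the standard cross-entropy gradient, mean((h - y) * x). The zero initialization (instead of `tf.random_normal`) and the helper names `sigmoid`, `predict`, `cost` and `train` are assumptions made here for determinism and illustration, so the numbers will not match the logged output above.

```python
import math

# Toy data from the first listing (labels flattened to a plain list).
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Hypothesis for one sample: sigmoid(x . w + b)."""
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def cost(w, b):
    """Mean binary cross-entropy over the whole data set."""
    total = 0.0
    for x, y in zip(x_data, y_data):
        h = predict(w, b, x)
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(x_data)


def train(steps=10001, learning_rate=0.01):
    # Assumption: start from zeros so that the run is reproducible.
    w, b = [0.0, 0.0], 0.0
    m = len(x_data)
    for _ in range(steps):
        # (h - y) for every sample, computed with the current parameters.
        errors = [predict(w, b, x) - y for x, y in zip(x_data, y_data)]
        # Gradient of the cost: mean((h - y) * x_j) for w_j, mean(h - y) for b.
        grad_w = [sum(e * x[j] for e, x in zip(errors, x_data)) / m
                  for j in range(2)]
        grad_b = sum(errors) / m
        # Plain gradient descent step.
        w = [w[j] - learning_rate * grad_w[j] for j in range(2)]
        b -= learning_rate * grad_b
    return w, b


initial_cost = cost([0.0, 0.0], 0.0)
w, b = train()
final_cost = cost(w, b)
print(initial_cost, final_cost)
```

With all-zero parameters every prediction is 0.5, so the initial cost is log(2). The cost after training should be lower than that.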

