Comparing supervised learning (KNN) and unsupervised learning (K-Means) on the iris dataset
Python Data Analysis 2022. 11. 25. 17:30
I compared supervised learning (KNN) and unsupervised learning (K-Means) on the iris data.
# Supervised learning (KNN) vs. unsupervised learning (K-Means) on the iris dataset
from sklearn.datasets import load_iris

iris_dataset = load_iris()
print(iris_dataset['data'][:3])
print(iris_dataset['feature_names'])

# train / test split (75 : 25, since test_size = 0.25)
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(iris_dataset['data'], iris_dataset['target'], test_size=0.25, random_state=42)
print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)  # (112, 4) (38, 4) (112,) (38,)

# Supervised learning (KNN)
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

knnmodel = KNeighborsClassifier(n_neighbors=5)
knnmodel.fit(x_train, y_train)  # feature, label
predict_label = knnmodel.predict(x_test)
print('predicted :', predict_label)
print('actual :', y_test)
print('acc :', metrics.accuracy_score(y_test, predict_label))
print()

# Unsupervised learning (K-Means)
from sklearn.cluster import KMeans

kmeansModel = KMeans(n_clusters=3, init='k-means++', random_state=0)
kmeansModel.fit(x_train)  # features only, no labels
print(kmeansModel.labels_)
# True labels of the training samples that fell into each cluster
print('0 cluster :', y_train[kmeansModel.labels_ == 0])
print('1 cluster :', y_train[kmeansModel.labels_ == 1])
print('2 cluster :', y_train[kmeansModel.labels_ == 2])

pred_cluster = kmeansModel.predict(x_test)
print('pred_cluster :', pred_cluster)

import numpy as np
np_arr = np.array(pred_cluster)
pred_label = np_arr.tolist()
print(pred_label)
# Note: cluster IDs are arbitrary. Comparing them directly with y_test works here
# only because clusters 0/1/2 happen to line up with classes 0/1/2 in this run.
print('test acc : {:.2f}'.format(np.mean(pred_label == y_test)))

<console>
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]]
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
(112, 4) (38, 4) (112,) (38,)
predicted : [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1 0]
actual : [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1 0]
acc : 1.0

[0 0 1 1 1 0 0 1 1 2 1 2 1 2 1 0 2 1 0 0 0 1 1 0 0 0 1 0 1 2 0 1 1 0 1 1 1 1 2 1 0 1 2 0 0 1 2 0 1 0 0 1 1 2 1 2 2 1 0 0 1 2 0 0 0 1 2 0 2 2 0 1 1 1 2 2 0 2 1 2 1 1 1 0 1 1 0 1 2 2 0 1 2 2 0 2 0 2 2 2 1 2 1 1 1 1 0 1 1 0 1 2]
0 cluster : [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
1 cluster : [2 1 1 1 2 1 1 1 1 1 2 1 1 1 2 2 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 2 2 1 2 1]
2 cluster : [2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 1 2 2 2 2]
pred_cluster : [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 1 0 2 2 2 2 2 0 0 0 0 1 0 0 1 1 0]
[1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 1, 0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0]
test acc : 0.95
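One caveat about the K-Means accuracy: K-Means numbers its clusters arbitrarily, so a cluster ID does not have to match a class label. A rough sketch of a safer comparison is below. It maps each cluster to the majority class among its training samples before scoring, and also prints the adjusted Rand index, which does not depend on how clusters are numbered. The helper `map_clusters_to_labels`, the variable names, and the explicit `n_init=10` are my own assumptions for illustration and are not part of the code above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split


def map_clusters_to_labels(cluster_ids, true_labels, n_clusters):
    """Hypothetical helper: map each cluster ID to its majority true label."""
    mapping = {}
    for c in range(n_clusters):
        members = true_labels[cluster_ids == c]
        # Skip empty clusters; bincount(...).argmax() picks the most frequent label
        if members.size > 0:
            mapping[c] = int(np.bincount(members).argmax())
    return mapping


iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris["data"], iris["target"], test_size=0.25, random_state=42
)

# n_init=10 is set explicitly here (an assumption, not taken from the post)
kmeans = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
kmeans.fit(x_train)

# Build the cluster -> class mapping from the training split only
mapping = map_clusters_to_labels(kmeans.labels_, y_train, n_clusters=3)

pred_cluster = kmeans.predict(x_test)
# Clusters with no mapping fall back to -1, which never matches a real label
pred_label = np.array([mapping.get(c, -1) for c in pred_cluster])

mapped_acc = float(np.mean(pred_label == y_test))
ari = adjusted_rand_score(y_test, pred_cluster)
print("mapped test acc : {:.2f}".format(mapped_acc))
print("adjusted Rand index : {:.2f}".format(ari))
```

The mapping uses the training labels only to name the clusters, not to fit them, so the clustering itself stays unsupervised. If two clusters end up with the same majority class, the mapped accuracy will drop, which is itself a useful signal that the clusters do not separate the classes well.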