  • Python Data Analysis Basics 50 - Inductive Reasoning, Deductive Reasoning, Simple Linear Regression Example (mtcars), Reading Values from the Keyboard
    Python Data Analysis 2022. 11. 15. 15:49

     

     

    # Build simple and multiple regression models on the mtcars dataset using ols()
    
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import statsmodels.api
    plt.rc('font', family = 'malgun gothic')
    import seaborn as sns
    import statsmodels.formula.api as smf
    
    mtcars = statsmodels.api.datasets.get_rdataset('mtcars').data
    print(mtcars.head(3))
    # print(mtcars.corr())
    print(np.corrcoef(mtcars.hp,mtcars.mpg)[0,1]) # -0.7761683718265864
    print(np.corrcoef(mtcars.wt,mtcars.mpg)[0,1]) # -0.8676593765172281
    
    # Simple linear regression: mtcars.hp (feature, x), mtcars.mpg (label, y)
    # Visualization
    # plt.scatter(mtcars.hp, mtcars.mpg)
    # # Note: numpy's polyfit() returns the slope and intercept.
    # slope, intercept = np.polyfit(mtcars.hp, mtcars.mpg, 1)
    # print('slope : {}, intercept : {}'.format(slope, intercept)) # slope : -0.06822, intercept : 30.098860
    # plt.plot(mtcars.hp, slope * mtcars.hp + intercept)
    # plt.xlabel('Horsepower (hp)')
    # plt.ylabel('Fuel economy (mpg)')
    # plt.show()
    
    result1 = smf.ols('mpg ~ hp', data = mtcars).fit()
    print(result1.summary())
    print(result1.conf_int(alpha = 0.05))
    print()
    print(result1.summary().tables[1])
    
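    # prediction = coef(hp) * hp + coef(Intercept)
    # The hp coefficient is -0.0682; -0.088895 is the lower bound from conf_int(), not the slope.
    print('Fuel economy (mpg) at 110 hp: ', result1.params['hp'] * 110 + result1.params['Intercept'])
    print('Fuel economy (mpg) at 50 hp: ', result1.params['hp'] * 50 + result1.params['Intercept'])
    print('Fuel economy (mpg) at 200 hp: ', result1.params['hp'] * 200 + result1.params['Intercept'])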
    
    print('------------')
    # Multiple linear regression: mtcars.hp and mtcars.wt (features, x), mtcars.mpg (label, y)
    result2 = smf.ols(formula = 'mpg ~ hp + wt', data = mtcars).fit()
    print(result2.summary())
    print(result2.summary().tables[1])
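    # wt in mtcars is measured in units of 1000 lb, so wt = 5 means 5,000 lb (not 5 tons)
    print('Fuel economy (mpg) at 110 hp and wt 5 (5,000 lb) :', (-0.0318 * 110) + (-3.8778 * 5) + 37.2273)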
    
    print('Using predict()')
    new_data = pd.DataFrame({'hp':[110, 120, 150],'wt':[5, 2, 7]})
    new_pred = result2.predict(new_data)
    print('Predicted mpg :', new_pred.values)
    
    # Read values from the keyboard
    new_hp = float(input('New horsepower (hp) : '))
    new_wt = float(input('New weight (wt, 1000 lb) : '))
    new_data2 = pd.DataFrame({'hp':[new_hp],'wt':[new_wt]})
    new_pred2 = result2.predict(new_data2)
    print('Predicted mpg :', new_pred2.values)
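
`float(input(...))` raises `ValueError` as soon as the typed text is not a number. Below is a minimal sketch of a retry loop for that case. The helper name `read_float` and its `reader` and `max_tries` parameters are hypothetical and not part of the original listing; `reader` is there only so the function can be tried without a real keyboard.

```python
def read_float(prompt, reader=input, max_tries=3):
    """Ask for a number until the text parses as a float (hypothetical helper)."""
    for _ in range(max_tries):
        text = reader(prompt)
        try:
            return float(text)
        except ValueError:
            # Report the bad entry and ask again
            print('Not a number:', repr(text))
    raise ValueError('no valid number after {} tries'.format(max_tries))


# Usage with canned answers standing in for keyboard input
answers = iter(['abc', '80'])
new_hp = read_float('New horsepower (hp) : ', reader=lambda prompt: next(answers))
print(new_hp)
```

In an interactive session the call would simply be `read_float('New horsepower (hp) : ')`, and the result can go into the `DataFrame` passed to `predict()` as before.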
    
    
    <console>
                    mpg  cyl   disp   hp  drat     wt   qsec  vs  am  gear  carb
    Mazda RX4      21.0    6  160.0  110  3.90  2.620  16.46   0   1     4     4
    Mazda RX4 Wag  21.0    6  160.0  110  3.90  2.875  17.02   0   1     4     4
    Datsun 710     22.8    4  108.0   93  3.85  2.320  18.61   1   1     4     1
    -0.7761683718265864
    -0.8676593765172281
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                    mpg   R-squared:                       0.602
    Model:                            OLS   Adj. R-squared:                  0.589
    Method:                 Least Squares   F-statistic:                     45.46
    Date:                Tue, 15 Nov 2022   Prob (F-statistic):           1.79e-07
    Time:                        16:05:23   Log-Likelihood:                -87.619
    No. Observations:                  32   AIC:                             179.2
    Df Residuals:                      30   BIC:                             182.2
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     30.0989      1.634     18.421      0.000      26.762      33.436
    hp            -0.0682      0.010     -6.742      0.000      -0.089      -0.048
    ==============================================================================
    Omnibus:                        3.692   Durbin-Watson:                   1.134
    Prob(Omnibus):                  0.158   Jarque-Bera (JB):                2.984
    Skew:                           0.747   Prob(JB):                        0.225
    Kurtosis:                       2.935   Cond. No.                         386.
    ==============================================================================
    
    Notes:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
                       0          1
    Intercept  26.761949  33.435772
    hp         -0.088895  -0.047562
    
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     30.0989      1.634     18.421      0.000      26.762      33.436
    hp            -0.0682      0.010     -6.742      0.000      -0.089      -0.048
    ==============================================================================
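    Fuel economy (mpg) at 110 hp:  20.32045
    Fuel economy (mpg) at 50 hp:  25.65415
    Fuel economy (mpg) at 200 hp:  12.3199
    (These three values were produced with -0.088895, the lower 95% bound for hp, as the slope; with the fitted coefficient -0.0682 they come to about 22.6, 26.7 and 16.5.)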
    ------------
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                    mpg   R-squared:                       0.827
    Model:                            OLS   Adj. R-squared:                  0.815
    Method:                 Least Squares   F-statistic:                     69.21
    Date:                Tue, 15 Nov 2022   Prob (F-statistic):           9.11e-12
    Time:                        16:05:23   Log-Likelihood:                -74.326
    No. Observations:                  32   AIC:                             154.7
    Df Residuals:                      29   BIC:                             159.0
    Df Model:                           2                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     37.2273      1.599     23.285      0.000      33.957      40.497
    hp            -0.0318      0.009     -3.519      0.001      -0.050      -0.013
    wt            -3.8778      0.633     -6.129      0.000      -5.172      -2.584
    ==============================================================================
    Omnibus:                        5.303   Durbin-Watson:                   1.362
    Prob(Omnibus):                  0.071   Jarque-Bera (JB):                4.046
    Skew:                           0.855   Prob(JB):                        0.132
    Kurtosis:                       3.332   Cond. No.                         588.
    ==============================================================================
    
    Notes:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     37.2273      1.599     23.285      0.000      33.957      40.497
    hp            -0.0318      0.009     -3.519      0.001      -0.050      -0.013
    wt            -3.8778      0.633     -6.129      0.000      -5.172      -2.584
    ==============================================================================
    Fuel economy (mpg) at 110 hp and wt 5 (5,000 lb) : 14.3403
    Using predict()
    Predicted mpg : [14.34309224 25.65885499  5.31651287]
    New horsepower (hp) : 80
    New weight (wt, 1000 lb) : 8
    Predicted mpg : [3.66278842]
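
The hand calculation (14.3403) and `predict()` (14.34309224) differ slightly because the summary table prints coefficients rounded to four decimals, while `predict()` uses the full-precision fit. The sketch below reproduces the hand calculation for every input shown in the console. The coefficients and the recorded `predict()` values are copied from the output above; the function name `predict_mpg` is made up for illustration.

```python
# Coefficients copied from the 'mpg ~ hp + wt' summary table (rounded to 4 decimals)
INTERCEPT = 37.2273
COEF_HP = -0.0318
COEF_WT = -3.8778


def predict_mpg(hp, wt):
    """Linear prediction: intercept + coef_hp * hp + coef_wt * wt."""
    return INTERCEPT + COEF_HP * hp + COEF_WT * wt


# predict() values as recorded in the console output, keyed by (hp, wt)
recorded = {
    (110, 5): 14.34309224,
    (120, 2): 25.65885499,
    (150, 7): 5.31651287,
    (80, 8): 3.66278842,
}

for (hp, wt), from_predict in recorded.items():
    by_hand = predict_mpg(hp, wt)
    # Print the hand-computed value and its gap to the recorded predict() value
    print(hp, wt, round(by_hand, 4), round(from_predict - by_hand, 4))
```

Every gap stays below 0.01 mpg, which is consistent with rounding in the printed coefficients rather than a different model.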

     

     

    Visualization using mtcars.hp and mtcars.mpg
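
The linear trend in an hp versus mpg scatter is what the correlation coefficient printed in the console measures. For a simple regression with an intercept, the squared correlation equals R-squared, and adjusted R-squared follows from R-squared, the number of observations and the number of features. The sketch below checks both against the numbers in the console output. The value of r and the R-squared of 0.827 are copied from that output; the function name `adjusted_r2` is made up for illustration.

```python
# np.corrcoef(mtcars.hp, mtcars.mpg)[0, 1] as printed in the console
r_hp_mpg = -0.7761683718265864


def adjusted_r2(r2, n_obs, n_features):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_features - 1)


# Simple regression mpg ~ hp: R-squared is the squared correlation
r2_simple = r_hp_mpg ** 2
print(round(r2_simple, 3))                      # compare with R-squared in the first summary
print(round(adjusted_r2(r2_simple, 32, 1), 3))  # compare with Adj. R-squared in the first summary

# Multiple regression mpg ~ hp + wt: R-squared 0.827 taken from the second summary
print(round(adjusted_r2(0.827, 32, 2), 3))      # compare with Adj. R-squared in the second summary
```

The identity between squared correlation and R-squared holds only for the one-feature model; the second model has two features, so its R-squared is taken from the summary instead.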

     
