Statsmodels multiple regression analysis

1 Overview

Multiple regression analysis with Statsmodels, using ordinary least squares (sm.OLS).

2 Example 1: Ad retention rate

import pandas as pd
df = pd.DataFrame({
'radio_ads': [3,4,9,4,5,5,2,6,5,3],
'tv_ads':    [1,3,4,1,4,1,4,2,4,2],
'retention': [5,1,6,2,8,3,4,9,7,4],
})

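# Predictors (independent variables) and response (dependent variable)
X = df[['radio_ads','tv_ads']]
y = df['retention']

import statsmodels.api as sm
X = sm.add_constant(X)   # prepend an intercept column named 'const'
model = sm.OLS(y, X)     # ordinary least squares; note the (y, X) argument order
result = model.fit()

print( result.params )   # coefficients for const, radio_ads, tv_ads
print( "R²=", result.rsquared )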

→ Regression equation [math]\displaystyle{ y = 0.472477 x_1 + 0.522936 x_2 + 1.366972 }[/math], where x_1 is radio_ads and x_2 is tv_ads

→ Coefficient of determination [math]\displaystyle{ R^2 = 0.25166839909010097 }[/math]
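
The coefficients and R² can be cross-checked without Statsmodels. The sketch below solves the same least-squares problem with NumPy and computes R² from its definition, 1 - SS_res / SS_tot. It is an independent check on the numbers, not a description of how Statsmodels computes them internally.

```python
import numpy as np

radio_ads = np.array([3, 4, 9, 4, 5, 5, 2, 6, 5, 3], dtype=float)
tv_ads    = np.array([1, 3, 4, 1, 4, 1, 4, 2, 4, 2], dtype=float)
y         = np.array([5, 1, 6, 2, 8, 3, 4, 9, 7, 4], dtype=float)

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(y), radio_ads, tv_ads])

# Least-squares coefficients in the order: intercept, radio_ads, tv_ads
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta
ss_res = np.sum(residuals ** 2)        # variation left unexplained by the model
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation around the mean
r_squared = 1 - ss_res / ss_tot

print(beta)
print(r_squared)
```

An R² of about 0.25 means the two ad counts explain roughly a quarter of the variation in retention for these ten observations.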

3 Example 2: Bakery sales

import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/jmnote/zdata/master/multiple-regression/bakery-sales.csv')
df.head()

X = df[['floor_space','distance_to_station']]
y = df['sales']

import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y, X)
result = model.fit()

print( result.params )
print( "R²=", result.rsquared )

→ Regression equation [math]\displaystyle{ y = 41.513478 x_1 - 0.340883 x_2 + 65.323916 }[/math], where x_1 is floor_space and x_2 is distance_to_station

→ Coefficient of determination [math]\displaystyle{ R^2 = 0.945235852681711 }[/math]
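
As a worked example of reading the regression equation in this section, the sketch below plugs one store into the reported coefficients. The input values (floor space 10, distance 80) are made up for illustration; their units are whatever the CSV uses, which this page does not state. `predict_sales` is a hypothetical helper, not part of Statsmodels.

```python
# Coefficients copied from the regression equation reported in this example
INTERCEPT = 65.323916
B_FLOOR_SPACE = 41.513478
B_DISTANCE = -0.340883


def predict_sales(floor_space, distance_to_station):
    """Evaluate the fitted equation for a single store."""
    return (INTERCEPT
            + B_FLOOR_SPACE * floor_space
            + B_DISTANCE * distance_to_station)


# Hypothetical store, not taken from the dataset
print(predict_sales(floor_space=10, distance_to_station=80))
```

The signs carry the interpretation: with distance held fixed, predicted sales rise with floor space, and with floor space held fixed, they fall as the distance to the station grows.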

4 Example 3: Boston

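import numpy as np
import pandas as pd

# load_boston() was removed in scikit-learn 1.2, so read the raw file instead.
# Each record spans two physical lines, so even and odd rows are stitched together.
raw = pd.read_csv('http://lib.stat.cmu.edu/datasets/boston', sep=r'\s+', skiprows=22, header=None)
feature_names = ['CRIM','ZN','INDUS','CHAS','NOX','RM','AGE','DIS','RAD','TAX','PTRATIO','B','LSTAT']
X = pd.DataFrame(np.hstack([raw.values[::2, :], raw.values[1::2, :2]]), columns=feature_names)
y = raw.values[1::2, 2]  # MEDV, the target column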

import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y, X)
result = model.fit()
print( result.summary() )

5 See also