1 Overview
- Multiple regression analysis with Statsmodels
2 Example 1: Ad retention
Python
import pandas as pd
df = pd.DataFrame({
'radio_ads': [3,4,9,4,5,5,2,6,5,3],
'tv_ads': [1,3,4,1,4,1,4,2,4,2],
'retention': [5,1,6,2,8,3,4,9,7,4],
})
X = df[['radio_ads','tv_ads']]
y = df['retention']
import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y, X)
result = model.fit()
print( result.params )
print( "R²=", result.rsquared )
const        1.366972
radio_ads    0.472477
tv_ads       0.522936
dtype: float64
R²= 0.2516683990901012
/usr/local/lib/python3.8/site-packages/statsmodels/tsa/tsatools.py:142: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only
  x = pd.concat(x[::order], 1)
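- → Regression equation (with [math]\displaystyle{ x_1 }[/math] = radio_ads, [math]\displaystyle{ x_2 }[/math] = tv_ads) [math]\displaystyle{ y = 0.472477 x_1 + 0.522936 x_2 + 1.366972 }[/math]
- → Coefficient of determination [math]\displaystyle{ R^2 = 0.25166839909010097 }[/math]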
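As a cross-check, the same fit can be reproduced without Statsmodels. The following is a minimal sketch, assuming NumPy is available: it builds the design matrix by hand (a column of ones plus the two predictors, which is what `sm.add_constant` produces) and solves the least-squares problem with `np.linalg.lstsq`. The variable names `A`, `beta`, and `r_squared` are illustrative and do not appear in the original example.

```python
import numpy as np

# Same data as the example above.
radio_ads = np.array([3, 4, 9, 4, 5, 5, 2, 6, 5, 3], dtype=float)
tv_ads = np.array([1, 3, 4, 1, 4, 1, 4, 2, 4, 2], dtype=float)
retention = np.array([5, 1, 6, 2, 8, 3, 4, 9, 7, 4], dtype=float)

# Design matrix with a leading column of ones for the intercept,
# mirroring what sm.add_constant does.
A = np.column_stack([np.ones_like(radio_ads), radio_ads, tv_ads])

# Ordinary least squares: minimize ||A @ beta - retention||^2.
# beta is ordered as [const, radio_ads, tv_ads].
beta, *_ = np.linalg.lstsq(A, retention, rcond=None)

# R^2 = 1 - SS_res / SS_tot
fitted = A @ beta
ss_res = np.sum((retention - fitted) ** 2)
ss_tot = np.sum((retention - retention.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(beta)
print(r_squared)
```

The entries of `beta` should agree with the values printed by `result.params`, and `r_squared` with `result.rsquared`, up to floating-point rounding.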
3 Example 2: Bakery sales
Python
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/jmnote/zdata/master/multiple-regression/bakery-sales.csv')
df
X = df[['floor_space','distance_to_station']]
y = df['sales']
import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y, X)
result = model.fit()
print( result.params )
print( "R²=", result.rsquared )
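- → Regression equation (with [math]\displaystyle{ x_1 }[/math] = floor_space, [math]\displaystyle{ x_2 }[/math] = distance_to_station) [math]\displaystyle{ y = 41.513478 x_1 - 0.340883 x_2 + 65.323916 }[/math]
- → Coefficient of determination [math]\displaystyle{ R^2 = 0.945235852681711 }[/math]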
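To illustrate how the regression equation reported above is used, the sketch below evaluates it for a single made-up shop. The function name `predict_sales` and the input values are hypothetical and are not taken from the dataset; the units of each column are whatever the CSV uses, which this page does not state.

```python
def predict_sales(floor_space: float, distance_to_station: float) -> float:
    """Evaluate y = 41.513478*x1 - 0.340883*x2 + 65.323916."""
    return 41.513478 * floor_space - 0.340883 * distance_to_station + 65.323916


# Hypothetical shop, chosen only for illustration (not a row from the dataset).
predicted = predict_sales(floor_space=10, distance_to_station=80)
print(round(predicted, 3))
```

Because the coefficient on `distance_to_station` is negative, the predicted sales fall as the distance to the station grows, with floor space held fixed.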
4 Example 3: Boston
Python
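# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this example requires an older scikit-learn release.
from sklearn.datasets import load_boston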
import pandas as pd
boston = load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
y = boston.target
import statsmodels.api as sm
X = sm.add_constant(X)
model = sm.OLS(y, X)
result = model.fit()
print( result.summary() )
5 See also
Editor: Jmnote