Sklearn multiple regression analysis

Revision as of 17:12, 1 October 2021 (Fri) by Jmnote (→Example 2: Bakery sales)

1 Overview

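Multiple regression analysis with sklearn (scikit-learn), using LinearRegression to fit one response variable against two or more predictors.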

2 Example 1: Ad retention rate

import pandas as pd
# Sample data: 10 observations with two predictors and one response
df = pd.DataFrame({
'radio_ads': [3,4,9,4,5,5,2,6,5,3],
'tv_ads':    [1,3,4,1,4,1,4,2,4,2],
'retention': [5,1,6,2,8,3,4,9,7,4],
})

# Predictors (x_1 = radio_ads, x_2 = tv_ads) and response
X = df[['radio_ads','tv_ads']]
Y = df['retention']

from sklearn.linear_model import LinearRegression
# Ordinary least squares fit; score() returns R² on the given data
reg = LinearRegression().fit(X, Y)
print( "coefficient=", reg.coef_ )
print( "intercept=", reg.intercept_ )
print( "R²=", reg.score(X, Y) )
→ Regression equation (x_1 = radio_ads, x_2 = tv_ads) [math]\displaystyle{ y = 0.47247706 x_1 + 0.52293578 x_2 + 1.3669724770642206 }[/math]
→ Coefficient of determination [math]\displaystyle{ R^2 = 0.2516683990901011 }[/math]
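As a cross-check on the fit, the same coefficients can be recovered with a plain least-squares solve in NumPy. This is only a sketch of what an ordinary least-squares fit computes, not a description of scikit-learn's internals; the variable names (`A`, `beta`, `r2`) are chosen here for illustration.

```python
import numpy as np

# Same data as in the example above.
radio_ads = np.array([3, 4, 9, 4, 5, 5, 2, 6, 5, 3], dtype=float)
tv_ads = np.array([1, 3, 4, 1, 4, 1, 4, 2, 4, 2], dtype=float)
retention = np.array([5, 1, 6, 2, 8, 3, 4, 9, 7, 4], dtype=float)

# Design matrix with a leading column of ones for the intercept.
A = np.column_stack([np.ones_like(radio_ads), radio_ads, tv_ads])

# Minimize ||A @ beta - y||^2; beta = [intercept, coef_radio, coef_tv].
beta, *_ = np.linalg.lstsq(A, retention, rcond=None)
intercept, coef_radio, coef_tv = beta

# R² = 1 - SS_res / SS_tot
residuals = retention - A @ beta
ss_res = residuals @ residuals
ss_tot = np.sum((retention - retention.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("coefficient=", coef_radio, coef_tv)
print("intercept=", intercept)
print("R²=", r2)
```

Both slopes come out positive (about 0.4725 for radio_ads and 0.5229 for tv_ads) with an intercept of about 1.3670 and R² of about 0.2517.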

3 Example 2: Bakery sales

import pandas as pd
# Load the bakery sales data and show it
df = pd.read_csv('https://raw.githubusercontent.com/jmnote/zdata/master/multiple-regression/bakery-sales.csv')
print( df )

# Predictors (x_1 = floor_space, x_2 = distance_to_station) and response
X = df[['floor_space','distance_to_station']]
Y = df['sales']

from sklearn.linear_model import LinearRegression
# Ordinary least squares fit; score() returns R² on the given data
reg = LinearRegression().fit(X, Y)
print( "coefficient=", reg.coef_ )
print( "intercept=", reg.intercept_ )
print( "R²=", reg.score(X, Y) )
→ Regression equation (x_1 = floor_space, x_2 = distance_to_station) [math]\displaystyle{ y = 41.51347826 x_1 - 0.34088269 x_2 + 65.32391638894836 }[/math]
→ Coefficient of determination [math]\displaystyle{ R^2 = 0.9452358526817111 }[/math]
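To show how the regression equation is used, the sketch below plugs a store's values into it by hand. The coefficients are copied from the equation above; the helper `predict_sales` and the input values (floor space 10, distance 100) are hypothetical and chosen only for illustration, in whatever units the CSV uses.

```python
# Coefficients copied from the regression equation above.
COEF_FLOOR_SPACE = 41.51347826
COEF_DISTANCE = -0.34088269
INTERCEPT = 65.32391638894836


def predict_sales(floor_space: float, distance_to_station: float) -> float:
    """Evaluate y = b1*x1 + b2*x2 + b0 for one store (hypothetical helper)."""
    return (COEF_FLOOR_SPACE * floor_space
            + COEF_DISTANCE * distance_to_station
            + INTERCEPT)


# Hypothetical store: floor_space = 10, distance_to_station = 100.
estimate = predict_sales(10, 100)
print(round(estimate, 2))  # 446.37
```

With the fitted `reg` object from the example, `reg.predict` on a DataFrame with the same two columns should give the same number up to rounding of the copied coefficients.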

4 Example 3: Boston

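# load_boston was removed in scikit-learn 1.2, so the data is read from the
# original source, following the recipe in scikit-learn's deprecation notice.
import numpy as np
import pandas as pd
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
# Each record spans two rows in the raw file: 11 values, then 2 values plus the target
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
y = raw_df.values[1::2, 2]
feature_names = ['CRIM','ZN','INDUS','CHAS','NOX','RM','AGE','DIS','RAD','TAX','PTRATIO','B','LSTAT']

from sklearn.linear_model import LinearRegression
model = LinearRegression()
model.fit(X, y)

# Coefficient per feature, then the sum of squared coefficients and R²
print( pd.DataFrame(model.coef_, index=feature_names, columns=['coef']) )
print( "---" )
print( "SS(coef)=", sum(model.coef_**2) )
print( "R²=", model.score(X, y) )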
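Because `load_boston` is unavailable in scikit-learn 1.2 and later, a sketch of the same report on a dataset that still ships with the library may be useful. The diabetes dataset (`load_diabetes`) is swapped in here purely as an assumption for illustration; it is not the data used in this example, so the coefficients and R² will differ.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Bundled dataset: 10 predictors, no network access needed.
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

model = LinearRegression().fit(X, y)

# One coefficient per feature, labeled with the feature names.
coef_table = pd.DataFrame(model.coef_, index=diabetes.feature_names, columns=["coef"])

# Sum of squared coefficients and R² on the training data.
ss_coef = float((model.coef_ ** 2).sum())
r2 = model.score(X, y)

print(coef_table)
print("---")
print("SS(coef)=", ss_coef)
print("R²=", r2)
```

Note that R² here is measured on the same data the model was fitted to, as in the examples above, so it says nothing about performance on unseen data.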

5 See also
