Sklearn logistic regression analysis

1 Overview

Logistic regression analysis with sklearn (scikit-learn), using LogisticRegression from sklearn.linear_model.

2 Example 1: Study hours and probability of passing

import pandas as pd
# Hours studied and exam result (1 = pass, 0 = fail) for 20 students
df = pd.DataFrame({
'hours': [0.50,0.75,1.00,1.25,1.50,1.75,1.75,2.00,2.25,2.50,2.75,3.00,3.25,3.50,4.00,4.25,4.50,4.75,5.00,5.50],
'pass': [0,0,0,0,0,0,1,0,1,0,1,0,1,0,1,1,1,1,1,1],
})

X = df[['hours']]
Y = df['pass']

from sklearn.linear_model import LogisticRegression
# A very large C makes the default L2 regularization negligible
reg = LogisticRegression(C=100000).fit(X, Y)
print( reg.coef_ )
print( reg.intercept_ )
# For a classifier, score() returns mean accuracy, not R²
print( "accuracy=", reg.score(X, Y) )

[[1.50463927]]
[-4.07770207]
accuracy= 0.8

→ Regression equation (x_1 = hours studied, y = predicted probability of passing) [math]\displaystyle{ y = \dfrac{1}{1 + \exp(-(1.50463927 x_1 - 4.07770207))} }[/math]
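
The regression equation can be evaluated by hand to see what it predicts. The following is a minimal sketch that plugs the fitted coefficient and intercept from the output above into the logistic function; the helper name pass_probability and the sample hour values are illustrative, not part of the original example.

```python
import math

# Coefficient and intercept copied from the fitted model output above
COEF = 1.50463927
INTERCEPT = -4.07770207

def pass_probability(hours):
    """Evaluate the logistic regression equation for a number of study hours."""
    z = COEF * hours + INTERCEPT
    return 1.0 / (1.0 + math.exp(-z))

# Study time at which the predicted probability is exactly 0.5 (where z = 0)
boundary_hours = -INTERCEPT / COEF

# Sample hour values chosen only for illustration
for h in (1, 2, 3, 4, 5):
    print(f"{h} hours -> P(pass) = {pass_probability(h):.3f}")
print(f"decision boundary = {boundary_hours:.2f} hours")
```

The decision boundary comes out at about 2.71 hours: the model predicts a pass for students who studied longer than that and a fail otherwise, which is the rule that reg.score(X, Y) evaluates.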

3 Example 2: Probability of selling the special

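import pandas as pd
# Load the special-sales dataset
df = pd.read_csv('https://raw.githubusercontent.com/jmnote/zdata/master/logistic-regression/special-sales.csv')
print( df )

# Target: special_sales; predictors: busy_day and high_temperature
Y = df['special_sales']
X = df[['busy_day','high_temperature']]

from sklearn.linear_model import LogisticRegression
# A very large C makes the default L2 regularization negligible
reg = LogisticRegression(C=100000).fit(X, Y)
print( reg.coef_ )
print( reg.intercept_ )
# For a classifier, score() returns mean accuracy, not R²
print( "accuracy=", reg.score(X, Y) )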

→ Regression equation (x_1 = busy_day, x_2 = high_temperature) [math]\displaystyle{ y = \dfrac{1}{1 + \exp(-(2.44261279 x_1 + 0.54450301 x_2 - 15.20342824))} }[/math]
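
One common way to read these coefficients is as odds ratios: exp(coefficient) is the factor by which the odds y / (1 - y) are multiplied when that variable increases by one unit and the other is held fixed. The sketch below only reuses the two coefficients from the equation above; the dictionary names are illustrative.

```python
import math

# Coefficients copied from the regression equation above
coefficients = {
    "busy_day": 2.44261279,
    "high_temperature": 0.54450301,
}

# exp(coefficient) = multiplicative change in the odds per one-unit increase
odds_ratios = {name: math.exp(w) for name, w in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:<18} odds ratio = {ratio:.2f}")
```

Both ratios are greater than 1, so under this fitted model a larger busy_day or high_temperature value raises the estimated odds of a sale.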

4 Example 3: Breast cancer diagnosis

from sklearn.datasets import load_breast_cancer
# 30 numeric features; target is 0 = malignant, 1 = benign
breast_cancer = load_breast_cancer()
X = breast_cancer.data
y = breast_cancer.target

from sklearn.linear_model import LogisticRegression
model = LogisticRegression(solver='newton-cg')
result = model.fit(X, y)

# Print the fitted weight for each feature
for f, w in zip(breast_cancer.feature_names, result.coef_[0]):
  print(f"{f:<25} {w}")
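
The weights printed above are fitted on unscaled features, so their magnitudes depend on each feature's units, and the model is not checked on unseen data. A possible variation, sketched here as an assumption rather than as part of the original example, standardizes the features and scores the model on a held-out split. The test_size and random_state values are arbitrary illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Hold out 25% of the samples; split parameters are illustrative assumptions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Standardize the features so the fitted weights are on a comparable scale
pipeline = make_pipeline(StandardScaler(), LogisticRegression(solver='newton-cg'))
pipeline.fit(X_train, y_train)

# Mean accuracy on samples the model did not see during fitting
test_accuracy = pipeline.score(X_test, y_test)
print("held-out accuracy =", test_accuracy)
```

The fitted weights of the scaled model are available as pipeline[-1].coef_[0], in the same feature order as breast_cancer.feature_names.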

5 See also