Pandas: assigning group IDs to time-adjacent rows within a group

1 Overview

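Within each IP, rows whose timestamps are close together get the same group ID; a gap of more than 3600 seconds from the previous row of the same IP starts a new group.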
import numpy as np
import pandas as pd

df = pd.DataFrame([
['2017-01-01T00:00:00','123.45.67.89'],
['2017-01-01T00:00:30','123.45.67.89'],
['2017-01-01T00:30:00','100.45.67.89'],
['2017-01-01T00:30:30','123.45.67.89'],
['2017-01-01T00:31:00','123.45.67.89'],
['2017-01-01T00:31:30','123.45.67.89'],
['2017-01-01T01:00:00','123.45.67.89'],
['2017-01-01T01:00:30','100.45.67.89'],
['2017-01-01T03:00:00','123.45.67.89'],
['2017-01-01T03:00:30','100.45.67.89'],
['2017-01-01T03:05:00','100.45.67.89'],
['2017-01-01T03:05:30','123.45.67.89'],
],columns=['timestamp','ip'])
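# Parse the strings as datetimes; the rows are already in chronological order, which diff() relies on
df['timestamp'] = pd.to_datetime(df['timestamp'])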
df
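# Seconds since the previous row of the same IP (NaN for each IP's first row)
df['diff_seconds'] = df.groupby('ip')['timestamp'].diff()/np.timedelta64(1,'s')
# A gap over 3600 seconds starts a new group; count the breaks separately per IP
df['local_group_id'] = (df['diff_seconds']>3600).astype(int).groupby(df['ip']).cumsum()+1
df['group_id'] = df['ip'] + '--' + (df['local_group_id']).astype(str)
df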
import matplotlib.pyplot as plt
import seaborn as sns
fig = plt.figure(figsize=(12, 5))
# Number of rows in each group
sns.countplot(data=df, x='group_id')
plt.show()
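
The grouping step can also be wrapped in a small reusable function. The following is only a sketch: the function name assign_time_groups and its parameters key, ts, and max_gap_seconds are hypothetical names chosen for illustration, not part of pandas. It assumes the timestamp column is already a datetime column, and it sorts by key and timestamp first so the result does not depend on the input row order.

```python
import pandas as pd


def assign_time_groups(df, key="ip", ts="timestamp", max_gap_seconds=3600):
    """Return a copy of df with per-key 'local_group_id' and 'group_id' columns.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    out = df.sort_values([key, ts]).copy()
    # Seconds since the previous row of the same key (NaN for the first row).
    gap = out.groupby(key)[ts].diff().dt.total_seconds()
    # A gap above the threshold opens a new group; count the breaks per key.
    breaks = (gap > max_gap_seconds).astype(int)
    out["local_group_id"] = breaks.groupby(out[key]).cumsum() + 1
    out["group_id"] = out[key].astype(str) + "--" + out["local_group_id"].astype(str)
    # Restore the original row order.
    return out.sort_index()


demo = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2017-01-01T00:00:00", "2017-01-01T00:00:30",
        "2017-01-01T03:00:00", "2017-01-01T03:00:30",
        "2017-01-01T03:05:30",
    ]),
    "ip": ["123.45.67.89", "123.45.67.89", "123.45.67.89",
           "100.45.67.89", "123.45.67.89"],
})
result = assign_time_groups(demo)
print(result[["timestamp", "ip", "group_id"]])
```

In this small example the last row belongs to the same group as the 03:00:00 row of the same IP, even though a row from another IP sits between them, because the breaks are counted per IP.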

2 See also
