Ch11. Group Operations

BASHA TECH

Basha 2022. 9. 29. 18:17

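11-1. Data Aggregation

- Aggregating data with the groupby method

- Walking through the split-apply-combine process with the groupby method

- Aggregation methods used together with the groupby method

- Combining user-defined functions with the groupby method via the agg method

- Using several aggregation methods at once

11-2. Data Transformation

- Computing standard scores

- Filling missing values with the mean

11-3. Data Filtering

11-4. Group Objects

- Inspecting group objects

- Computing on a group object in one step

- Working with group objects

- Creating a group object from multiple columns and computing on it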

# apply method
import numpy as np
import pandas as pd

df = pd.read_csv('../data/gapminder.tsv', sep='\t')
df.head()

# Mean life expectancy (lifeExp) by year
df.groupby(by='year')

df.groupby(by='year')['lifeExp']

df.groupby(by='year')['lifeExp'].mean()

years = df['year'].unique()
years  # this is a NumPy ndarray, not a list

y1952 = df.loc[df.year == 1952,:]
y1952

y1952.lifeExp

y1952.lifeExp.mean()

years

for year in years:
    print(year)

for year in years:
    # Select the lifeExp values for one year in a single .loc call, then average them
    val = df.loc[df.year == year, 'lifeExp'].mean()
    print(f'{year} : {val}')
    # print(year, val)

df.groupby('year').lifeExp.mean()
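The loop and the groupby call above compute the same thing, which is the split-apply-combine idea in miniature. Here is a minimal sketch of that equivalence. The `toy` frame is made-up data standing in for the gapminder columns, not the real file.

```python
import pandas as pd

# Hypothetical toy data that mimics the 'year' and 'lifeExp' columns of gapminder.
toy = pd.DataFrame({
    'year': [1952, 1952, 1957, 1957],
    'lifeExp': [30.0, 40.0, 35.0, 45.0],
})

# Split: pick the rows for each year. Apply: take the mean. Combine: collect into a Series.
manual = pd.Series({
    year: toy.loc[toy['year'] == year, 'lifeExp'].mean()
    for year in toy['year'].unique()
})

# groupby performs the same three steps in one expression.
by_group = toy.groupby('year')['lifeExp'].mean()

print(manual.to_dict())
print(by_group.to_dict())
```

Both dictionaries should hold the same per-year means, so groupby is usually the shorter and faster way to write the loop.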

import seaborn as sns

tips_10 = sns.load_dataset('tips').sample(10)
tips_10

tips_10 = sns.load_dataset('tips').sample(10, random_state=42)
# With random_state fixed, the same 10 rows are drawn on every run.
tips_10
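To see the effect of random_state without downloading the tips data, here is a small sketch on a made-up frame (`numbers` is a hypothetical name). Two samples drawn with the same seed should contain identical rows.

```python
import pandas as pd

# Hypothetical frame with 100 rows; any DataFrame would do.
numbers = pd.DataFrame({'value': range(100)})

# The same seed gives the same sample both times.
first = numbers.sample(10, random_state=42)
second = numbers.sample(10, random_state=42)

print(first.equals(second))  # True
```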

grouped = tips_10.groupby('sex')
grouped

grouped.groups

grouped.groups['Male']  # a pandas Index of row labels for this group (not an ndarray)
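The row labels stored in `groups` can be passed to `.loc` to pull out the rows of one group, and `get_group` gives the same rows directly. A minimal sketch on made-up data follows. `toy_tips` and its index labels are assumptions, not rows from the real tips dataset.

```python
import pandas as pd

# Hypothetical four-row frame with non-consecutive row labels, like a sampled frame.
toy_tips = pd.DataFrame(
    {
        'sex': ['Male', 'Female', 'Male', 'Female'],
        'total_bill': [10.0, 20.0, 30.0, 40.0],
    },
    index=[24, 6, 153, 211],
)

grouped_toy = toy_tips.groupby('sex')

# groups maps each key to the row labels that belong to it.
male_labels = grouped_toy.groups['Male']

# Labels (not positions) are stored, so use .loc rather than .iloc.
male_rows = toy_tips.loc[male_labels]

# get_group returns the rows of one group without the manual lookup.
male_direct = grouped_toy.get_group('Male')

print(list(male_labels))
print(male_rows)
```

Because the values are labels, using them with `.iloc` on a sampled frame would either fail or pick the wrong rows.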

tips_10.iloc[:]

tips_10

tips_10.groupby(['sex','time'])

bill_sex_time = tips_10.groupby(['sex','time'])

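# numeric_only=True skips the categorical columns (smoker, day), which pandas 2.x refuses to average
group_avg = bill_sex_time.mean(numeric_only=True)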
group_avg

group_avg.columns

group_avg.index # multi index

group_avg.index[2][0]
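Grouping by two columns gives a MultiIndex whose entries are tuples, so `index[i][j]` picks level `j` of the `i`-th key. Here is a small sketch on made-up data with plain string columns instead of the categorical ones in tips. All names in it are hypothetical.

```python
import pandas as pd

# Hypothetical data: three distinct (sex, time) combinations.
toy = pd.DataFrame({
    'sex': ['Male', 'Male', 'Female', 'Female'],
    'time': ['Dinner', 'Lunch', 'Dinner', 'Dinner'],
    'total_bill': [10.0, 20.0, 30.0, 50.0],
})

avg = toy.groupby(['sex', 'time'])[['total_bill']].mean()

# Each index entry is a (sex, time) tuple; keys are sorted by default.
first_key = avg.index[0]
first_sex = avg.index[0][0]

# A full tuple key plus a column label selects a single value.
female_dinner = avg.loc[('Female', 'Dinner'), 'total_bill']

print(first_key, first_sex, female_dinner)
```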

# reset_index() turns the (sex, time) index levels back into ordinary columns
tips_10.groupby(['sex','time']).mean(numeric_only=True).reset_index()

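# as_index=True is the default and keeps the group keys in the index;
# as_index=False would return them as ordinary columns instead.
tips_10.groupby(['sex','time'], as_index=True)

tips_10.groupby(['sex','time'], as_index=True).mean(numeric_only=True)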
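For comparison, here is a minimal sketch of the two ways to get a flat result: calling `reset_index()` afterwards, or passing `as_index=False` up front. The `toy` frame is made-up data, and the sketch assumes plain string keys rather than the categorical columns in tips.

```python
import pandas as pd

# Hypothetical data with two grouping columns and one numeric column.
toy = pd.DataFrame({
    'sex': ['Male', 'Male', 'Female', 'Female'],
    'time': ['Dinner', 'Lunch', 'Dinner', 'Dinner'],
    'total_bill': [10.0, 20.0, 30.0, 50.0],
})

# Option 1: aggregate with the keys in the index, then move them back to columns.
flat_reset = toy.groupby(['sex', 'time']).mean(numeric_only=True).reset_index()

# Option 2: ask groupby to keep the keys as columns from the start.
flat_direct = toy.groupby(['sex', 'time'], as_index=False).mean(numeric_only=True)

print(flat_reset)
print(flat_direct)
```

On this data both frames should have the columns sex, time and total_bill with a default integer index.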
