[python] Python pandas - pd.cut()

2020. 10. 12. 15:58

pd.cut()

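- A pandas function used when you want to segment data values into bins and sort them.

- It can be used to divide values into categories, or combined with groupby() to compute specific values for each category.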
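
As a minimal sketch of the groupby() combination mentioned above, the snippet below bins a column and then averages it per bin. The DataFrame, the column names `score` and `grade`, and the label names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: one numeric column to be binned.
df = pd.DataFrame({"score": [1, 7, 5, 4, 6, 3]})

# Split the scores into three equal-width bins and name each bin.
df["grade"] = pd.cut(df["score"], 3, labels=["bad", "medium", "good"])

# Mean score per bin; observed=False keeps every category in the result,
# including bins that happen to be empty.
mean_by_grade = df.groupby("grade", observed=False)["score"].mean()
print(mean_by_grade)
```

Because pd.cut() returns a categorical column, the groups come out in bin order (bad, medium, good) rather than alphabetical order.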

Example

- Create a NumPy array

import pandas as pd
import numpy as np

data = np.array([1, 7, 5, 4, 6, 3])
data

>>> array([1, 7, 5, 4, 6, 3])

- Split into bins of equal width

pd.cut(data, 3)

>>> [(0.994, 3.0], (5.0, 7.0], (3.0, 5.0], (3.0, 5.0], (5.0, 7.0], (0.994, 3.0]]
Categories (3, interval[float64]): [(0.994, 3.0] < (3.0, 5.0] < (5.0, 7.0]]
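
The lowest edge is 0.994 rather than 1 because, with the default right=True, the intervals are open on the left, so the range is widened slightly to include the minimum value. The sketch below reproduces the edges by hand, assuming the lower edge is shifted by 0.1% of the data range; this is an illustration that matches the output above, not the library's actual source code.

```python
import numpy as np
import pandas as pd

data = np.array([1, 7, 5, 4, 6, 3])
n_bins = 3

# Equal-width edges between the minimum and the maximum of the data.
edges = np.linspace(data.min(), data.max(), n_bins + 1)

# Assumption: the first edge is lowered by 0.1% of the range so that
# the minimum value falls inside the first (left-open) interval.
edges[0] -= (data.max() - data.min()) * 0.001

# Compare with the edges pandas reports for the same input.
_, pandas_edges = pd.cut(data, n_bins, retbins=True)
print(edges)
print(pandas_edges)
```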

- Return the bins

- parameter => retbins

- Set retbins=True to also get the bin edges back. The default is False.

pd.cut(data, 3, retbins=True)

>>> ([(0.994, 3.0], (5.0, 7.0], (3.0, 5.0], (3.0, 5.0], (5.0, 7.0], (0.994, 3.0]]
 Categories (3, interval[float64]): [(0.994, 3.0] < (3.0, 5.0] < (5.0, 7.0]],
 array([0.994, 3.   , 5.   , 7.   ]))
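
One possible use of the returned edges is to bin other data with exactly the same intervals. The sketch below unpacks the returned tuple and applies the edges to a second array; `new_data` is a hypothetical example input.

```python
import numpy as np
import pandas as pd

data = np.array([1, 7, 5, 4, 6, 3])

# With retbins=True the result is a (categorical, edges) tuple.
binned, edges = pd.cut(data, 3, retbins=True)

# Hypothetical new observations, binned with the same edges.
new_data = np.array([2, 6, 10])
new_binned = pd.cut(new_data, bins=edges)

# Values outside the edges (10 here) become NaN.
print(new_binned)
```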

- Assign specific labels to the bins

- parameter => labels

- default = None

pd.cut(data, 3, labels=["bad", "medium", "good"])

>>> ['bad', 'good', 'medium', 'medium', 'good', 'bad']
Categories (3, object): ['bad' < 'medium' < 'good']
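
Besides a list of names, labels also accepts False, in which case pd.cut() returns only the integer position of each bin instead of a categorical. A short sketch using the same array:

```python
import numpy as np
import pandas as pd

data = np.array([1, 7, 5, 4, 6, 3])

# labels=False returns integer bin indicators (0 = lowest bin).
codes = pd.cut(data, 3, labels=False)
print(codes)
```

Here 0, 1 and 2 correspond to the bins labeled "bad", "medium" and "good" in the example above.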

- Assign labels without ordering the categories

- parameter => ordered

- default = True

pd.cut(np.array([1, 7, 5, 4, 6, 3]), 3, labels=["B", "A", "B"], ordered=False)

>>> ['B', 'B', 'A', 'A', 'B', 'B']
Categories (2, object): ['A', 'B']
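
The example above repeats the label "B", which only works because ordered=False. As a small sketch of the difference, the snippet below tries the same labels with the default ordered=True, which raises a ValueError, and then checks the `ordered` attribute of the unordered result.

```python
import numpy as np
import pandas as pd

data = np.array([1, 7, 5, 4, 6, 3])
labels = ["B", "A", "B"]  # "B" appears twice

# With the default ordered=True, duplicate labels are rejected.
raised = False
try:
    pd.cut(data, 3, labels=labels)
except ValueError as err:
    raised = True
    print("ValueError:", err)

# With ordered=False the duplicates are merged into one category.
unordered = pd.cut(data, 3, labels=labels, ordered=False)
print(unordered.ordered)
print(list(unordered.categories))
```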

Reference

pandas API Reference

pandas.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False, duplicates='raise', ordered=True)

Bin values into discrete intervals.

Use cut when you need to segment and sort data values into bins. This function is also useful for going from a continuous variable to a categorical variable. For example, cut could convert ages to groups of age ranges. Supports binning into an equal number of bins, or a pre-specified array of bins.
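
The age-group case described above can be sketched with a pre-specified array of bins. The ages, the bin edges and the group names below are hypothetical values chosen for illustration; right=False makes each interval closed on the left, so 18 counts as "adult" and 65 as "senior".

```python
import numpy as np
import pandas as pd

# Hypothetical ages and hand-picked bin edges.
ages = np.array([5, 17, 18, 35, 64, 65, 90])
bins = [0, 18, 65, 120]

# Intervals are [0, 18), [18, 65), [65, 120) because right=False.
groups = pd.cut(ages, bins=bins, labels=["child", "adult", "senior"], right=False)
print(groups)
```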
