crescent702.log

If you run into "StringIndexer: handling unseen labels" in Spark ML

rupert · July 21, 2021

Setting handleInvalid takes care of it:

StringIndexerModel.from_labels(labels, inputCol=categoricalCol, outputCol=categoricalCol + "Index", handleInvalid="keep")

Options

'error': throws an exception (which is the default)
'skip': skips the rows containing the unseen labels entirely (removes the rows on the output!)
'keep': puts unseen labels in a special additional bucket, at index numLabels
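The three options can be compared side by side with a small plain-Python model. This is not Spark's implementation, only a sketch of the behavior described above. The function name index_labels and its parameters are hypothetical.

```python
def index_labels(values, labels, handle_invalid="error"):
    """Model the three handleInvalid modes on a plain list of strings.

    Hypothetical helper for illustration only; it does not call Spark.
    """
    if handle_invalid not in ("error", "skip", "keep"):
        raise ValueError(f"Unknown handle_invalid: {handle_invalid}")

    # Known labels get indices 0 .. numLabels - 1, in list order.
    lookup = {label: float(i) for i, label in enumerate(labels)}
    indexed = []
    for value in values:
        if value in lookup:
            indexed.append((value, lookup[value]))
        elif handle_invalid == "keep":
            # All unseen labels share one extra bucket at index numLabels.
            indexed.append((value, float(len(labels))))
        elif handle_invalid == "skip":
            # The whole row is dropped from the output.
            continue
        else:
            # "error": fail on the first unseen label.
            raise ValueError(f"Unseen label: {value}")
    return indexed
```

For example, with labels ["a", "b"] and input ["a", "z", "b"], "keep" returns three rows with "z" at index 2.0, "skip" returns only the two known rows, and "error" raises.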

Ref

https://stackoverflow.com/questions/34681534/spark-ml-stringindexer-handling-unseen-labels
