R TermDocumentMatrix()

Revision as of 19:17, 8 December 2019 (Sun) by Jmnote (talk | contribs) (→See also)

1 Overview

R TermDocumentMatrix()
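  • Builds a "Term-Document Matrix": rows are terms, columns are documents, and each cell holds how often the term occurs in that document.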
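# Load the tm text-mining package and its bundled "crude" corpus
# (20 Reuters news articles about crude oil).
library("tm")
data("crude")
# Build the matrix, stripping punctuation and dropping stop words.
tdm <- TermDocumentMatrix(crude,
                          control = list(removePunctuation = TRUE,
                                         stopwords = TRUE))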
tdm
## <<TermDocumentMatrix (terms: 1000, documents: 20)>>
## Non-/sparse entries: 1738/18262
## Sparsity           : 91%
## Maximal term length: 16
## Weighting          : term frequency (tf)
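
The sparsity figure in the summary can be checked by hand from the two entry counts. The following is a minimal sketch that only redoes the arithmetic on the numbers printed above; the variable names are illustrative and not part of tm.

```r
# Counts copied from the "Non-/sparse entries: 1738/18262" line above.
nonzero_entries <- 1738
zero_entries <- 18262

# Every cell is either zero or non-zero, so the counts add up to
# terms * documents = 1000 * 20.
total_cells <- nonzero_entries + zero_entries

# Sparsity is the share of zero cells, shown as a whole percentage.
sparsity_pct <- round(zero_entries / total_cells * 100)
sparsity_pct  # 91
```

In other words, roughly nine out of ten term/document pairs never occur, which is why the result is stored as a sparse matrix rather than a dense one.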

inspect(tdm)
## <<TermDocumentMatrix (terms: 1000, documents: 20)>>
## Non-/sparse entries: 1738/18262
## Sparsity           : 91%
## Maximal term length: 16
## Weighting          : term frequency (tf)
## Sample             :
##         Docs
## Terms    144 236 237 242 246 248 273 489 502 704
##   bpd      4   7   0   0   0   2   8   0   0   0
##   crude    0   2   0   0   0   0   5   0   0   0
##   dlrs     0   2   1   0   0   4   2   1   1   0
##   last     1   4   3   0   2   1   7   0   0   0
##   market   3   0   0   2   0   8   1   0   0   2
##   mln      4   4   1   0   0   3   9   3   3   0
##   oil     12   7   3   3   5   9   5   4   5   3
##   opec    13   6   1   2   1   6   5   0   0   0
##   prices   5   5   1   2   1   9   5   2   2   3
##   said    11  10   1   3   5   7   8   2   2   4
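
inspect() prints only a sample of the matrix. To work with all counts, one option is to convert the object to an ordinary matrix and total each row. This is a hedged sketch rather than the only approach: the variable names and the threshold of 10 are arbitrary choices made for illustration, and as.matrix() produces a dense matrix, which is fine for a corpus this small but may not scale to large ones.

```r
library("tm")
data("crude")

# Same construction as in the example above.
tdm <- TermDocumentMatrix(crude,
                          control = list(removePunctuation = TRUE,
                                         stopwords = TRUE))

# Dense copy: one row per term, one column per document.
m <- as.matrix(tdm)

# Total occurrences of each term across all documents, most frequent first.
term_totals <- sort(rowSums(m), decreasing = TRUE)
head(term_totals, 10)

# tm's own helper: terms whose total count is at least 10
# (the threshold 10 is an assumption chosen for this sketch).
frequent_terms <- findFreqTerms(tdm, lowfreq = 10)
frequent_terms
```

findFreqTerms() works directly on the sparse object, so it is the lighter choice when only the term names are needed.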

2 See also

3 References
