
[R] Predicting the Future with Time Series Analysis

by AI_Wooah, March 1, 2022

https://polar-comet-18d.notion.site/R-a3e88edc781249d78609c316ee6b6583

 


Why forecasting matters

Understanding how things change over time is one of the key drivers of business success, and it is widely used in marketing and in shaping business strategy.

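What is time series analysis?

Time series analysis is a method that uses date-stamped data to forecast values at daily, monthly, or yearly granularity, or to monitor for anomalies.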
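As a rough illustration of the anomaly-monitoring use, the sketch below flags months whose value lies more than three standard deviations away from the mean of the preceding 12 months. The built-in AirPassengers series, the 12-month window, and the threshold of 3 are all assumptions chosen for illustration; a real monitoring setup would tune them to the data.

```r
# Rolling z-score: compare each month with the 12 months before it
x <- as.numeric(AirPassengers)
window <- 12            # assumed look-back length
threshold <- 3          # assumed cut-off in standard deviations

z <- rep(NA_real_, length(x))
for (i in (window + 1):length(x)) {
  past <- x[(i - window):(i - 1)]
  z[i] <- (x[i] - mean(past)) / sd(past)
}

# Positions and time stamps (year + fraction) of flagged months, if any
flagged <- which(abs(z) > threshold)
print(time(AirPassengers)[flagged])
```

The first 12 months have no look-back window, so their z-scores stay NA and are never flagged.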

It can also surface patterns such as seasonality, cycles, and trends.
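One simple way to see such patterns, sketched here with the built-in AirPassengers data (an assumption for illustration): averaging by calendar month exposes the seasonal shape, and summing by year exposes the long-term trend.

```r
# Seasonal pattern: mean passengers for each calendar month (1 = Jan, ..., 12 = Dec)
monthly_mean <- tapply(as.numeric(AirPassengers),
                       as.integer(cycle(AirPassengers)),
                       mean)

# Trend: total passengers per year
yearly_total <- aggregate(AirPassengers, nfrequency = 1, FUN = sum)

print(round(monthly_mean, 1))
print(yearly_total)
```

In this data set the summer months have the highest averages and November the lowest, while the yearly totals rise every year.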

The procedure is:

date data → time series analysis → numeric forecast

Knowing the likely future values in advance makes it possible to prepare for a sharp rise or fall.
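The three-step flow can be sketched as follows. The daily sales table is synthetic data generated only for this illustration, and the ARIMA(0,1,1) order is an arbitrary assumption rather than a recommended model.

```r
set.seed(1)

# Hypothetical date data: three years of daily sales (synthetic, illustration only)
daily <- data.frame(
  date = seq(as.Date("2020-01-01"), as.Date("2022-12-31"), by = "day")
)
daily$sales <- 100 + 0.05 * seq_len(nrow(daily)) + rnorm(nrow(daily), sd = 5)

# Step 1: date data -> monthly totals (names sort as "2020-01", "2020-02", ...)
monthly <- tapply(daily$sales, format(daily$date, "%Y-%m"), sum)

# Step 2: monthly totals -> ts object starting January 2020
sales_ts <- ts(as.numeric(monthly), start = c(2020, 1), frequency = 12)

# Step 3: fit a model and forecast the next six months
fit_sales <- arima(sales_ts, order = c(0, 1, 1))   # assumed model order
fc_sales  <- predict(fit_sales, n.ahead = 6)

print(fc_sales$pred)
```

The forecast object keeps the time index, so the six predicted values are labelled January to June 2023.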

Types of time series analysis

  • Regression-based methods
  • Exponential smoothing, decomposition methods, and others
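To make the first two families concrete, here is a minimal sketch on the built-in AirPassengers series. The log transform, the month dummies, and the multiplicative Holt-Winters setting are assumptions picked for this data set, not the only way to apply either method.

```r
# Regression approach: log passengers ~ linear time trend + month dummies
air <- data.frame(
  log_y    = as.numeric(log(AirPassengers)),
  time_idx = as.numeric(time(AirPassengers)),
  month    = factor(as.integer(cycle(AirPassengers)), levels = 1:12)
)
reg_fit <- lm(log_y ~ time_idx + month, data = air)

# Forecast the 12 months of 1961 and return to the original scale
new_months <- data.frame(
  time_idx = 1961 + (0:11) / 12,
  month    = factor(1:12, levels = 1:12)
)
reg_fc <- exp(predict(reg_fit, newdata = new_months))

# Exponential smoothing approach: Holt-Winters with multiplicative seasonality
hw_fit <- HoltWinters(AirPassengers, seasonal = "multiplicative")
hw_fc  <- predict(hw_fit, n.ahead = 12)

print(round(reg_fc))
print(round(hw_fc))
```

Both forecasts cover the same 12 months, so they can be compared side by side before choosing a method.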

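Basic statistics

## Basic statistics in R

# 1) Load the data
# iris and mtcars ship with R; head() shows the first six rows
head(iris)
head(mtcars)

# 2) Compute summary statistics
# Mean, median, and standard deviation
m1 <- mean(iris$Sepal.Length)
m1
m2 <- median(iris$Sepal.Length)
m2
s1 <- sd(iris$Sepal.Length)
s1

# 3) Standardize
# Subtract the mean, then divide by the standard deviation
# Sepal.Length_z is the standardized Sepal.Length
iris$Sepal.Length_z <- (iris$Sepal.Length - m1) / s1
head(iris)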

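# 4) Correlation analysis
# Compute the correlation for every pair of continuous variables in iris,
# using the commonly used Pearson coefficient.
# cor(iris, method = "pearson") fails with "'x' must be numeric" because
# Species is a factor, so only the four numeric columns are passed.
cor(iris[, 1:4], method = "pearson")

# Requires install.packages("PerformanceAnalytics") once
library(PerformanceAnalytics)
chart.Correlation(iris[, 1:4], pch = 19)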

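# 5) t-test
# Used to compare the means of two groups
# Null hypothesis: mean Sepal.Length is the same for setosa and virginica
# droplevels() removes the unused versicolor level so the boxplot shows two groups only
iris_test <- droplevels(subset(iris, Species == "setosa" | Species == "virginica"))
boxplot(Sepal.Length ~ Species, data = iris_test)
t.test(iris_test$Sepal.Length ~ iris_test$Species, var.equal = TRUE)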

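# 6) Chi-squared test
# Tests whether engine shape (vs) and number of cylinders (cyl) are independent
# R warns that the approximation may be incorrect here because some cell counts are small
table(mtcars$vs, mtcars$cyl)
chisq.test(mtcars$vs, mtcars$cyl)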
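A few cross-checks on the steps above, offered as a sketch rather than as part of the original walkthrough: scale() reproduces the manual standardization, t.test() without var.equal runs the Welch version that does not assume equal variances, and Fisher's exact test is a common fallback when the chi-squared approximation is doubtful. The object names are made up for this example.

```r
# Standardization: scale() gives the same z-scores as (x - mean) / sd
z_manual <- (iris$Sepal.Length - mean(iris$Sepal.Length)) / sd(iris$Sepal.Length)
z_scale  <- as.numeric(scale(iris$Sepal.Length))
same_z   <- isTRUE(all.equal(z_manual, z_scale))

# t-test without the equal-variance assumption (Welch, the default of t.test)
two_species <- droplevels(subset(iris, Species %in% c("setosa", "virginica")))
welch <- t.test(Sepal.Length ~ Species, data = two_species)

# Chi-squared check: expected counts below 5 make the approximation unreliable,
# so Fisher's exact test is often used for small tables
tab      <- table(mtcars$vs, mtcars$cyl)
expected <- suppressWarnings(chisq.test(tab))$expected
fisher   <- fisher.test(tab)

print(same_z)
print(welch)
print(round(expected, 2))
print(fisher)
```

Because both species have 50 observations, the Welch t statistic matches the pooled one; only the degrees of freedom differ.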

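How to run a time series analysis

## Time series forecasting

# 1) Load the data
# AirPassengers: monthly airline passenger counts, 1949 to 1960 (built into R)
AirPassengers

# 2) Plot the series
ts.plot(AirPassengers)
title("Monthly airline passengers, 1949-1960")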

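# 3) Decompose the series
# Split the data into components (decompose() uses an additive model by default)
# Stored as `decomp` so the base function ts() is not shadowed
decomp <- decompose(AirPassengers)
decomp
# plot() draws four panels:
# observed data
# long-term trend
# seasonal pattern
# random remainder
plot(decomp)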

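# 4) Fit the model
# Seasonal ARIMA(1,0,0)(2,1,0) with period 12; the seasonal part is passed by name for clarity
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))
fit

# 5) Forecast
# Forecast the next 24 months; stored as `pred` so the function name predict is not reused
pred <- predict(fit, n.ahead = 24)
pred

# 6) Visualize
# Two series are drawn (actual and forecast), so two colours and two line types are enough
ts.plot(AirPassengers, pred$pred, col = c(1, 2), lty = c(1, 1))
# graphics:: is explicit because PerformanceAnalytics, loaded earlier, masks legend()
graphics::legend("topleft", c("Actual", "Forecast"), col = c(1, 2), lty = c(1, 1))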
  • result
  • 더보기
    grades grades 2group, group, 2halt halt 2hit hit 2hope hope 2however, however, 2humanistic humanistic 2impact impact 2import import 2imports imports 2inc. inc. 2increased increased 2increasing increasing 2indonesia, indonesia, 2industry. industry. 2instead instead 2institute institute 2intermediate intermediate 2january, january, 2keep keep 2kuwait's kuwait's 2late late 2light light 2limit limit 2line line 2lost lost 2louisiana louisiana 2lowest lowest 2major major 2market. market. 2markets. markets. 2mckiernan mckiernan 2meeting." meeting." 2mid mid 2ministers ministers 2mitigate mitigate 2mizrahi mizrahi 2mln, mln, 2month, month, 2months months 2moves moves 2named named 2nearing nearing 2net net 2neutral neutral 2nuclear nuclear 2officials officials 2oil, oil, 2oil. oil. 2one, one, 2opec's opec's 2opec"s opec"s 2organisation. organisation. 2organization organization 2our our 2overseas overseas 2pact pact 2pay pay 2pct. pct. 2petroliferos petroliferos 2planned planned 2port port 2positive positive 2postings postings 2predicted predicted 2press press 2pressure pressure 2pricing pricing 2private private 2probably probably 2producer producer 2pronounced pronounced 2public public 2put put 2quota. quota. 2quotes quotes 2raise raise 2rate rate 2reduction reduction 2referring referring 2remarks remarks 2return return 2revenues revenues 2review review 2rise rise 2rise. rise. 2risks risks 2riyal riyal 2riyals. riyals. 2said, said, 2said: said: 2says. says. 2sector sector 2self-imposed self-imposed 2selling selling 2shortfall shortfall 2should should 2spokeswoman. spokeswoman. 2spot spot 2steady steady 2stick stick 2strongly strongly 2studies, studies, 2support support 2sweet sweet 2take take 2taken taken 2techniques. techniques. 2them them 2then then 2those those 2three three 2throughput throughput 2today, today, 2today. today. 
2trading trading 2transaction transaction 2trust trust 2trying trying 2uncertainty uncertainty 2union union 2until until 2value value 2wam wam 2wanted wanted 2weakness weakness 2weeks weeks 2what what 2who who 2winter winter 2yacimientos yacimientos 2yanbu yanbu 2years years 2yesterday yesterday 2yesterday. yesterday. 2york york 2zone zone 2"(it) "(it) 1"demand "demand 1"expansion "expansion 1"for "for 1"growth "growth 1"if "if 1"may "may 1"opec's "opec's 1"our "our 1"there "there 1"they "they 1"will "will 1(bpd). (bpd). 1 [ reached 'max' / getOption("max.print") -- omitted 766 rows ]> tmError: object 'tm' not found> crideError: object 'cride' not found> crude<>Metadata: corpus specific: 0, document level (indexed): 0Content: documents: 20> tdm<>Non-/sparse entries: 2255/23065Sparsity : 91%Maximal term length: 17Weighting : term frequency (tf)> # 3) 단어 탐색> inspect(tdm)<>Non-/sparse entries: 2255/23065Sparsity : 91%Maximal term length: 17Weighting : term frequency (tf)Sample : DocsTerms 144 236 237 242 246 248 273 489 502 704 and 9 7 11 3 9 6 5 5 6 5 for 5 4 3 1 6 2 4 4 5 3 its 6 8 3 0 3 2 0 2 2 1 mln 4 4 1 0 0 3 9 2 2 0 oil 11 7 3 3 4 9 5 4 4 3 opec 10 6 1 2 1 6 5 0 0 0 prices 3 2 0 1 0 7 4 2 2 2 said 9 6 0 3 4 5 5 2 2 3 that 10 4 1 0 2 2 0 1 1 3 the 17 15 30 6 18 27 21 8 13 21> # 4) 10회 이상 존재하는 단어만 출력> findFreqTerms(tdm,lowfreq=10) [1] "about" "and" "are" "bpd" "but" "crude" "dlrs" "for" [9] "from" "government" "has" "its" "kuwait" "last" "market" "mln" [17] "new" "not" "official" "oil" "one" "opec" "pct" "price" [25] "prices" "reuter" "said" "said." "saudi" "sheikh" "that" "the" [33] "they" "u.s." 
"was" "were" "will" "with" "would" > # 1) 패키지 불러오기> library(rJava)Error in library(rJava) : there is no package called ‘rJava’> install.packages("rJava")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/rJava_1.0-6.tgz'Content> type 'application/x-gzip' length 1117163 bytes (1.1 MB)==================================================downloaded 1.1 MBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//Rtmp731rdd/downloaded_packagesRestarting R session...> install.packages("NLP")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/NLP_0.2-1.tgz'Content> type 'application/x-gzip' length 389630 bytes (380 KB)==================================================downloaded 380 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> install.packages("wordcloud")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/wordcloud_2.6.tgz'Content> type 'application/x-gzip' length 240231 bytes (234 KB)==================================================downloaded 234 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> install.packages("plyr")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/plyr_1.8.6.tgz'Content> type 'application/x-gzip' length 971470 bytes (948 KB)==================================================downloaded 948 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> install.packages("twitteR")Warning in install.packages : dependency ‘rjson’ is not availablealso installing the dependency ‘DBI’trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/DBI_1.1.2.tgz'Content> type 'application/x-gzip' length 667638 bytes (651 KB)==================================================downloaded 651 KBtrying URL 
'<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/twitteR_1.1.9.tgz'Content> type 'application/x-gzip' length 537986 bytes (525 KB)==================================================downloaded 525 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> install.packages("RColorBrewer")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/RColorBrewer_1.1-2.tgz'Content> type 'application/x-gzip' length 53161 bytes (51 KB)==================================================downloaded 51 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> install.packages("ROAuth")also installing the dependencies ‘bitops’, ‘RCurl’trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/bitops_1.0-7.tgz'Content> type 'application/x-gzip' length 29283 bytes (28 KB)==================================================downloaded 28 KBtrying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/RCurl_1.98-1.6.tgz'Content> type 'application/x-gzip' length 1031584 bytes (1007 KB)==================================================downloaded 1007 KBtrying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/ROAuth_0.9.6.tgz'Content> type 'application/x-gzip' length 66860 bytes (65 KB)==================================================downloaded 65 KBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> # 1) 패키지 불러오기> library(rJava)> library(KoNLP)Error in library(KoNLP) : there is no package called ‘KoNLP’> library(wordcloud)Loading required package: RColorBrewer> library(RColorBrewer)> library(wordcloud)> library(NLP)> library(wordcloud)> library(twitteR)Error: package or namespace load failed for ‘twitteR’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is no package called ‘rjson’> library(plyr)> 
library(twitteR)Error: package or namespace load failed for ‘twitteR’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is no package called ‘rjson’> install.packages("RJSONIO")trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/RJSONIO_1.3-1.6.tgz'Content> type 'application/x-gzip' length 1368906 bytes (1.3 MB)==================================================downloaded 1.3 MBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> library(twitteR)Error: package or namespace load failed for ‘twitteR’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is no package called ‘rjson’> library(tm)> library(ROAuth)> library(ggplot2) Attaching package: ‘ggplot2’The following object is masked from ‘package:NLP’: annotate> # 2)트위터 계정 접속> api_key <- '입력필요'> library(ggplot2) > library(RJSONIO)> library(twitteR)Error: package or namespace load failed for ‘twitteR’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]): there is no package called ‘rjson’> # 2)트위터 계정 접속> api_key <- '입력필요'> api_secret <- '입력필요'> access_token <- '입력필요'> access_token_secret <- '입력필요'> setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)Error in setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret) : could not find function "setup_twitter_oauth"> # 3) 트럼프 관련 트윗 수집> keyword <- enc2utf8("#트럼프")> result <- searchTwitter(keyword, since='2016-10-01',lang="ko",n=10000)Error in searchTwitter(keyword, since = "2016-10-01", lang = "ko", n = 10000) : could not find function "searchTwitter"> # 4) 문자에 해당하는 부분만 추출> result.df <- twListToDF(result)Error in twListToDF(result) : could not find function "twListToDF"> result.text <- result.df$textError: object 'result.df' not found> # 5) 정제 작업> result.text <- gsub("\\n", "", result.text)Error in gsub("\\n", "", result.text) : object 'result.text' not found> result.text <- 
gsub("\\r", "", result.text)Error in gsub("\\r", "", result.text) : object 'result.text' not found> result.text <- gsub("RT", "", result.text)Error in gsub("RT", "", result.text) : object 'result.text' not found> result.text <- gsub("http", "", result.text)Error in gsub("http", "", result.text) : object 'result.text' not found> result.text <- gsub("CO", "", result.text)Error in gsub("CO", "", result.text) : object 'result.text' not found> result.text <- gsub("co", "", result.text)Error in gsub("co", "", result.text) : object 'result.text' not found> result.text <- gsub("ㅋㅋ", "", result.text)Error in gsub("ㅋㅋ", "", result.text) : object 'result.text' not found> result.text <- gsub("ㅋㅋㅋ", "", result.text)Error in gsub("ㅋㅋㅋ", "", result.text) : object 'result.text' not found> result.text <- gsub("ㅋㅋㅋㅋ", "", result.text)Error in gsub("ㅋㅋㅋㅋ", "", result.text) : object 'result.text' not found> result.text <- gsub("ㅠㅠ", "", result.text)Error in gsub("ㅠㅠ", "", result.text) : object 'result.text' not found> result_nouns <- Map(extractNoun, result.text)Error in match.fun(f) : object 'extractNoun' not found> result_wordsvec <- unlist(result_nouns, use.name=F)Error in unlist(result_nouns, use.name = F) : object 'result_nouns' not found> result_wordsvec <- result_wordsvec[-which(result_wordsvec %in% stopwords("english"))]Error: object 'result_wordsvec' not found> result_wordsvec <- gsub("[[:punct:]]","", result_wordsvec)Error in gsub("[[:punct:]]", "", result_wordsvec) : object 'result_wordsvec' not found> result_wordsvec <- Filter(function(x){nchar(x)>=2}, result_wordsvec)Error in lapply(x, f) : object 'result_wordsvec' not found> # 문자 카운팅> result_wordcount <- table(result_wordsvec)Error in table(result_wordsvec) : object 'result_wordsvec' not found> 트위터API설정절차 <- read.table("~/DataAnalytics/R/R_실습예제/lesson_7/data_file/트위터API설정절차.txt", header=TRUE, quote="\\"")Error in read.table("~/DataAnalytics/R/R_실습예제/lesson_7/data_file/트위터API설정절차.txt", : more columns than column namesIn 
addition: Warning messages:1: In grep("^[^#].*", lines, value = TRUE) : input string 4 is invalid in this locale2: In grep("^[^#].*", lines, value = TRUE) : input string 5 is invalid in this locale3: In grep("^[^#].*", lines, value = TRUE) : input string 6 is invalid in this locale4: In grep("^[^#].*", lines, value = TRUE) : input string 7 is invalid in this locale5: In grep("^[^#].*", lines, value = TRUE) : input string 8 is invalid in this locale> View(트위터API설정절차)Error in View : object '트위터API설정절차' not found> # 1) 데이터 불러오기기> iris Sepal.Length Sepal.Width Petal.Length Petal.Width Species1 5.1 3.5 1.4 0.2 setosa2 4.9 3.0 1.4 0.2 setosa3 4.7 3.2 1.3 0.2 setosa4 4.6 3.1 1.5 0.2 setosa5 5.0 3.6 1.4 0.2 setosa6 5.4 3.9 1.7 0.4 setosa7 4.6 3.4 1.4 0.3 setosa8 5.0 3.4 1.5 0.2 setosa9 4.4 2.9 1.4 0.2 setosa10 4.9 3.1 1.5 0.1 setosa11 5.4 3.7 1.5 0.2 setosa12 4.8 3.4 1.6 0.2 setosa13 4.8 3.0 1.4 0.1 setosa14 4.3 3.0 1.1 0.1 setosa15 5.8 4.0 1.2 0.2 setosa16 5.7 4.4 1.5 0.4 setosa17 5.4 3.9 1.3 0.4 setosa18 5.1 3.5 1.4 0.3 setosa19 5.7 3.8 1.7 0.3 setosa20 5.1 3.8 1.5 0.3 setosa21 5.4 3.4 1.7 0.2 setosa22 5.1 3.7 1.5 0.4 setosa23 4.6 3.6 1.0 0.2 setosa24 5.1 3.3 1.7 0.5 setosa25 4.8 3.4 1.9 0.2 setosa26 5.0 3.0 1.6 0.2 setosa27 5.0 3.4 1.6 0.4 setosa28 5.2 3.5 1.5 0.2 setosa29 5.2 3.4 1.4 0.2 setosa30 4.7 3.2 1.6 0.2 setosa31 4.8 3.1 1.6 0.2 setosa32 5.4 3.4 1.5 0.4 setosa33 5.2 4.1 1.5 0.1 setosa34 5.5 4.2 1.4 0.2 setosa35 4.9 3.1 1.5 0.2 setosa36 5.0 3.2 1.2 0.2 setosa37 5.5 3.5 1.3 0.2 setosa38 4.9 3.6 1.4 0.1 setosa39 4.4 3.0 1.3 0.2 setosa40 5.1 3.4 1.5 0.2 setosa41 5.0 3.5 1.3 0.3 setosa42 4.5 2.3 1.3 0.3 setosa43 4.4 3.2 1.3 0.2 setosa44 5.0 3.5 1.6 0.6 setosa45 5.1 3.8 1.9 0.4 setosa46 4.8 3.0 1.4 0.3 setosa47 5.1 3.8 1.6 0.2 setosa48 4.6 3.2 1.4 0.2 setosa49 5.3 3.7 1.5 0.2 setosa50 5.0 3.3 1.4 0.2 setosa51 7.0 3.2 4.7 1.4 versicolor52 6.4 3.2 4.5 1.5 versicolor53 6.9 3.1 4.9 1.5 versicolor54 5.5 2.3 4.0 1.3 versicolor55 6.5 2.8 4.6 1.5 versicolor56 5.7 2.8 4.5 1.3 
versicolor57 6.3 3.3 4.7 1.6 versicolor58 4.9 2.4 3.3 1.0 versicolor59 6.6 2.9 4.6 1.3 versicolor60 5.2 2.7 3.9 1.4 versicolor61 5.0 2.0 3.5 1.0 versicolor62 5.9 3.0 4.2 1.5 versicolor63 6.0 2.2 4.0 1.0 versicolor64 6.1 2.9 4.7 1.4 versicolor65 5.6 2.9 3.6 1.3 versicolor66 6.7 3.1 4.4 1.4 versicolor67 5.6 3.0 4.5 1.5 versicolor68 5.8 2.7 4.1 1.0 versicolor69 6.2 2.2 4.5 1.5 versicolor70 5.6 2.5 3.9 1.1 versicolor71 5.9 3.2 4.8 1.8 versicolor72 6.1 2.8 4.0 1.3 versicolor73 6.3 2.5 4.9 1.5 versicolor74 6.1 2.8 4.7 1.2 versicolor75 6.4 2.9 4.3 1.3 versicolor76 6.6 3.0 4.4 1.4 versicolor77 6.8 2.8 4.8 1.4 versicolor78 6.7 3.0 5.0 1.7 versicolor79 6.0 2.9 4.5 1.5 versicolor80 5.7 2.6 3.5 1.0 versicolor81 5.5 2.4 3.8 1.1 versicolor82 5.5 2.4 3.7 1.0 versicolor83 5.8 2.7 3.9 1.2 versicolor84 6.0 2.7 5.1 1.6 versicolor85 5.4 3.0 4.5 1.5 versicolor86 6.0 3.4 4.5 1.6 versicolor87 6.7 3.1 4.7 1.5 versicolor88 6.3 2.3 4.4 1.3 versicolor89 5.6 3.0 4.1 1.3 versicolor90 5.5 2.5 4.0 1.3 versicolor91 5.5 2.6 4.4 1.2 versicolor92 6.1 3.0 4.6 1.4 versicolor93 5.8 2.6 4.0 1.2 versicolor94 5.0 2.3 3.3 1.0 versicolor95 5.6 2.7 4.2 1.3 versicolor96 5.7 3.0 4.2 1.2 versicolor97 5.7 2.9 4.2 1.3 versicolor98 6.2 2.9 4.3 1.3 versicolor99 5.1 2.5 3.0 1.1 versicolor100 5.7 2.8 4.1 1.3 versicolor101 6.3 3.3 6.0 2.5 virginica102 5.8 2.7 5.1 1.9 virginica103 7.1 3.0 5.9 2.1 virginica104 6.3 2.9 5.6 1.8 virginica105 6.5 3.0 5.8 2.2 virginica106 7.6 3.0 6.6 2.1 virginica107 4.9 2.5 4.5 1.7 virginica108 7.3 2.9 6.3 1.8 virginica109 6.7 2.5 5.8 1.8 virginica110 7.2 3.6 6.1 2.5 virginica111 6.5 3.2 5.1 2.0 virginica112 6.4 2.7 5.3 1.9 virginica113 6.8 3.0 5.5 2.1 virginica114 5.7 2.5 5.0 2.0 virginica115 5.8 2.8 5.1 2.4 virginica116 6.4 3.2 5.3 2.3 virginica117 6.5 3.0 5.5 1.8 virginica118 7.7 3.8 6.7 2.2 virginica119 7.7 2.6 6.9 2.3 virginica120 6.0 2.2 5.0 1.5 virginica121 6.9 3.2 5.7 2.3 virginica122 5.6 2.8 4.9 2.0 virginica123 7.7 2.8 6.7 2.0 virginica124 6.3 2.7 4.9 1.8 virginica125 6.7 3.3 5.7 
2.1 virginica126 7.2 3.2 6.0 1.8 virginica127 6.2 2.8 4.8 1.8 virginica128 6.1 3.0 4.9 1.8 virginica129 6.4 2.8 5.6 2.1 virginica130 7.2 3.0 5.8 1.6 virginica131 7.4 2.8 6.1 1.9 virginica132 7.9 3.8 6.4 2.0 virginica133 6.4 2.8 5.6 2.2 virginica134 6.3 2.8 5.1 1.5 virginica135 6.1 2.6 5.6 1.4 virginica136 7.7 3.0 6.1 2.3 virginica137 6.3 3.4 5.6 2.4 virginica138 6.4 3.1 5.5 1.8 virginica139 6.0 3.0 4.8 1.8 virginica140 6.9 3.1 5.4 2.1 virginica141 6.7 3.1 5.6 2.4 virginica142 6.9 3.1 5.1 2.3 virginica143 5.8 2.7 5.1 1.9 virginica144 6.8 3.2 5.9 2.3 virginica145 6.7 3.3 5.7 2.5 virginica146 6.7 3.0 5.2 2.3 virginica147 6.3 2.5 5.0 1.9 virginica148 6.5 3.0 5.2 2.0 virginica149 6.2 3.4 5.4 2.3 virginica150 5.9 3.0 5.1 1.8 virginica> mtcars mpg cyl disp hp drat wt qsec vs am gear carbMazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2AMC Javelin 15.2 8 304.0 150 3.15 
3.435 17.30 0 0 3 2Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2> # 2) 다양한 통계값 산출> # 평균, 중간값, 표준편차 구하기> m1<-mean(iris$Sepal.Length)> m1[1] 5.843333> m2<-median(iris$Sepal.Length)> m2[1] 5.8> s1Error: object 's1' not found> s1<-sd(iris$Sepal.Length)> s1[1] 0.8280661> iris$Sepal.Length_z<-(iris$Sepal.Length-m1)/s1> head(iris) Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_z1 5.1 3.5 1.4 0.2 setosa -0.89767392 4.9 3.0 1.4 0.2 setosa -1.13920053 4.7 3.2 1.3 0.2 setosa -1.38072714 4.6 3.1 1.5 0.2 setosa -1.50149045 5.0 3.6 1.4 0.2 setosa -1.01843726 5.4 3.9 1.7 0.4 setosa -0.5353840> # 4) 상관분석> # iris 데이터에 있는 연속형 변수 각각의 조합과 상관관계를 파악한다> # 일반적으로 많이 사용되는 pearson상관계수로 계산한다.> cor(iris,method=c("pearson"))Error in cor(iris, method = c("pearson")) : 'x' must be numeric> # 4) 상관분석> # iris 데이터에 있는 연속형 변수 각각의 조합과 상관관계를 파악한다> # 일반적으로 많이 사용되는 pearson상관계수로 계산한다.> # cor(iris,method=c("pearson"))> cor(iris[,c(1:4)],method=c("pearson")) Sepal.Length Sepal.Width Petal.Length Petal.WidthSepal.Length 1.0000000 -0.1175698 0.8717538 0.8179411Sepal.Width -0.1175698 1.0000000 -0.4284401 -0.3661259Petal.Length 0.8717538 -0.4284401 1.0000000 0.9628654Petal.Width 0.8179411 -0.3661259 0.9628654 1.0000000> library(PerformanceAnalytics)Error in library(PerformanceAnalytics) : there is no package called ‘PerformanceAnalytics’> install.packages("PerformanceAnalytics")also installing the dependencies ‘xts’, ‘quadprog’trying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/xts_0.12.1.tgz'Content> type 'application/x-gzip' length 
930515 bytes (908 KB)==================================================downloaded 908 KBtrying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/quadprog_1.5-8.tgz'Content> type 'application/x-gzip' length 39295 bytes (38 KB)==================================================downloaded 38 KBtrying URL '<https://cran.rstudio.com/bin/macosx/el-capitan/contrib/3.6/PerformanceAnalytics_2.0.4.tgz'Content> type 'application/x-gzip' length 3161270 bytes (3.0 MB)==================================================downloaded 3.0 MBThe downloaded binary packages are in /var/folders/7f/v36kfprn6n75t6zv9kbffnz40000gn/T//RtmpJbSSo9/downloaded_packages> library(PerformanceAnalytics)Loading required package: xtsLoading required package: zooAttaching package: ‘zoo’The following objects are masked from ‘package:base’: as.Date, as.Date.numericAttaching package: ‘PerformanceAnalytics’The following object is masked from ‘package:wordcloud’: textplotThe following object is masked from ‘package:graphics’: legend> library(PerformanceAnalytics)> chart.Correlation(iris[,c(1:4)],pch=19)> # 5) t 검정 > iris_test<-subset(iris,Species=="setosa" | Species =="virginica")> library(PerformanceAnalytics)> chart.Correlation(iris[,c(1:4)],pch=19)> # 5) t 검정 > # 평균비교할때 사용한다> iris_test<-subset(iris,Species=="setosa" | Species =="virginica")> boxplot(Sepal.Length~Species,data=iris_test)> t.test(iris_test$Sepal.Length~iris_test$Species,var.equal=T) Two Sample t-testdata: iris_test$Sepal.Length by iris_test$Speciest = -15.386, df = 98, p-value < 2.2e-16alternative hypothesis: true difference in means is not equal to 095 percent confidence interval: -1.786042 -1.377958sample estimates: mean in group setosa mean in group virginica 5.006 6.588 > # 6) 카이제곱 검정> table(mtcars$vs,mtcars$cyl) 4 6 8 0 1 3 14 1 10 4 0> chisq.test(mtcars$vs,mtcars$cyl) Pearson's Chi-squared testdata: mtcars$vs and mtcars$cylX-squared = 21.34, df = 2, p-value = 2.323e-05Warning message:In chisq.test(mtcars$vs, mtcars$cyl) : 
Chi-squared approximation may be incorrect> AirPassengers Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec1949 112 118 132 129 121 135 148 148 136 119 104 1181950 115 126 141 135 125 149 170 170 158 133 114 1401951 145 150 178 163 172 178 199 199 184 162 146 1661952 171 180 193 181 183 218 230 242 209 191 172 1941953 196 196 236 235 229 243 264 272 237 211 180 2011954 204 188 235 227 234 264 302 293 259 229 203 2291955 242 233 267 269 270 315 364 347 312 274 237 2781956 284 277 317 313 318 374 413 405 355 306 271 3061957 315 301 356 348 355 422 465 467 404 347 305 3361958 340 318 362 348 363 435 491 505 404 359 310 3371959 360 342 406 396 420 472 548 559 463 407 362 4051960 417 391 419 461 472 535 622 606 508 461 390 432> AirPassengers Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec1949 112 118 132 129 121 135 148 148 136 119 104 1181950 115 126 141 135 125 149 170 170 158 133 114 1401951 145 150 178 163 172 178 199 199 184 162 146 1661952 171 180 193 181 183 218 230 242 209 191 172 1941953 196 196 236 235 229 243 264 272 237 211 180 2011954 204 188 235 227 234 264 302 293 259 229 203 2291955 242 233 267 269 270 315 364 347 312 274 237 2781956 284 277 317 313 318 374 413 405 355 306 271 3061957 315 301 356 348 355 422 465 467 404 347 305 3361958 340 318 362 348 363 435 491 505 404 359 310 3371959 360 342 406 396 420 472 548 559 463 407 362 4051960 417 391 419 461 472 535 622 606 508 461 390 432> # 2) 시계열 데이터 시각화> ts.plot(AirPassengers)> title("1949~1960년 월별 탑승 승객")> # 3) 시계열 분해> # 데이터 쪼개기> ts<-decompose(AirPassengers)> ts$x Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec1949 112 118 132 129 121 135 148 148 136 119 104 1181950 115 126 141 135 125 149 170 170 158 133 114 1401951 145 150 178 163 172 178 199 199 184 162 146 1661952 171 180 193 181 183 218 230 242 209 191 172 1941953 196 196 236 235 229 243 264 272 237 211 180 2011954 204 188 235 227 234 264 302 293 259 229 203 2291955 242 233 267 269 270 315 364 347 312 274 237 2781956 284 277 317 313 318 374 413 405 
355 306 271 3061957 315 301 356 348 355 422 465 467 404 347 305 3361958 340 318 362 348 363 435 491 505 404 359 310 3371959 360 342 406 396 420 472 548 559 463 407 362 4051960 417 391 419 461 472 535 622 606 508 461 390 432$seasonal Jan Feb Mar Apr May Jun Jul Aug Sep1949 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021950 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021951 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021952 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021953 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021954 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021955 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021956 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021957 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021958 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021959 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021960 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.520202 Oct Nov Dec1949 -20.642677 -53.593434 -28.6199491950 -20.642677 -53.593434 -28.6199491951 -20.642677 -53.593434 -28.6199491952 -20.642677 -53.593434 -28.6199491953 -20.642677 -53.593434 -28.6199491954 -20.642677 -53.593434 -28.6199491955 -20.642677 -53.593434 -28.6199491956 -20.642677 -53.593434 -28.6199491957 -20.642677 -53.593434 -28.6199491958 -20.642677 -53.593434 -28.6199491959 -20.642677 -53.593434 -28.6199491960 -20.642677 -53.593434 -28.619949$trend Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov1949 NA NA NA NA NA NA 126.7917 127.2500 127.9583 128.5833 129.00001950 
131.2500 133.0833 134.9167 136.4167 137.4167 138.7500 140.9167 143.1667 145.7083 148.4167 151.54171951 157.1250 159.5417 161.8333 164.1250 166.6667 169.0833 171.2500 173.5833 175.4583 176.8333 178.04171952 183.1250 186.2083 189.0417 191.2917 193.5833 195.8333 198.0417 199.7500 202.2083 206.2500 210.41671953 215.8333 218.5000 220.9167 222.9167 224.0833 224.7083 225.3333 225.3333 224.9583 224.5833 224.45831954 228.0000 230.4583 232.2500 233.9167 235.6250 237.7500 240.5000 243.9583 247.1667 250.2500 253.50001955 261.8333 266.6667 271.1250 275.2083 278.5000 281.9583 285.7500 289.3333 293.2500 297.1667 301.00001956 309.9583 314.4167 318.6250 321.7500 324.5000 327.0833 329.5417 331.8333 334.4583 337.5417 340.54171957 348.2500 353.0000 357.6250 361.3750 364.5000 367.1667 369.4583 371.2083 372.1667 372.4167 372.75001958 375.2500 377.9167 379.5000 380.0000 380.7083 380.9583 381.8333 383.6667 386.5000 390.3333 394.70831959 402.5417 407.1667 411.8750 416.3333 420.5000 425.5000 430.7083 435.1250 437.7083 440.9583 445.83331960 456.3333 461.3750 465.2083 469.3333 472.7500 475.0417 NA NA NA NA NA Dec1949 129.75001950 154.70831951 180.16671952 213.37501953 225.54171954 257.12501955 305.45831956 344.08331957 373.62501958 398.62501959 450.62501960 NA$random Jan Feb Mar Apr May Jun Jul Aug1949 NA NA NA NA NA NA -42.6224747 -42.07323231950 8.4987374 29.1047980 8.3244949 6.6199495 -7.9103535 -25.1527778 -34.7474747 -35.98989901951 12.6237374 26.6464646 18.4078283 6.9116162 9.8396465 -26.4861111 -36.0808081 -37.40656571952 12.6237374 29.9797980 6.1994949 -2.2550505 -6.0770202 -13.2361111 -31.8724747 -20.57323231953 4.9154040 13.6881313 17.3244949 20.1199495 9.4229798 -17.1111111 -25.1641414 -16.15656571954 0.7487374 -6.2702020 4.9911616 1.1199495 2.8813131 -9.1527778 -2.3308081 -13.78156571955 4.9154040 2.5214646 -1.8838384 1.8282828 -3.9936869 -2.3611111 14.4191919 -5.15656571956 -1.2095960 -1.2285354 0.6161616 -0.7133838 -1.9936869 11.5138889 19.6275253 10.34343431957 -8.5012626 
-15.8118687 0.6161616 -5.3383838 -4.9936869 19.4305556 31.7108586 32.9684343
1958 -10.5012626 -23.7285354 -15.2588384 -23.9633838 -13.2020202 18.6388889 45.3358586 58.5101010
1959 -17.7929293 -28.9785354 -3.6338384 -12.2967172 4.0063131 11.0972222 53.4608586 61.0517677
1960 -14.5845960 -34.1868687 -43.9671717 -0.2967172 3.7563131 24.5555556 NA NA
     Sep Oct Nov Dec
1949 -8.4785354 11.0593434 28.5934343 16.8699495
1950 -4.2285354 5.2260101 16.0517677 13.9116162
1951 -7.9785354 5.8093434 21.5517677 14.4532828
1952 -9.7285354 5.3926768 15.1767677 9.2449495
1953 -4.4785354 7.0593434 9.1351010 4.0782828
1954 -4.6868687 -0.6073232 3.0934343 0.4949495
1955 2.2297980 -2.5239899 -10.4065657 1.1616162
1956 4.0214646 -10.8989899 -15.9482323 -9.4633838
1957 15.3131313 -4.7739899 -14.1565657 -9.0050505
1958 0.9797980 -10.6906566 -31.1148990 -33.0050505
1959 8.7714646 -13.3156566 -30.2398990 -17.0050505
1960 NA NA NA NA

$figure
 [1] -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.520202
[10] -20.642677 -53.593434 -28.619949

$type
[1] "additive"

attr(,"class")
[1] "decomposed.ts"

> plot(ts)

> # 4) Fit the time series model
> fit <- arima(AirPassengers, order=c(1,0,0), list(order=c(2,1,0), period=12))
> fit

Call:
arima(x = AirPassengers, order = c(1, 0, 0), seasonal = list(order = c(2, 1, 0), period = 12))

Coefficients:
         ar1     sar1    sar2
      0.9458  -0.1333  0.0821
s.e.  0.0284   0.1035  0.1078

sigma^2 estimated as 143.1: log likelihood = -516.18, aic = 1040.37

> # 5) Forecast the next 24 months
> predict <- predict(fit, n.ahead=24)
> predict
$pred
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
1961 445.0772 418.6286 451.3255 485.0739 496.9859 555.4025 641.1830 627.2158 528.6446 478.3612 410.0384
1962 463.4606 435.4701 463.6918 501.9637 511.8873 571.0617 657.1925 640.0611 540.7620 491.0499 419.6633
     Dec
1961 452.4290
1962 461.3783

$se
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov
1961 11.96267 16.46600 19.63824 22.09347 24.07871 25.72521 27.11359 28.29798 29.31703 30.19955 30.96776
1962 35.68346 38.94721 41.65083 43.92872 45.87078 47.54098 48.98693 50.24524 51.34481 52.30891 53.15659
     Dec
1961 31.63920
1962 53.90364

> # 6) visual
> ts.plot(AirPassengers, predict$pred, col=c(1,2,4,4), lty = c(1,1,2,2))
> # original data
> # long-term trend
> # seasonal pattern
> # random
> plot(ts)
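The `$se` values in the forecast output are the standard errors of the point forecasts in `$pred`, so an approximate 95% prediction interval can be sketched as `pred ± 1.96 * se`. The snippet below is a minimal sketch under the assumption that forecast errors are roughly normal. The names `fc`, `z`, `lower`, and `upper` are introduced here for illustration and are not part of the original post.

```r
# Refit the same seasonal ARIMA model as in step 4
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))

# Forecast 24 months ahead; storing the result in `fc` (a hypothetical name)
# avoids masking the predict() function
fc <- predict(fit, n.ahead = 24)

# Approximate 95% interval, assuming normally distributed forecast errors
z <- qnorm(0.975)
lower <- fc$pred - z * fc$se
upper <- fc$pred + z * fc$se

# Plot the observed series, the point forecasts, and the interval bounds
ts.plot(AirPassengers, fc$pred, lower, upper,
        col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", c("observed", "forecast", "approx. 95% interval"),
       col = c(1, 2, 4), lty = c(1, 1, 2))
```

With four series passed to `ts.plot`, each of the four `col` and `lty` entries maps to one line, so the interval bounds appear as dashed blue lines around the red forecast.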

62.823232 16.5202021951 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021952 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021953 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021954 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021955 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021956 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021957 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021958 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021959 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.5202021960 -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.520202 Oct Nov Dec1949 -20.642677 -53.593434 -28.6199491950 -20.642677 -53.593434 -28.6199491951 -20.642677 -53.593434 -28.6199491952 -20.642677 -53.593434 -28.6199491953 -20.642677 -53.593434 -28.6199491954 -20.642677 -53.593434 -28.6199491955 -20.642677 -53.593434 -28.6199491956 -20.642677 -53.593434 -28.6199491957 -20.642677 -53.593434 -28.6199491958 -20.642677 -53.593434 -28.6199491959 -20.642677 -53.593434 -28.6199491960 -20.642677 -53.593434 -28.619949$trend Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov1949 NA NA NA NA NA NA 126.7917 127.2500 127.9583 128.5833 129.00001950 131.2500 133.0833 134.9167 136.4167 137.4167 138.7500 140.9167 143.1667 145.7083 148.4167 151.54171951 157.1250 159.5417 161.8333 164.1250 166.6667 169.0833 171.2500 173.5833 175.4583 176.8333 178.04171952 183.1250 186.2083 189.0417 191.2917 193.5833 195.8333 198.0417 199.7500 202.2083 206.2500 210.41671953 215.8333 218.5000 220.9167 222.9167 224.0833 224.7083 225.3333 225.3333 224.9583 224.5833 224.45831954 228.0000 230.4583 232.2500 
233.9167 235.6250 237.7500 240.5000 243.9583 247.1667 250.2500 253.50001955 261.8333 266.6667 271.1250 275.2083 278.5000 281.9583 285.7500 289.3333 293.2500 297.1667 301.00001956 309.9583 314.4167 318.6250 321.7500 324.5000 327.0833 329.5417 331.8333 334.4583 337.5417 340.54171957 348.2500 353.0000 357.6250 361.3750 364.5000 367.1667 369.4583 371.2083 372.1667 372.4167 372.75001958 375.2500 377.9167 379.5000 380.0000 380.7083 380.9583 381.8333 383.6667 386.5000 390.3333 394.70831959 402.5417 407.1667 411.8750 416.3333 420.5000 425.5000 430.7083 435.1250 437.7083 440.9583 445.83331960 456.3333 461.3750 465.2083 469.3333 472.7500 475.0417 NA NA NA NA NA Dec1949 129.75001950 154.70831951 180.16671952 213.37501953 225.54171954 257.12501955 305.45831956 344.08331957 373.62501958 398.62501959 450.62501960 NA$random Jan Feb Mar Apr May Jun Jul Aug1949 NA NA NA NA NA NA -42.6224747 -42.07323231950 8.4987374 29.1047980 8.3244949 6.6199495 -7.9103535 -25.1527778 -34.7474747 -35.98989901951 12.6237374 26.6464646 18.4078283 6.9116162 9.8396465 -26.4861111 -36.0808081 -37.40656571952 12.6237374 29.9797980 6.1994949 -2.2550505 -6.0770202 -13.2361111 -31.8724747 -20.57323231953 4.9154040 13.6881313 17.3244949 20.1199495 9.4229798 -17.1111111 -25.1641414 -16.15656571954 0.7487374 -6.2702020 4.9911616 1.1199495 2.8813131 -9.1527778 -2.3308081 -13.78156571955 4.9154040 2.5214646 -1.8838384 1.8282828 -3.9936869 -2.3611111 14.4191919 -5.15656571956 -1.2095960 -1.2285354 0.6161616 -0.7133838 -1.9936869 11.5138889 19.6275253 10.34343431957 -8.5012626 -15.8118687 0.6161616 -5.3383838 -4.9936869 19.4305556 31.7108586 32.96843431958 -10.5012626 -23.7285354 -15.2588384 -23.9633838 -13.2020202 18.6388889 45.3358586 58.51010101959 -17.7929293 -28.9785354 -3.6338384 -12.2967172 4.0063131 11.0972222 53.4608586 61.05176771960 -14.5845960 -34.1868687 -43.9671717 -0.2967172 3.7563131 24.5555556 NA NA Sep Oct Nov Dec1949 -8.4785354 11.0593434 28.5934343 16.86994951950 -4.2285354 5.2260101 
16.0517677 13.91161621951 -7.9785354 5.8093434 21.5517677 14.45328281952 -9.7285354 5.3926768 15.1767677 9.24494951953 -4.4785354 7.0593434 9.1351010 4.07828281954 -4.6868687 -0.6073232 3.0934343 0.49494951955 2.2297980 -2.5239899 -10.4065657 1.16161621956 4.0214646 -10.8989899 -15.9482323 -9.46338381957 15.3131313 -4.7739899 -14.1565657 -9.00505051958 0.9797980 -10.6906566 -31.1148990 -33.00505051959 8.7714646 -13.3156566 -30.2398990 -17.00505051960 NA NA NA NA$figure [1] -24.748737 -36.188131 -2.241162 -8.036616 -4.506313 35.402778 63.830808 62.823232 16.520202[10] -20.642677 -53.593434 -28.619949$type[1] "additive"attr(,"class")[1] "decomposed.ts"> # 4) 시계열 분석 수행> fit <- arima(AirPassengers, order=c(1,0,0), list(order=c(2,1,0), period=12))> fitCall:arima(x = AirPassengers, order = c(1, 0, 0), seasonal = list(order = c(2, 1, 0), period = 12))Coefficients: ar1 sar1 sar2 0.9458 -0.1333 0.0821s.e. 0.0284 0.1035 0.1078sigma^2 estimated as 143.1: log likelihood = -516.18, aic = 1040.37> # 5) 시계열 예측 수행 > predict <- predict(fit, n.ahead=24)> predict$pred Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov1961 445.0772 418.6286 451.3255 485.0739 496.9859 555.4025 641.1830 627.2158 528.6446 478.3612 410.03841962 463.4606 435.4701 463.6918 501.9637 511.8873 571.0617 657.1925 640.0611 540.7620 491.0499 419.6633 Dec1961 452.42901962 461.3783$se Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov1961 11.96267 16.46600 19.63824 22.09347 24.07871 25.72521 27.11359 28.29798 29.31703 30.19955 30.967761962 35.68346 38.94721 41.65083 43.92872 45.87078 47.54098 48.98693 50.24524 51.34481 52.30891 53.15659 Dec1961 31.639201962 53.90364> # 6) visual> ts.plot(AirPassengers, predict$pred, col=c(1,2,4,4), lty = c(1,1,2,2))> # 기존 데이터> # 장기 트렌드> # 시즌 패턴> # 랜덤> plot(ts)>
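The forecast output holds both point forecasts ($pred) and standard errors ($se), but the plotting call only draws the point forecasts, even though it passes four colors and line types. One possible reading is that interval bounds were meant to be drawn as well. The following is a minimal sketch under that assumption: it builds an approximate 95% interval with a normal approximation (pred ± 1.96 × se), which is not stated in the original post. The names fc, z, upper and lower are introduced here for illustration only.

```r
# Refit the seasonal ARIMA model used in the post: ARIMA(1,0,0)(2,1,0)[12]
fit <- arima(AirPassengers, order = c(1, 0, 0),
             seasonal = list(order = c(2, 1, 0), period = 12))

# Forecast 24 months ahead; the result holds point forecasts ($pred)
# and their standard errors ($se)
fc <- predict(fit, n.ahead = 24)

# Approximate 95% interval (assumption: forecast errors are roughly normal)
z <- qnorm(0.975)
upper <- fc$pred + z * fc$se
lower <- fc$pred - z * fc$se

# Observed series (black), forecast (red), interval bounds (blue, dashed)
ts.plot(AirPassengers, fc$pred, upper, lower,
        col = c(1, 2, 4, 4), lty = c(1, 1, 2, 2))
legend("topleft", c("Actual", "Forecast", "95% interval"),
       col = c(1, 2, 4), lty = c(1, 1, 2))
```

Because the standard errors grow with the forecast horizon (from about 12 in January 1961 to about 54 in December 1962 in the output above), the band widens over the two forecast years.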