SAX time series representation in R using the jmotif package
Question:
I want to represent some time series with SAX so that I can mine them for similarities. I am using the jmotif package in R:
#Create an example dataframe
example1 <- data.frame(flow=c(1.1,2.2,3.3,4.4,5.5,6.6),
weight1=c(7.1,7.2,7.3,7.4,7.5,7.6),
weight2=c(8.1,8.2,8.3,8.4,8.5,8.6))
# Create a timeseries object
examplets1 <- ts(example1, start = 1, end = 6)
#Analysis
library(jmotif)
#Normalise the data using Znorm
examplezn <- znorm(examplets1, threshold = 0.01)
#Perform piecewise aggregate approximation
examplepaa <- paa(examplezn, 3)
#Represent time series as SAX
sax_via_window(examplepaa, 3, 3, 10, "mindist", 0.1)
#This produces the result
> sax_via_window(examplepaa, 3, 3, 10, "mindist", 0.1)
$`0`
[1] "bgh"
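I cannot interpret these results. I expected a symbolic representation that I could associate with each column, for example flow: acc, weight1: bgh, and so on. The real data set will have about 100 columns of ts data!
Am I applying the method incorrectly?
Any help is greatly appreciated.
Answer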
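The problem here is that I did not "vectorize" jmotif, so its functions only work when the input time series is an ordered sequence of numbers (a plain numeric vector), not a data frame object or a time series object. That choice is arguable, but I just wanted to keep things simple.
I have modified bits of your code to perform the task; I hope it helps: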
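To make the distinction concrete, here is a small base-R sketch (no jmotif calls) of what the question passes in versus what a per-vector function expects. The `center` helper is a hypothetical stand-in for any function that takes one numeric vector; it is not part of jmotif.

```r
# Shortened version of the question's data: a two-column data frame
example1 <- data.frame(flow    = c(1.1, 2.2, 3.3),
                       weight1 = c(7.1, 7.2, 7.3))

# ts() on a multi-column data frame gives a multivariate "mts" object,
# i.e. a matrix, not a single ordered sequence of numbers
examplets1 <- ts(example1)
is_multivariate <- inherits(examplets1, "mts")

# Each column on its own is a plain numeric vector
column_is_numeric <- is.numeric(example1$flow)

# Hypothetical stand-in for a function that accepts one numeric vector
center <- function(x) x - mean(x)

# A data frame is a list of columns, so lapply() calls the function once per column
per_column <- lapply(example1, center)
```

The result `per_column` is a named list with one entry per column, which is the same shape of processing the modified code relies on.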
library(jmotif)
# create an example data set; a list works best because the library is not "vectorized"
example1 <- list(flow = c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9),
weight1 = c(7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 8.8, 9.9),
weight2 = c(8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8, 8.9))
# this library makes working with not-vectorized code easier
library(plyr)
# z-normalize
examplezn <- llply(example1, function(x){znorm(x, threshold = 0.01)})
# perform piecewise aggregate approximation, probably not needed for following up with SAX transform, so just for illustration ...
llply(examplezn, function(x){paa(x, 3)})
# represent time series as SAX strings using via window SAX transform
example_sax <- llply(example1, function(x){sax_via_window(x, 3, 2, 3, "none", 0.1)})
# convert the result to a data frame, by rows though
df_by_row <- ldply(example_sax, unlist)
# and finally obtain a column-oriented data frame
df_by_column <- as.data.frame(t(df_by_row))
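
If one SAX word per column is what is wanted (as the question expects), the idea can be sketched in base R without the sliding window. This is only an illustration of the usual SAX steps (z-normalize, PAA, map segment means to letters with equiprobable Gaussian cut points). `sax_word` is a hypothetical helper written for this example, not jmotif's implementation, and it assumes the series length is a multiple of `paa_size`.

```r
# Hypothetical helper: whole-series SAX word for one numeric vector
sax_word <- function(x, paa_size, a_size, threshold = 0.01) {
  # 1. z-normalize; in this sketch a nearly constant series is left as-is
  s <- sd(x)
  z <- if (s < threshold) x else (x - mean(x)) / s
  # 2. PAA: mean of each of paa_size consecutive, equally long segments
  #    (assumes length(x) is a multiple of paa_size)
  segments <- colMeans(matrix(z, ncol = paa_size))
  # 3. letters from cut points that split N(0, 1) into a_size equal-probability bins
  cuts <- qnorm(seq_len(a_size - 1) / a_size)
  paste(letters[findInterval(segments, cuts) + 1], collapse = "")
}

# The data frame from the question
example1 <- data.frame(flow    = c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6),
                       weight1 = c(7.1, 7.2, 7.3, 7.4, 7.5, 7.6),
                       weight2 = c(8.1, 8.2, 8.3, 8.4, 8.5, 8.6))

# One word per column, returned as a named character vector
words <- sapply(example1, sax_word, paa_size = 3, a_size = 3)
```

With this data every column comes out as "abc": all three series are straight ramps, and z-normalization removes their offset and scale, so their shapes are identical. Columns with genuinely different shapes would get different words.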