Title: | Functional Change Point Detection and Analysis |
---|---|
Description: | Analyze functional data and its change points. Includes functionality to store and process data, summarize and validate assumptions, characterize and perform inference of change points, and provide visualizations. Data is stored as discretely collected observations without requiring the selection of basis functions. For more details see chapter 8 of Horvath and Rice (2024) <doi:10.1007/978-3-031-51609-2>. Additional papers are forthcoming. Focused works are also included in the documentation of corresponding functions. |
Authors: | Jeremy VanderDoes [aut, cre, cph]
|
Maintainer: | Jeremy VanderDoes <[email protected]> |
License: | GPL (>= 3) |
Version: | 2.0.0 |
Built: | 2025-03-29 08:26:35 UTC |
Source: | https://github.com/jrvanderdoes/fchange |
Extract or replace subsets of dfts objects.
## S3 method for class 'dfts' x[i, j, ...] ## S3 replacement method for class 'dfts' x[i, j] <- value
## S3 method for class 'dfts' x[i, j, ...] ## S3 replacement method for class 'dfts' x[i, j] <- value
x |
A dfts object. See |
i , j
|
Numerics for elements to extract. |
... |
Additional parameters from generic function for extensions. |
value |
A suitable replacement value for selection. |
A dfts object.
electricity[1:3] electricity[1:3,] electricity[1:2,1:4] electricity[,1:4] tmp <- dfts(matrix(1:9,3,3)) tmp$data tmp[1,1] <- 10 tmp$data
electricity[1:3] electricity[1:3,] electricity[1:2,1:4] electricity[,1:4] tmp <- dfts(matrix(1:9,3,3)) tmp$data tmp[1,1] <- 10 tmp$data
This function computes the ACF/PACF of data. This can be applied on traditional
scalar time series or functional time series defined in dfts()
.
acf(x, lag.max = NULL, ...) ## Default S3 method: acf(x, lag.max = NULL, ...) pacf(x, lag.max = NULL, ...) ## Default S3 method: pacf(x, lag.max = NULL, ...) ## S3 method for class 'dfts' acf( x, lag.max = NULL, alpha = 0.05, method = c("Welch", "MC", "Imhof"), WWN = TRUE, figure = TRUE, ... ) ## S3 method for class 'dfts' pacf(x, lag.max = NULL, n_pcs = NULL, alpha = 0.95, figure = TRUE, ...)
acf(x, lag.max = NULL, ...) ## Default S3 method: acf(x, lag.max = NULL, ...) pacf(x, lag.max = NULL, ...) ## Default S3 method: pacf(x, lag.max = NULL, ...) ## S3 method for class 'dfts' acf( x, lag.max = NULL, alpha = 0.05, method = c("Welch", "MC", "Imhof"), WWN = TRUE, figure = TRUE, ... ) ## S3 method for class 'dfts' pacf(x, lag.max = NULL, n_pcs = NULL, alpha = 0.95, figure = TRUE, ...)
x |
Object for computation of (partial) autocorrelation function
(see |
lag.max |
Number of lagged covariance estimators for the time series that will be used to estimate the (partial) autocorrelation function. |
... |
Further arguments passed to the |
alpha |
A value between 0 and 1 that indicates significant level for
the confidence interval for the i.i.d. bounds of the (partial) autocorrelation
function. By default |
method |
Character specifying the method to be used when estimating the distribution under the hypothesis of functional white noise. Accepted values are:
By default, |
WWN |
Logical. If |
figure |
Logical. If |
n_pcs |
Number of principal components that will be used to fit the ARH(p) models. |
List with ACF or PACF values and plots
acfs/pacfs
: Autocorrelation values for
each lag of the functional time series.
SWN_bound
: The upper prediction
bound for the i.i.d. distribution under strong white noise assumption.
WWN_bound
: The upper prediction
bound for the i.i.d. distribution under weak white noise assumption.
plot
: Plot of autocorrelation values for
each lag of the functional time series.
Mestre G., Portela J., Rice G., Munoz San Roque A., Alonso E. (2021). Functional time series model identification and diagnosis by means of auto- and partial autocorrelation analysis. Computational Statistics & Data Analysis, 155, 107108.
Mestre, G., Portela, J., Munoz San Roque, A., Alonso, E. (2020). Forecasting hourly supply curves in the Italian Day-Ahead electricity market with a double-seasonal SARMAHX model. International Journal of Electrical Power & Energy Systems, 121, 106083.
Kokoszka, P., Rice, G., Shang, H.L. (2017). Inference for the autocovariance of a functional time series under conditional heteroscedasticity Journal of Multivariate Analysis, 162, 32–50.
acf(1:10) x <- generate_brownian_bridge(100, seq(0,1,length.out=20)) acf(x,20) x <- generate_brownian_bridge(100, seq(0,1,length.out=20)) pacf(x,lag.max = 10, n_pcs = 2)
acf(1:10) x <- generate_brownian_bridge(100, seq(0,1,length.out=20)) acf(x,20) x <- generate_brownian_bridge(100, seq(0,1,length.out=20)) pacf(x,lag.max = 10, n_pcs = 2)
Computes the data-driven bandwidth using a method based on the spectral density operator which adapts to the functional data.
adaptive_bandwidth( X, kernel = bartlett_kernel, name = NULL, order = NULL, weighting = NULL )
adaptive_bandwidth( X, kernel = bartlett_kernel, name = NULL, order = NULL, weighting = NULL )
X |
A dfts object or data which can be automatically converted to that
format. See |
kernel |
Kernel function. No additional parameters are needed for
|
name , order , weighting
|
Additional parameters if non-standard kernels, e.g.
those not in fChange, are used. See references for the definitions. Name is
extracted from the kernel name to select |
Scalar value of the data-adapted bandwidth.
Rice, G., & Shang, H. L. (2017). A Plug‐in Bandwidth Selection Procedure for Long‐Run Covariance Estimation with Stationary Functional Time Series. Journal of Time Series Analysis, 38(4), 591–609.
bartlett_kernel()
, truncated_kernel()
, parzen_kernel()
,
tukey_hanning_kernel()
, quadratic_spectral_kernel()
, daniell_kernel()
,
flat_top_kernel()
adaptive_bandwidth(generate_brownian_motion(100)) adaptive_bandwidth(electricity, parzen_kernel)
adaptive_bandwidth(generate_brownian_motion(100)) adaptive_bandwidth(electricity, parzen_kernel)
Obtain the empirical autocorrelation function for the given lags
of a
functional time series, X
. Given a functional time series, the sample
autocovariance functions are given by:
where
denotes the sample mean function and
is the lag parameter. The
autocorrelation functions are defined over the range
by
normalizing these functions using the factor
.
autocorrelation(X, lags)
autocorrelation(X, lags)
X |
A dfts object or data which can be automatically converted to that
format. See |
lags |
Numeric(s) for the lags to estimate the lagged operator. |
Return a list or data.frame with the lagged autocorrelation function(s)
estimated from the data. Each function is given by a x
matrix, where
is the number of points observed in each curve.
N <- 100 v <- seq(from = 0, to = 1, length.out = 10) bbridge <- generate_brownian_bridge(N = N, v = v) lagged_autocor <- autocorrelation(X = bbridge, lags = 0:1)
N <- 100 v <- seq(from = 0, to = 1, length.out = 10) bbridge <- generate_brownian_bridge(N = N, v = v) lagged_autocor <- autocorrelation(X = bbridge, lags = 0:1)
Obtain the empirical autocovariance function for the given lags
of a
functional time series, X
. Given a functional time series, the sample
autocovariance functions are given by:
where
denotes the sample mean function and
is the lag parameter.
autocovariance(X, lags = 0:1, center = TRUE)
autocovariance(X, lags = 0:1, center = TRUE)
X |
A dfts object or data which can be automatically converted to that
format. See |
lags |
Numeric(s) for the lags to estimate the lagged operator. |
center |
Boolean if the data should be centered. Default is true. |
Return a list or data.frame with the lagged autocovariance function(s)
estimated from the data. Each function is given by a x
matrix, where
is the number of points observed in each curve.
v <- seq(0,1,length.out=20) lagged_autocov <- autocovariance( X = generate_brownian_bridge(100,v=v), lags = 1)
v <- seq(0,1,length.out=20) lagged_autocov <- autocovariance( X = generate_brownian_bridge(100,v=v), lags = 1)
Compute the pointwise "average" values for dfts objects such as mean and median.
## S3 method for class 'dfts' mean(x, na.rm = TRUE, ...) ## S3 method for class 'dfts' median(x, na.rm = TRUE, ...)
## S3 method for class 'dfts' mean(x, na.rm = TRUE, ...) ## S3 method for class 'dfts' median(x, na.rm = TRUE, ...)
x |
A dfts object or data which can be automatically converted to that
format. See |
na.rm |
Boolean if NA values should be removed. Defaults to TRUE. |
... |
Additional parameters to pass to base R's |
Numeric vector
results <- mean(electricity) results <- median(electricity)
results <- mean(electricity) results <- median(electricity)
Percentage of cause-specific deaths out of total deaths for female breast cancer in the United States from 1950 to 2021.
cancer
cancer
cancer
A data.frame with columns being the year and rows the age groups (5 years).
Center data by removing the mean or median. Defining changes allow for regional centering.
center(object, changes = NULL, type = "mean", ...) ## Default S3 method: center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'data.frame' center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'matrix' center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'dfts' center(object, changes = NULL, type = "mean", ...)
center(object, changes = NULL, type = "mean", ...) ## Default S3 method: center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'data.frame' center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'matrix' center(object, changes = NULL, type = "mean", ...) ## S3 method for class 'dfts' center(object, changes = NULL, type = "mean", ...)
object |
Object for computation of centering. |
changes |
Change points for centering individual sections. |
type |
String of |
... |
Parameters that may be fed into other versions of centering. |
Centered data of the same format as the given data.
center.default()
, center.data.frame()
,
center.matrix()
, center.dfts()
center(1:10)
center(1:10)
Compute confidence intervals for the data based on some changes. The current version is tuned to mean changes.
confidence_interval( X, changes, K = bartlett_kernel, h = 2 * ncol(X)^(1/5), weighting = 0.5, M = 5000, alpha = 0.1, method = "distribution" )
confidence_interval( X, changes, K = bartlett_kernel, h = 2 * ncol(X)^(1/5), weighting = 0.5, M = 5000, alpha = 0.1, method = "distribution" )
X |
A dfts object or data which can be automatically converted to that
format. See |
changes |
Numeric vector for detected change points. |
K |
Function for the Kernel. Default is bartlett_kernel. |
h |
Numeric for bandwidth in computation of long run variance. The default
is |
weighting |
Weighting for the interval computation, value in [0,1]. Default is 0.5. |
M |
Numeric for the number of Brownian motion simulations in computation of the confidence interval. Default is 1000. |
alpha |
Numeric for the significance level, in [0,1]. Default is 0.1. |
method |
String to indicate the method for computing the percentiles used in the confidence intervals. The options are 'distribution' and 'simulation'. Default is 'distribution'. |
Data.frame with the first column for the changes, second for the lower bounds of confidence intervals, and the third for the upper bounds of confidence intervals.
Horvath, L., & Rice, G. (2024). Change Point Analysis for Time Series (First edition.). Springer.
Aue, A., Rice, G., & Sonmez, O. (2018). Detecting and dating structural breaks in functional data without dimension reduction. Journal of the Royal Statistical Society. Series B, Statistical Methodology, 80(3), 509-529.
X <- cbind(generate_brownian_motion(100,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+1000) confidence_interval(X,changes = 100) confidence_interval(X,changes=100,method = 'simulation') X <- cbind(generate_brownian_motion(100,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+0.5) confidence_interval(X,100,alpha = 0.1) confidence_interval(X,changes=100,alpha = 0.1,method = 'simulation') X <- generate_brownian_motion(200,v=seq(0,1,0.05)) confidence_interval(X,100) confidence_interval(X,100,method = 'simulation') X <- cbind(generate_brownian_motion(200,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+0.1, generate_brownian_motion(150,v=seq(0,1,0.05))$data-0.05) confidence_interval(X,c(200,300)) confidence_interval(X = electricity, changes = c(64, 120),alpha = 0.1)
X <- cbind(generate_brownian_motion(100,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+1000) confidence_interval(X,changes = 100) confidence_interval(X,changes=100,method = 'simulation') X <- cbind(generate_brownian_motion(100,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+0.5) confidence_interval(X,100,alpha = 0.1) confidence_interval(X,changes=100,alpha = 0.1,method = 'simulation') X <- generate_brownian_motion(200,v=seq(0,1,0.05)) confidence_interval(X,100) confidence_interval(X,100,method = 'simulation') X <- cbind(generate_brownian_motion(200,v=seq(0,1,0.05))$data, generate_brownian_motion(100,v=seq(0,1,0.05))$data+0.1, generate_brownian_motion(150,v=seq(0,1,0.05))$data-0.05) confidence_interval(X,c(200,300)) confidence_interval(X = electricity, changes = c(64, 120),alpha = 0.1)
The discrete functional time series (dfts) object is the main object in fChange. It stores functional data for use in functions throughout the package. Common functions have been extended to dfts. Details of the storage is best left to individual parameters descriptions and exploring examples.
dfts(X, name = NULL, labels = NULL, fparam = NULL, inc.warnings = TRUE) as.dfts(X) is.dfts(X)
dfts(X, name = NULL, labels = NULL, fparam = NULL, inc.warnings = TRUE) as.dfts(X) is.dfts(X)
X |
Data to convert into dfts object. Options include: data.frame, matrix, array, fda::fd, fda.usc::fdata, rainbow::fts (used in ftsa), rainbow::fds (used in ftsa), funData::funData, and dfts. For a matrix, each column is a unique observation, at the rows are the observed intra-observation (i.e. resolution) points. |
name |
String for the name of the object. Defaults to the name of the input variable. |
labels |
Labels for the observations. Defaults to the column names or names inside of the object X. |
fparam |
Vector of numerics indicating the points of evaluation for each observation. Defaults to even spacing on [0,1], or those included in the object. These may be unevenly spaced. |
inc.warnings |
Boolean on if warnings should be given. Defaults to TRUE. |
dfts / as.dfts: dfts object
is.dfts: Boolean indicating if x
is a dfts object or not
bm <- dfts(generate_brownian_motion(100, c(0,0.1,0.25,0.5,1))) result <- dfts(electricity) result <- as.dfts(electricity) result <- is.dfts(electricity)
bm <- dfts(generate_brownian_motion(100, c(0,0.1,0.25,0.5,1))) result <- dfts(electricity) result <- as.dfts(electricity) result <- is.dfts(electricity)
Group generic methods defined for things like Math, Ops, and so forth.
## S3 method for class 'dfts' Math(x, ...) ## S3 method for class 'dfts' Ops(e1, e2) ## S3 method for class 'dfts' cumsum(x)
## S3 method for class 'dfts' Math(x, ...) ## S3 method for class 'dfts' Ops(e1, e2) ## S3 method for class 'dfts' cumsum(x)
x , e1 , e2
|
A dfts object. See |
... |
Further arguments passed to the methods. |
A dfts object with the applied operation
dfts object with data as cumsum
result <- sqrt( electricity ) result <- electricity + electricity result1 <- electricity * electricity cumsum(electricity)
result <- sqrt( electricity ) result <- electricity + electricity result1 <- electricity * electricity cumsum(electricity)
Difference the functional data at some lag and iteration. For the
'th difference at lag
, the differenced series is
defined as
where
is the backshift operator.
## S3 method for class 'dfts' diff(x, lag = 1L, differences = 1L, ...)
## S3 method for class 'dfts' diff(x, lag = 1L, differences = 1L, ...)
x |
A dfts object or data which can be automatically converted to that
format. See |
lag |
An integer indicating which lag to use. |
differences |
An integer indicating the order of the difference. |
... |
Further arguments to be passed to methods. |
A dfts object with the differenced values.
result <- diff(electricity, lag=1) result1 <- diff(electricity, differences=2)
result <- diff(electricity, lag=1) result1 <- diff(electricity, differences=2)
Retrieve the dimension of a dfts object.
## S3 method for class 'dfts' dim(x, ...)
## S3 method for class 'dfts' dim(x, ...)
x |
A dfts object or data which can be automatically converted to that
format. See |
... |
Additional parameters to pass to base R's |
Numerics indicating the dimension of the dfts object.
results <- dim(electricity)
results <- dim(electricity)
The hourly electricity spot prices from Spain in 2014.
electricity
electricity
electricity
A dfts object.
<www.omie.es>
Change point detection for dfts objects. Various change point methods are given, where single or multiple changes can be detected. Multiple change extensions currently include binary segmentation and elbow plots.
fchange( X, method = c("characteristic", "mean", "robustmean", "eigenjoint", "eigensingle", "trace", "covariance", "projmean", "projdistribution"), statistic = c("Tn", "Mn"), critical = c("simulation", "resample", "welch"), type = c("single", "segmentation", "elbow"), resample_blocks = "separate", replace = TRUE, max_changes = min(ncol(X), 20), changes = NULL, blocksize = 2 * ncol(X)^(1/5), eigen_number = 3, h = 2 * ncol(X)^(1/5), M = 1000, J = 50, W = space_measuring_functions(X = X, M = 20, space = "BM"), K = bartlett_kernel, alpha = 0.05, cov.res = 30, weighting = 1/4, TVE = 0.95, trim_function = function(X) { 0 }, errors = "L2", recommendation_change_points = 2, recommendation_improvement = 0.15, silent.binary = FALSE )
fchange( X, method = c("characteristic", "mean", "robustmean", "eigenjoint", "eigensingle", "trace", "covariance", "projmean", "projdistribution"), statistic = c("Tn", "Mn"), critical = c("simulation", "resample", "welch"), type = c("single", "segmentation", "elbow"), resample_blocks = "separate", replace = TRUE, max_changes = min(ncol(X), 20), changes = NULL, blocksize = 2 * ncol(X)^(1/5), eigen_number = 3, h = 2 * ncol(X)^(1/5), M = 1000, J = 50, W = space_measuring_functions(X = X, M = 20, space = "BM"), K = bartlett_kernel, alpha = 0.05, cov.res = 30, weighting = 1/4, TVE = 0.95, trim_function = function(X) { 0 }, errors = "L2", recommendation_change_points = 2, recommendation_improvement = 0.15, silent.binary = FALSE )
X |
A dfts object or data which can be automatically converted to that
format. See |
method |
Method to compute change point. Options include: 'characteristic', 'mean', 'robustmean', 'eigenjoint', 'eigensingle', 'trace', 'covariance', 'projmean', and 'projdistribution'. |
statistic |
String for the test statistic type: integrated, |
critical |
String for method of computing threshold. Options are 'simulation', 'resample', and 'welch'. Not all ways to compute the critical thresholds are implemented for every method. |
type |
String for the type of change point detection, single change ('single'), binary segmentation ('segmentation'), or elbow plots ('elbow'). |
resample_blocks |
String indicating the type of resample test to use.
Using |
replace |
Boolean for using a permutation or bootstrapped statistic when
|
max_changes |
Integer as the max number of changes to search when using
type is |
changes |
Vector of change points to be given to the eigen test if the data should be centered on these values first. |
blocksize |
Integer for the width of the blocks when using a resampling
test. Can use |
eigen_number |
Which eigenvalue or the number of eigenvalues which should be checked in the eigenvalue tests. |
h |
Number of lags used when computing long run covariance estimates. Used in mean, characteristic, and eigenvalue tests. |
M |
Number of simulations or permutations for critical values |
J |
Resolution (J) in the characteristic method. The number of vectors
is defined by |
W |
Space measuring functions used in characteristic method to explore the functional space. |
K |
Kernel function for use in characteristic, mean, eigen, covariance and projmean. |
alpha |
Significance in [0,1] for Welch approximation. |
cov.res |
Resolution to use when computing covariance kernel changes. |
weighting |
Weights used in covariance kernel method and pcadistribution. |
TVE |
Total variance explained for projmean and projdistribution. |
trim_function |
Trimming to be used in multiple change methods. |
errors |
Type of errors used in elbow plot. Options are L2 and Trace. |
recommendation_change_points |
Number of lags forward to examine in deciding automated elbow plot recommendation. |
recommendation_improvement |
Significant drop to look for in deciding automated elbow plot recommendation. |
silent.binary |
Boolean if output should be printed when running binary segmentation. |
When type is single, returns a list:
pvalue: p-value for detection of a change point.
location: location of the most likely change.
When type is elbow:
information: data.frame with the information on each change and the decrease in variability.
plots: list of plots showing the variability decrease or improvement
suggestion: list with plot and algorithmic change suggestion. The suggested changes are also returned.
When type is segmentation a data.frame with the locations and p-values is returned.
Aue, A., Rice, G., & Sonmez, O. (2018). Detecting and dating structural breaks in functional data without dimension reduction. Journal of the Royal Statistical Society. Series B, Statistical Methodology, 80(3), 509-529.
Wegner, L., Wendler, M. Robust change-point detection for functional time series based on U-statistics and dependent wild bootstrap. Stat Papers (2024).
Aue, A., Rice, G., & Sonmez, O. (2020). Structural break analysis for spectrum and trace of covariance operators. Environmetrics (London, Ont.), 31(1)
Horvath, L., Rice, G., & Zhao, Y. (2022). Change point analysis of covariance functions: A weighted cumulative sum approach. Journal of Multivariate Analysis, 189, 104877-.
Berkes, I., Gabrys, R.,Horvath, L. & P. Kokoszka (2009)., Detecting changes in the mean of functional observations Journal of the Royal Statistical Society, Series B 71, 927-946
Aue, A., Gabrys, R.,Horvath, L. & P. Kokoszka (2009)., Estimation of a change-point in the mean function of functional data Journal of Multivariate Analysis 100, 2254-2269.
Huskova, M., & Meintanis, S.G. (2006). Change Point Analysis based on Empirical Characteristic Functions. Metrika, 63, 145-168.
res <- fchange(electricity$data[,1:20],method='characteristic',critical = 'welch')
res <- fchange(electricity$data[,1:20],method='characteristic',critical = 'welch')
Generate a functional time series from an iid Brownian Bridge process.
If is a Wiener process, the Brownian Bridge is
defined as
. Each functional observation is discretized on
the points indicated in
v
.
generate_brownian_bridge(N, v = 30, sd = 1)
generate_brownian_bridge(N, v = 30, sd = 1)
N |
Numeric. The number of observations for the generated data. |
v |
Numeric (vector). Discretization points of the curves.This can be the evaluated points or the number of evenly spaced points on [0,1]. By default it is evenly spaced on [0,1] with 30 points. |
sd |
Numeric. Standard deviation of the Brownian Motion process.
The default is |
Functional time series (dfts) object.
bbridge <- generate_brownian_bridge(N=100, v=c(0,0.2,0.6,1,1.3), sd=2) bbridge <- generate_brownian_bridge(N=100, v=10, sd=1)
bbridge <- generate_brownian_bridge(N=100, v=c(0,0.2,0.6,1,1.3), sd=2) bbridge <- generate_brownian_bridge(N=100, v=10, sd=1)
Generate a functional time series according to an iid Brownian Motion process.
Each observation is discretized on the points indicated in v
.
generate_brownian_motion(N, v = 30, sd = 1)
generate_brownian_motion(N, v = 30, sd = 1)
N |
Numeric. The number of observations for the generated data. |
v |
Numeric (vector). Discretization points of the curves.This can be the evaluated points or the number of evenly spaced points on [0,1]. By default it is evenly spaced on [0,1] with 30 points. |
sd |
Numeric. Standard deviation of the Brownian Motion process.
The default is |
Functional time series (dfts) object.
bmotion <- generate_brownian_motion(N=100, v=c(0,0.25,0.4,0.7, 1, 1.5), sd = 1) bmotion1 <- generate_brownian_motion(N=100, v=10, sd = 2)
bmotion <- generate_brownian_motion(N=100, v=c(0,0.25,0.4,0.7, 1, 1.5), sd = 1) bmotion1 <- generate_brownian_motion(N=100, v=10, sd = 2)
A general wrapper function to allow generation of functional data according
to several approaches: bbridge
, bmotion
, kl
, ou
,
and far1
.
generate_data(fparam, data_details, burnin = 100)
generate_data(fparam, data_details, burnin = 100)
fparam |
fparam of data (or resolution that will be equally spaced on [0,1]). |
data_details |
List of named lists indicating parameters for each data group. Each process can use different parameters, given below.
|
burnin |
Numeric for amount of burnin for data. Only used for the first groups. Subsequent groups begin at the end of the last group. |
A dfts object for the generated data.
generate_brownian_bridge()
, generate_brownian_motion()
,
generate_far1()
, generate_karhunen_loeve()
,
generate_ornstein_uhlenbeck()
result <- generate_data( fparam=15, data_details =list('bmotion'=list('N'=100, 'sd'=1), 'bbridge'=list('N'=100, 'sd'=1), 'bbridge'=list('N'=100, 'sd'=1), 'kl'=list('N'=100, 'distribution'='Normal', 'eigenvalues'=1/1:4, 'mean'=0, 'dependence'=0, 'basis'=fda::create.bspline.basis(), 'sd'=1), 'ou'=list('N'=100, 'dependence'=0 ) , 'far1'=list('N'=100, 'dependence'=0, 'sd'=1,'vary'=FALSE) ) )
result <- generate_data( fparam=15, data_details =list('bmotion'=list('N'=100, 'sd'=1), 'bbridge'=list('N'=100, 'sd'=1), 'bbridge'=list('N'=100, 'sd'=1), 'kl'=list('N'=100, 'distribution'='Normal', 'eigenvalues'=1/1:4, 'mean'=0, 'dependence'=0, 'basis'=fda::create.bspline.basis(), 'sd'=1), 'ou'=list('N'=100, 'dependence'=0 ) , 'far1'=list('N'=100, 'dependence'=0, 'sd'=1,'vary'=FALSE) ) )
Function to generate data according to FAR(1) process.
generate_far1(N, resolution, sd = 1, dependence = 1/2, drop_first = FALSE)
generate_far1(N, resolution, sd = 1, dependence = 1/2, drop_first = FALSE)
N |
Numeric for the number of observations. |
resolution |
Numeric for resolution of data or a vector specifying the observation points. |
sd |
Numeric for standard deviation with Brownian motion. |
dependence |
Numeric which indicates the dependence on the previous curve. |
drop_first |
Booolean if first values should be dropped so the data varies at the first rather than starting at 0 (given that is the observed first point). Note this will affect the resolution observed. |
dfts object of the data.
res <- generate_far1(20,10)
res <- generate_far1(20,10)
generate_karhunen_loeve
generates functional data via an autoregressive
Karhunen-Loeve expansion. The approach easily accomondate change points in
the mean, distribution, eigenvalues, eigenfunctions, and so forth. In a
sense, the function creates m ‘groups’ of discretely observed
functions with similar properties.
generate_karhunen_loeve( Ns, eigenvalues, basis, means, distribution, fparam, dependence = 0, burnin = 100, silent = TRUE, dof = NULL, shape = NULL, prev_eps = NULL )
generate_karhunen_loeve( Ns, eigenvalues, basis, means, distribution, fparam, dependence = 0, burnin = 100, silent = TRUE, dof = NULL, shape = NULL, prev_eps = NULL )
Ns |
Vector of Numerics. Each value in Ns is the number of observations for a given group, for m groups. |
eigenvalues |
Vector of eigenvalues, length 1 or m. |
basis |
A list of bases (eigenfunctions), length m. |
means |
A vector of means, length 1 or Ns. |
distribution |
A vector of distributions, length 1 or m. |
fparam |
A vector of points indicating the points to evaluate the functions on. |
dependence |
Numeric [0,1] indicating strength of VAR(1) process. |
burnin |
A numeric value indicating the number of burnin trials. |
silent |
A Boolean that toggles displaying the running status. |
dof |
Numeric for shape with gamma distribution (rate is set to 1). |
shape |
Numeric for degrees of freedom with t-distribution. |
prev_eps |
Previous epsilon for dependence across groups. This is only needed if a separate code was run but the new data should be appended. In general only used in internal functions. |
List with (1) dfts data and (2) the errors of the last iteration.
dat1 <- generate_karhunen_loeve( Ns=100, eigenvalues=c(1/(1:3)), basis=fda::create.bspline.basis(nbasis=3,norder=3), means=0, distribution='Normal', fparam=seq(0,1,0.1), dependence=0, burnin=100, silent=TRUE, dof=NULL, shape=NULL, prev_eps=NULL) dat2 <- generate_karhunen_loeve( Ns=50, eigenvalues=c(1/(1:4)), basis=fda::create.bspline.basis(nbasis=4), means=5, distribution='exponential', fparam=seq(0,1,0.1), dependence=0, burnin=100, silent=TRUE, dof=NULL, shape=NULL, prev_eps=dat1$prev_eps) dat <- dfts(cbind(dat1$data$data, dat2$data$data),fparam = dat1$data$fparam)
dat1 <- generate_karhunen_loeve( Ns=100, eigenvalues=c(1/(1:3)), basis=fda::create.bspline.basis(nbasis=3,norder=3), means=0, distribution='Normal', fparam=seq(0,1,0.1), dependence=0, burnin=100, silent=TRUE, dof=NULL, shape=NULL, prev_eps=NULL) dat2 <- generate_karhunen_loeve( Ns=50, eigenvalues=c(1/(1:4)), basis=fda::create.bspline.basis(nbasis=4), means=5, distribution='exponential', fparam=seq(0,1,0.1), dependence=0, burnin=100, silent=TRUE, dof=NULL, shape=NULL, prev_eps=dat1$prev_eps) dat <- dfts(cbind(dat1$data$data, dat2$data$data),fparam = dat1$data$fparam)
Generate autoregressive process with errors according the Ornstein-Uhlenbeck process.
generate_ornstein_uhlenbeck(N, v, rho = 0)
generate_ornstein_uhlenbeck(N, v, rho = 0)
N |
Numeric for the number of observations. |
v |
Numeric for resolution of data or a vector specifying the observation points. |
rho |
Numeric which indicates the dependence on the previous curve. |
A dfts object for the generated data.
generate_ornstein_uhlenbeck(N=100,v=20)
generate_ornstein_uhlenbeck(N=100,v=20)
Several basic imputation methods for missing values in functional data formatted as dfts objects.
impute( X, method = c("zero", "mean_obs", "median_obs", "mean_data", "median_data", "linear", "functional"), obs_share_data = FALSE )
impute( X, method = c("zero", "mean_obs", "median_obs", "mean_data", "median_data", "linear", "functional"), obs_share_data = FALSE )
X |
A dfts object or data which can be automatically converted to that
format. See |
method |
String to indicate method of imputation.
|
obs_share_data |
Boolean in linear interpolation that indicates if data should be shared across observations. For example, if the end of observation i related to the start of observation i+1. Default is FALSE, which suggests independence. If true, the distance between the end and start of observations is taken to be the mean average distance of points in fparam. |
A dfts object of the data with missing values interpolated.
temp <- data.frame(c(NA,NA,3:9,NA), c(NA,stats::rnorm(2),NA, stats::rnorm(6)), stats::rnorm(10), c(stats::rnorm(4),rep(NA,3), stats::rnorm(3)), rep(NA,10), c(stats::rnorm(1), rep(NA,9)), c(stats::rnorm(9),NA), stats::rnorm(10), stats::rnorm(10), c(NA,NA,3:9, NA)) impute(temp, method='mean_obs') impute(temp, method='linear', obs_share_data=TRUE)
temp <- data.frame(c(NA,NA,3:9,NA), c(NA,stats::rnorm(2),NA, stats::rnorm(6)), stats::rnorm(10), c(stats::rnorm(4),rep(NA,3), stats::rnorm(3)), rep(NA,10), c(stats::rnorm(1), rep(NA,9)), c(stats::rnorm(9),NA), stats::rnorm(10), stats::rnorm(10), c(NA,NA,3:9, NA)) impute(temp, method='mean_obs') impute(temp, method='linear', obs_share_data=TRUE)
There are an assortment of (vectorized) kernel functions located in the package.
Truncated Kernel: Kernel where and
otherwise. If
then the value
is given.
Bartlett Kernel: Kernel where . If
then the value
is given.
Parzen Kernel: Kernel where ,
, and
. If
then
the value
is given.
Tukey-Hanning Kernel: Kernel where
and
. If
then the value
is given.
Quadratic Spectral Kernel: Kernel where
.
If
then the value
is given.
Daniell Kernel: Kernel where .
If
then the value
is given.
Flat-Top Kernel: Kernel where .
If
then the value
is given.
truncated_kernel(x) bartlett_kernel(x) parzen_kernel(x) tukey_hanning_kernel(x) quadratic_spectral_kernel(x) daniell_kernel(x) flat_top_kernel(x)
truncated_kernel(x) bartlett_kernel(x) parzen_kernel(x) tukey_hanning_kernel(x) quadratic_spectral_kernel(x) daniell_kernel(x) flat_top_kernel(x)
x |
Numeric value(s) at which to evaluate kernel. It often indicates current lag divided by window. |
Values from given lag(s) in the kernel.
Horvath, L., & Rice, G. (2024). Change point analysis for time series (1st ed. 2024.). Springer Nature Switzerland.
L. Horvath, P. Kokoszka, G. Rice (2014) "Testing stationarity of functional time series", Journal of Econometrics, 179(1), 66-82.
Politis, D. N. (2003). Adaptive bandwidth choice. Journal of Nonparametric Statistics, 15(4-5), 517-533.
Politis, D. N. (2011). Higher-order accurate, positive semidefinite estimation of large-sample covariance and spectral density matrices. Econometric Theory, 27(4), 703-744.
truncated_kernel(-20:20/15) bartlett_kernel(-20:20/15) parzen_kernel(-20:20/15) tukey_hanning_kernel(-20:20/15) quadratic_spectral_kernel(-20:20/15) daniell_kernel(-20:20/15) flat_top_kernel(-20:20/15)
truncated_kernel(-20:20/15) bartlett_kernel(-20:20/15) parzen_kernel(-20:20/15) tukey_hanning_kernel(-20:20/15) quadratic_spectral_kernel(-20:20/15) daniell_kernel(-20:20/15) flat_top_kernel(-20:20/15)
Compute the Kwiatkowski–Phillips–Schmidt–Shin (KPSS) statistic for functional data.
kpss_test( X, method = c("simulation", "resample"), resample_blocks = "separate", M = 1000, blocksize = 2 * ncol(X)^(1/5), TVE = 1, replace = TRUE )
kpss_test( X, method = c("simulation", "resample"), resample_blocks = "separate", M = 1000, blocksize = 2 * ncol(X)^(1/5), TVE = 1, replace = TRUE )
X |
A dfts object or data which can be automatically converted to that
format. See |
method |
String for the method in computing thresholds: Monte Carlo
simulation ( |
resample_blocks |
String indicating the type of resample test to use.
Using |
M |
Number of simulations to estimate theoretical distribution. |
blocksize |
Numeric for the block size when using a resample test. |
TVE |
Numeric for |
replace |
Boolean to indicate if blocks should be selected with replacement when using a resample test. |
List with the following elements:
pvalue: p-value from the test.
statistic: test statistic computed on the data.
simulations: Theoretical values for the null distribution.
Chen, Y., & Pun, C. S. (2019). A bootstrap-based KPSS test for functional time series. Journal of Multivariate Analysis, 174, 104535.
Kokoszka, P., & Young, G. (2016). KPSS test for functional time series. Statistics, 50(5), 957-973.
kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20))) kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20)), method="resample") kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20)), method='resample', resample_blocks='overlapping')
kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20))) kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20)), method="resample") kpss_test(generate_brownian_motion(100,v=seq(0,1,length.out=20)), method='resample', resample_blocks='overlapping')
Compute a lagged version of a functional time series, shifting the time base back by a given number of observations.
## S3 method for class 'dfts' lag(x, k = 1, ...)
## S3 method for class 'dfts' lag(x, k = 1, ...)
x |
A dfts object. See |
k |
Integer for the number of lags (in units of observations). |
... |
Unused additional parameters. |
A dfts object.
result <- lag(electricity)
result <- lag(electricity)
Estimate the long-run covariance kernel for functional data. That is, solve
with sequence
defined as the centered
data (can center based on changes if given).
long_run_covariance( X, h = 2 * ncol(X)^(1/5), K = bartlett_kernel, changes = NULL )
long_run_covariance( X, h = 2 * ncol(X)^(1/5), K = bartlett_kernel, changes = NULL )
X |
A dfts object or data which can be automatically converted to that
format. See |
h |
The window parameter parameter for the estimation of the long run
covariance kernel. The default value is |
K |
Function indicating the kernel to use if |
changes |
Vector of numeric change point locations. Can be NULL. |
Symmetric data.frame of numerics with dim of ncol(data) x ncol(data).
result <- long_run_covariance(electricity,2)
result <- long_run_covariance(electricity,2)
Get the observation(s) or pointwise values with the min / max values. When
using type='Obs'
, the selected observation is the one with the
minimum or maximum mean. When using type='fparam'
, the values are
given pointwise.
## S3 method for class 'dfts' max(x, type = c("Obs", "fparam"), na.rm = TRUE, ...) ## S3 method for class 'dfts' min(x, type = c("Obs", "fparam"), na.rm = TRUE, ...)
## S3 method for class 'dfts' max(x, type = c("Obs", "fparam"), na.rm = TRUE, ...) ## S3 method for class 'dfts' min(x, type = c("Obs", "fparam"), na.rm = TRUE, ...)
x |
A dfts object or data which can be automatically converted to that
format. See |
type |
String indicating if finding for observation ('Obs'), or for pointwise values ('fparam'). |
na.rm |
Boolean if NA values should be removed. Defaults to TRUE. |
... |
Additional parameters to pass to base R's |
A dfts object.
results <- max(electricity) results <- min(electricity)
results <- max(electricity) results <- min(electricity)
This is a generic function to call PCA on various objects. The default method
uses stats::prcomp()
.
pca(object, TVE = 1, ...) ## Default S3 method: pca(object, ...) ## S3 method for class 'dfts' pca(object, TVE = 1, ...)
pca(object, TVE = 1, ...) ## Default S3 method: pca(object, ...) ## S3 method for class 'dfts' pca(object, TVE = 1, ...)
object |
Object for computation of principle components analysis. |
TVE |
Numeric in [0,1] for the total variance explained, this determines the number of components and can be used for dimension reduction. |
... |
Additional parameters to extensions based on data. Often this is
additional information for |
Principal component data.
pca(1:10) pca(electricity)
pca(1:10) pca(electricity)
Computes the principal component projection and re-combined data with a myriad of figures to understand the projection process.
pca_examination(X, TVE = 0.95)
pca_examination(X, TVE = 0.95)
X |
A dfts object or data which can be automatically converted to that
format. See |
TVE |
Numeric in [0,1] for the total variance explained, this determines the number of components and can be used for dimension reduction. |
List with the following elements:
figures: List with figures on all components, summaries, reconstructed values, and residuals.
reconstruction: Reconstructed value from PCs.
residuals: Difference between true and reconstruction.
results <- pca_examination(electricity, TVE=0.9)
results <- pca_examination(electricity, TVE=0.9)
Provides various visualizations for functional data.
## S3 method for class 'dfts' plot( x, changes = NULL, type = c("spaghetti", "fast", "rainbow", "banded", "acf", "pacf", "summary", "qq", "distribution", "change", "interval", "surface"), plot_title = x$name, val_axis_title = NULL, res_axis_title = NULL, FD_axis_title = NULL, eye = NULL, aspectratio = NULL, showticklabels = TRUE, lag.max = 20, d.max = 2, alpha = 0.05, TVE = 0.95, distribution = c("norm"), method = c("Welch", "MC", "Imhof"), legend = TRUE, highlight_changes = TRUE, intervals = confidence_interval(x, changes), int.gradual = TRUE, ... )
## S3 method for class 'dfts' plot( x, changes = NULL, type = c("spaghetti", "fast", "rainbow", "banded", "acf", "pacf", "summary", "qq", "distribution", "change", "interval", "surface"), plot_title = x$name, val_axis_title = NULL, res_axis_title = NULL, FD_axis_title = NULL, eye = NULL, aspectratio = NULL, showticklabels = TRUE, lag.max = 20, d.max = 2, alpha = 0.05, TVE = 0.95, distribution = c("norm"), method = c("Welch", "MC", "Imhof"), legend = TRUE, highlight_changes = TRUE, intervals = confidence_interval(x, changes), int.gradual = TRUE, ... )
x |
A dfts object or data which can be automatically converted to that
format. See |
changes |
Vector of numeric change point locations. Can be NULL. |
type |
Choice of plotting method. Options include: 'spaghetti', 'fast', 'rainbow','banded','acf', 'pacf', 'summary', 'qq', 'distribution', 'change', 'interval', and'surface'. |
plot_title |
Title to include on the return plot. |
val_axis_title , res_axis_title , FD_axis_title
|
Title for the axis giving the values (val), the resolution of the fparam (res), and the functional observations (FD). |
eye , aspectratio
|
Angle (eye) and ratio (aspectratio) to view 3d plots. |
showticklabels |
Boolean if the tick marks should be shown. |
lag.max |
Max number of lags to consider for ACF/PACF and summary plots. |
d.max |
Max number of dimensions for qq/distribution and summary plots. |
alpha |
Significance level to be used in various plots. Value in [0,1]. |
TVE |
Total variance explained used in qq/distribution plots. Value in [0,1]. |
distribution |
String of the distribution to compare against in. distribution plot. The string can be anything such that there is a rdistribution and ddistribution function available. For example "exp", "gamma". Additional parameters can be passed using ... . |
method |
Method for computing ACF/PACF. Options include 'Welch', 'MC', and 'Imhof'. |
legend |
Boolean if legend should be given in qq/distribution plots. |
highlight_changes |
Boolean if changes should be highlighted in black. |
intervals |
Information on confidence intervals of changes for change
plot. See |
int.gradual |
Boolean if confidence interval be solid gray (FALSE) or
gradual colors (TRUE) when |
... |
Details for plotting in acf/pacf, summary, or distribution function. |
Plot of varying types.
John Fox, & Sanford Weisberg (2019). An R Companion to Applied Regression. Sage.
plt <- plot(electricity) plt <- plot(dfts(var(electricity)), type='surface')
plt <- plot(electricity) plt <- plot(dfts(var(electricity)), type='surface')
Computes a variety of portmanteau hypothesis tests for functional data in the form of dfts objects.
portmanteau_tests( X, test = c("variety", "single", "multi", "spectral", "independence", "imhof"), lag = 5, M = 1000, method = c("iid", "bootstrap"), kernel = bartlett_kernel, block_size = NULL, bandwidth = NULL, components = 3, resample_blocks = "separate", replace = FALSE, alpha = 0.05 )
portmanteau_tests( X, test = c("variety", "single", "multi", "spectral", "independence", "imhof"), lag = 5, M = 1000, method = c("iid", "bootstrap"), kernel = bartlett_kernel, block_size = NULL, bandwidth = NULL, components = 3, resample_blocks = "separate", replace = FALSE, alpha = 0.05 )
X |
A dfts object or data which can be automatically converted to that
format. See |
test |
A String specifying the hypothesis test. Currently available tests are: 'variety', 'single-lag', 'multi-lag', 'spectral', 'independence', and 'imhof'. |
lag |
A positive integer to specify the lag, or maximum lag, of interest. Only used for the "single-lag", "multi-lag", "independence", and "imhof" tests. |
M |
Numeric to specify the number of Monte-Carlo or resampled simulations to use for the limiting distributions. |
method |
String indicating the method for the
Additional methods are forthcoming. |
kernel |
Kernel function for spectral test or estimation of covariance. |
block_size |
Numeric to specify block size for resampling tests. |
bandwidth |
Numeric for bandwidth of covariance estimation. If left null,
with be defined by |
components |
Number of functional principal components to use in the independence test. |
resample_blocks |
String indicating the type of resample test to use.
Using |
replace |
Boolean for using a permutation or bootstrapped statistic when
|
alpha |
Numeric value for significance in [0,1]. |
The "single"-lag portmanteau test assesses the significance of
empirical lagged autocovariance operators at a single lag lag
.
It tests the null hypothesis that the lag-h autocovariance operator is equal
to 0. The test is designed for stationary functional time-series, and is
valid under conditional heteroscedasticity conditions.
The "multi"-lag portmanteau test assesses the cumulative significance of
empirical lagged autocovariance operators, up to a user-selected maximum lag
lag
. It tests the null hypothesis that the first lag-h autocovariance
operators, , is equal to 0. The test is designed for
stationary functional time-series, and is valid under conditional
heteroscedasticity conditions.
The "spectral" portmanteau test measures the proximity of a functional time series to a white noise. Comparison is made to the constant spectral density operator of an uncorrelated series. The test is not for general white noise series, and may not hold under functional conditionally heteroscedastic assumptions.
The "independence" portmanteau test measures independence and identical distribution based lagged cross-variances from dimension reduction using functional principal components analysis. The test is not for general white noise series, and may not hold under functional conditionally heteroscedastic assumptions.
The "imhof" portmanteau test is an analogue of the "single-lag" test. While the "single-lag" test computes the limiting distribution of the test statistic via a Welch-Satterthwaite approximation, the "imhof" test directly computes the coefficients of the quadratic form in normal variables. Hence, the test is computationally expensive.
List with results dependent on the test. In general, returns the pvalue, statistic, and simulations/quantile.
Kim, M., Kokoszka P., & Rice G. (2023) White noise testing for functional time series. Statist. Surv., 17, 119-168, DOI: 10.1214/23-SS143
Characiejus, V., & Rice, G. (2020). A general white noise test based on kernel lag-window estimates of the spectral density operator. Econometrics and Statistics, 13, 175–196.
Kokoszka P., & Rice G., & Shang H.L. (2017). Inference for the autocovariance of a functional time series under conditional heteroscedasticity. Journal of Multivariate Analysis, 162, 32-50.
Zhang X. (2016). White noise testing and model diagnostic checking for functional time series. Journal of Econometrics, 194, 76-95.
Gabrys R., & Kokoszka P. (2007). Portmanteau Test of Independence for Functional Observations. Journal of the American Statistical Association, 102:480, 1338-1348, DOI: 10.1198/016214507000001111.
Chen W.W. & Deo R.S. (2004). Power transformations to induce normality and their applications. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66, 117-130.
b <- generate_brownian_motion(250) res0 <- portmanteau_tests(b, test = 'single', lag = 10) res1 <- portmanteau_tests(b, test = 'multi', lag = 10, alpha = 0.01) res2 <- portmanteau_tests(b, test = 'spectral', kernel = bartlett_kernel, bandwidth = NULL, alpha = 0.05) res3 <- portmanteau_tests(b, test = 'spectral', alpha = 0.1, kernel = parzen_kernel, bandwidth = adaptive_bandwidth(b, kernel=parzen_kernel)) res4 <- portmanteau_tests(b, test = 'independence', components = 3, lag = 3) res5 <- portmanteau_tests(b, test = 'single', lag = 1, M = 250)
b <- generate_brownian_motion(250) res0 <- portmanteau_tests(b, test = 'single', lag = 10) res1 <- portmanteau_tests(b, test = 'multi', lag = 10, alpha = 0.01) res2 <- portmanteau_tests(b, test = 'spectral', kernel = bartlett_kernel, bandwidth = NULL, alpha = 0.05) res3 <- portmanteau_tests(b, test = 'spectral', alpha = 0.1, kernel = parzen_kernel, bandwidth = adaptive_bandwidth(b, kernel=parzen_kernel)) res4 <- portmanteau_tests(b, test = 'independence', components = 3, lag = 3) res5 <- portmanteau_tests(b, test = 'single', lag = 1, M = 250)
Basic formatting to print a dfts object.
## S3 method for class 'dfts' print(x, ...)
## S3 method for class 'dfts' print(x, ...)
x |
A dfts object. See |
... |
unused parameter. |
No return value, called to format printing of dfts object to console.
electricity
electricity
Model and forecast functional data using a Hyndman and Ullah projection-based model.
projection_model( X, TVE = 0.95, model = c("ets", "arima"), n.ahead = 0, alpha = 0.05, check.cp = TRUE, frequency = 1, ... )
projection_model( X, TVE = 0.95, model = c("ets", "arima"), n.ahead = 0, alpha = 0.05, check.cp = TRUE, frequency = 1, ... )
X |
A dfts object or data which can be automatically converted to that
format. See |
TVE |
Numeric in [0,1] for the total variance explained to select number of PCA components to use to model the data. |
model |
String to indicate method to model components, either "ets" or "arima". |
n.ahead |
Number of observations to forecast. |
alpha |
Significance in [0,1] for intervals when forecasting. |
check.cp |
Boolean which indicates if the errors should be checked for change points to change forecasts and plots. |
frequency |
Numeric for seasonal frequency when component is made a ts object for the models. |
... |
Additional information to pass into pca, change (if
|
List with the following elements:
fit: dfts object for fit.
forecast_plot: plot of the data with any forecasted values.
residuals: dfts object for residuals from the fit.
changes: vector of any changes when using detect.cp
.
component_models: modeled PCs from the data.
component_true: true data constucted via the PCs.
parameters: list with fit parameters like pcs, TVE, model, and n.ahead.
Hyndman, R. J., & Shahid Ullah, M. (2007). Robust forecasting of mortality and fertility rates: A functional data approach. Computational Statistics & Data Analysis, 51(10), 4942-4956. https://doi.org/10.1016/j.csda.2006.07.028
result <- projection_model(dfts(electricity$data[,50:100]), n.ahead=1, TVE=0.1, check.cp=FALSE)
result <- projection_model(dfts(electricity$data[,50:100]), n.ahead=1, TVE=0.1, check.cp=FALSE)
A generic function which by produces a qq-plot of some data. By default, it
uses stats::qqplot()
.
qqplot.dfts: Creates normal QQ plots on the principal components of functional data.
qqplot(x, ...) ## Default S3 method: qqplot(x, ...) ## S3 method for class 'dfts' qqplot( x, TVE = 0.95, d.max = NULL, alpha = 0.05, changes = NULL, legend = FALSE, ... )
qqplot(x, ...) ## Default S3 method: qqplot(x, ...) ## S3 method for class 'dfts' qqplot( x, TVE = 0.95, d.max = NULL, alpha = 0.05, changes = NULL, legend = FALSE, ... )
x |
A dfts object. See |
... |
Additional parameters based on the data. |
TVE |
Numeric in [0,1] giving the total variance explained for selecting the number of principal components. |
d.max |
Max number of principal components. No max when NULL. |
alpha |
Significance level, alpha in [0,1]. |
changes |
Vector of change points. |
legend |
Boolean indicating if legend should be shown on plot. |
qqplot.default: returns results from stats::qqplot()
.
qqplot.dfts: ggplot2 for QQ plot.
result <- qqplot(electricity, d.max=3)
result <- qqplot(electricity, d.max=3)
Obtain the pointwise quantile information of functional data.
## S3 method for class 'dfts' quantile(x, probs = seq(0, 1, 0.25), ...)
## S3 method for class 'dfts' quantile(x, probs = seq(0, 1, 0.25), ...)
x |
A dfts object. See |
probs |
Numerics in [0,1] indicating the probabilities of interest. |
... |
Additional parameters to pass into |
Matrix with columns for each requested quantile computed pointwise.
result <- quantile(electricity) result1 <- quantile(electricity,probs = 0.95)
result <- quantile(electricity) result1 <- quantile(electricity,probs = 0.95)
Yield curves in the US for 1 - 360 month maturity from 1990 to 2023 (removing days without information, i.e. weekends and holidays).
rates
rates
rates
A dfts object.
Generic function to compute the variance and standard deviations. The default
uses stats::var()
and stats::sd()
.
sd(object, ...) ## Default S3 method: sd(object, ...) ## S3 method for class 'dfts' sd(object, type = "pointwise", ...) var(object, ...) ## Default S3 method: var(object, ...) ## S3 method for class 'dfts' var(object, type = c("operator", "pointwise"), ...)
sd(object, ...) ## Default S3 method: sd(object, ...) ## S3 method for class 'dfts' sd(object, type = "pointwise", ...) var(object, ...) ## Default S3 method: var(object, ...) ## S3 method for class 'dfts' var(object, type = c("operator", "pointwise"), ...)
object |
Object for computation of standard deviation or variance of the given data set. |
... |
Additional parameters for the particular extensions. |
type |
String to specify if an operator ('op') or pointwise ('pw') calculation is desired on the functional data. |
Numeric(s) to explain the standard deviation / variance
sd(1:10) var(1:10) sd(electricity,type='pointwise') var(electricity,type='pointwise') var(electricity,type='operator')
sd(1:10) var(1:10) sd(electricity,type='pointwise') var(electricity,type='pointwise') var(electricity,type='operator')
This function is used to compute discretized functions, i.e. vectors, to explore functional spaces.
space_measuring_functions(X, M = 20, space = "BM")
space_measuring_functions(X, M = 20, space = "BM")
X |
A dfts object or data which can be automatically converted to that
format. See |
M |
Integer for the number of functions to generate. |
space |
String for the space of interest. Options are Brownian motion ('BM'), principal components ('PC'), and vectors in iid standard, random normals ('RN'). Additional options are forthcoming |
Data.frame with columns of discretized functions describing the space. Columns are independent functions.
space_measuring_functions(M=10, space="BM", X=electricity) space_measuring_functions(M=10, space="PC", X=electricity)
space_measuring_functions(M=10, space="BM", X=electricity) space_measuring_functions(M=10, space="PC", X=electricity)
Intraday prices for the S&P500 index (SPY) for 2019 to 2023 with holidays and weekends removed. Minutely resolution and daily observations.
SPYUS500
SPYUS500
SPY500
A dfts object.
This function adds 3D geoms such as points and paths to a ggplot2 plot.
stat_3D( mapping = NULL, data = NULL, geom = "point", position = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ... )
stat_3D( mapping = NULL, data = NULL, geom = "point", position = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ... )
mapping |
Default list of aesthetic mappings to use for plot. If not specified, must be supplied in each layer added to the plot. |
data |
Default dataset to use for plot. If not already a data.frame,
will be converted to one by |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
Set position information. Find more details in |
na.rm |
Boolean if na data should be removed. |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Arguments passed on to layer. Often the aesthetics
like |
No direct return value, called to be used with ggplot2::ggplot()
in designing the plot.
Acker D (2024). gg3D: 3D perspective plots for ggplot2. R package version 0.0.0.9000.
dat <- electricity data_lines <- cbind(data.frame('Time'=dat$fparam), dat$data) %>% tidyr::pivot_longer(cols = 1+1:ncol(dat$data)) colors_plot <- RColorBrewer::brewer.pal(11, "Spectral") colors_plot <- grDevices::colorRampPalette(colors_plot)(ncol(dat$data)) data_lines$color <- rep(colors_plot, nrow(dat$data) ) data_lines$name <- as.numeric(data_lines$name) result <- ggplot2::ggplot(data_lines, ggplot2::aes(y=Time, x=name, z=value, color=color)) + ggplot2::theme_void() + stat_3D(theta=0, phi=15, geom='path') + ggplot2::scale_color_manual( breaks = data_lines$color, values = data_lines$color ) + ggplot2::guides(color='none')
dat <- electricity data_lines <- cbind(data.frame('Time'=dat$fparam), dat$data) %>% tidyr::pivot_longer(cols = 1+1:ncol(dat$data)) colors_plot <- RColorBrewer::brewer.pal(11, "Spectral") colors_plot <- grDevices::colorRampPalette(colors_plot)(ncol(dat$data)) data_lines$color <- rep(colors_plot, nrow(dat$data) ) data_lines$name <- as.numeric(data_lines$name) result <- ggplot2::ggplot(data_lines, ggplot2::aes(y=Time, x=name, z=value, color=color)) + ggplot2::theme_void() + stat_3D(theta=0, phi=15, geom='path') + ggplot2::scale_color_manual( breaks = data_lines$color, values = data_lines$color ) + ggplot2::guides(color='none')
Stationarity test for functional time series with different methods on determining the critical values of the test statistic. The Monte Carlo method was constructed in Horvath et al. (2014), while the resample-based methods have not been validated in the literature (use the provided option at your discretion).
stationarity_test( X, statistic = "Tn", critical = c("simulation", "resample"), perm_method = "separate", M = 1000, blocksize = 2 * ncol(X)^(1/5), TVE = 1, replace = TRUE )
stationarity_test( X, statistic = "Tn", critical = c("simulation", "resample"), perm_method = "separate", M = 1000, blocksize = 2 * ncol(X)^(1/5), TVE = 1, replace = TRUE )
X |
A dfts object or data which can be automatically converted to that
format. See |
statistic |
String for test statistic. Options are integrated ( |
critical |
String for method of determining the critical values. Options
are |
perm_method |
String for method of resampling. Options are |
M |
Numeric for number of simulation to use in determining the null distribution. Default is 1000. |
blocksize |
Numeric for blocksize in resample test. Default is |
TVE |
Numeric for total variance explained when using PCA for eigenvalues. Default is 1. |
replace |
Boolean if replacement should be used for resample test. Thus, this defines if a bootstrapped or permuted test is used. Default is TRUE. |
List with the following elements:
pvalue: p-value for the stationarity test.
statistic: test statistic from the test.
simulations: simulations which define the null distribution.
Horvath, L., Kokoszka, P., & Rice, G. (2014). Testing stationarity of functional time series. Journal of Econometrics, 179(1), 66-82.
res <- stationarity_test( generate_brownian_motion(100,v=seq(0,1,length.out=20)), critical='resample', statistic='Mn') res2 <- stationarity_test(electricity)
res <- stationarity_test( generate_brownian_motion(100,v=seq(0,1,length.out=20)), critical='resample', statistic='Mn') res2 <- stationarity_test(electricity)
General summary function to view data overview. Several plots and test statistics are returned to give a general view of the data. More details can be found with more specialized functions.
## S3 method for class 'dfts' summary(object, changes = NULL, lag.max = 20, d.max = 2, demean = FALSE, ...)
## S3 method for class 'dfts' summary(object, changes = NULL, lag.max = 20, d.max = 2, demean = FALSE, ...)
object |
A dfts object or data which can be automatically converted to that
format. See |
changes |
Vector of change locations, if there are any. Default is NULL. |
lag.max |
Max lags to consider for ACF. Default is 20. |
d.max |
Max number of dimensions for QQ-plot. |
demean |
Boolean if data should be demeaned based on changes and create plots based on these residuals. |
... |
Data to pass into underlying functions like the KPSS, portmanteau, and stationary tests. In general it is recommended to not use this and instead apply the specialized functions directly. |
List with the elements:
summary_data: summary results for the data.
summary_plot: summary plot for the data.
res <- summary(electricity[,1:20], lag.max=2)
res <- summary(electricity[,1:20], lag.max=2)
A list of daily minimum temperature data (1859 - 2012) from reporting station in Sydney, Australia. Any missing data was previously, linearly, interpolated and leap days were removed for consistency.
temperature
temperature
temperature
A dfts object.
<www.bom.gov.au>