Check input data, interpolate NA values in y, remove spike values, and set weights for NA in y and w.

check_input(
  t,
  y,
  w,
  QC_flag,
  nptperyear,
  south = FALSE,
  wmin = 0.2,
  wsnow = 0.8,
  ymin,
  missval,
  maxgap,
  alpha = 0.02,
  alpha_high = NULL,
  date_start = NULL,
  date_end = NULL,
  mask_spike = TRUE,
  na.rm = FALSE,
  ...
)

Arguments

t

Numeric vector, Date variable

y

Numeric vector, vegetation index time-series

w

(optional) Numeric vector, weights of y. If not specified, weights of all NA values will be wmin, the others will be 1.0.

QC_flag

Factor (optional) returned by qcFUN. Levels should be drawn from c("snow", "cloud", "shadow", "aerosol", "marginal", "good"); any other value is categorized as others. QC_flag is used for visualization in get_pheno() and plot_curvefits().

nptperyear

Integer, number of images per year.

south

Boolean. In the Southern Hemisphere, the growing year runs from 1 July to 30 June of the following year; in the Northern Hemisphere, it runs from 1 January to 31 December.

wmin

Double, minimum weight of bad points, which could be smaller than the weight of snow, ice and cloud.

wsnow

Double. Weight assigned to snow points, reset after ylu has been computed. The snow flag is an important indicator of the end of the growing season, so snow points are more valuable than marginal points. Hence, the weight of snow should be greater than that of marginal points.

ymin

If specified, ylu[1] is constrained to be greater than ymin. This value is critical for bare land and snow/ice land, where the vegetation amplitude is quite small. Generally, you can set ymin=0.08 for NDVI, ymin=0.05 for EVI, ymin=0.5 gC m-2 s-1 for GPP.

missval

Double, which is used to replace NA values in y. If missing, the default value is ylu[1].

maxgap

Integer; nptperyear/4 is a suitable value. If the number of consecutive missing values is less than maxgap, those NA values are interpolated by zoo::na.approx; otherwise, they are replaced with the constant value ylu[1].
Replacing NA values with a constant missing value (e.g. the background value ymin) is inappropriate for points in the middle of the growing season. Interpolating all values by na.approx is unsuitable for long continuous missing segments, e.g. at the start or end of the growing season.

alpha

Double, in [0,1], quantile probability used for ylu_min.

alpha_high

Double, in [0,1], quantile probability used for ylu_max. If not specified, alpha_high=alpha.

date_start, date_end

Starting and ending dates of the original vegetation time-series (before add_HeadTail).

mask_spike

Boolean. Whether to remove spike values?

na.rm

Boolean. If TRUE, NA and spike values will be removed; otherwise, NA and spike values will be interpolated by valid neighbours.

...

Others will be ignored.
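
The gap-filling rule described for maxgap can be sketched in base R. This is a minimal illustration only, not the package's implementation: fill_gaps is a hypothetical helper, stats::approx stands in for zoo::na.approx, and the strict "less than maxgap" comparison follows the wording of the argument description.

```r
# Hypothetical helper (not part of the package): fill NA runs in `y`.
# NA runs shorter than `maxgap` are linearly interpolated from valid
# neighbours; longer runs, and NA values that cannot be interpolated
# (e.g. a trailing NA), are replaced by the constant `missval`.
fill_gaps <- function(y, maxgap, missval) {
  is_na <- is.na(y)
  runs <- rle(is_na)
  # Length of the run that each element belongs to
  run_len <- rep(runs$lengths, runs$lengths)
  short <- is_na & run_len < maxgap

  idx <- seq_along(y)
  # Linear interpolation; approx() returns NA outside the valid range
  interp <- approx(idx[!is_na], y[!is_na], xout = idx)$y

  out <- y
  out[short] <- interp[short]
  out[is.na(out)] <- missval
  out
}

y <- c(0.2, NA, 0.4, NA, NA, NA, NA, 0.6, NA)
filled <- fill_gaps(y, maxgap = 3, missval = 0.1)
print(filled)
```

In this toy series the single missing point between 0.2 and 0.4 is interpolated, while the run of four missing points and the trailing NA fall back to missval.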

Value

A list with the following elements:

  • t : Numeric vector

  • y0: Numeric vector, original vegetation time-series.

  • y : Numeric vector, checked vegetation time-series, NA values are interpolated.

  • w : Numeric vector

  • Tn: Numeric vector

  • ylu: = [ymin, ymax]. w_critical is the weight threshold used to keep values that are not too bad.

    If the percentage of good values (w=1) is greater than 30%, then w_critical=1.

    Otherwise, if the percentage of w >= 0.5 points is greater than 10%, then w_critical=0.5. In boreal regions, even if the percentage of w >= 0.5 points is only 10%, w_critical still cannot be set to wmin.

    We can't rely on points with the wmin weights. Then,
    y_good = y[w >= w_critical],
    ymin = pmax( quantile(y_good, alpha/2), 0)
    ymax = max(y_good).
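
The ylu rule above can be sketched as follows. This is a rough illustration under stated assumptions, not the package's code: sketch_ylu is a hypothetical helper, w_critical = 1 is assumed for the "more than 30% good values" branch, and the fallback to wmin when neither threshold is met is an assumption, since the text does not state it.

```r
# Hypothetical helper (not the package's implementation): derive
# ylu = [ymin, ymax] from values `y` and weights `w`.
sketch_ylu <- function(y, w, wmin = 0.2, alpha = 0.02) {
  ok <- !is.na(y) & !is.na(w)
  y <- y[ok]
  w <- w[ok]

  if (mean(w == 1) > 0.3) {
    w_critical <- 1      # assumed: enough good points, use only those
  } else if (mean(w >= 0.5) > 0.1) {
    w_critical <- 0.5    # enough marginal-or-better points
  } else {
    w_critical <- wmin   # assumption: fallback is not stated in the text
  }

  y_good <- y[w >= w_critical]
  ymin <- pmax(quantile(y_good, alpha / 2, names = FALSE), 0)
  ymax <- max(y_good)
  c(ymin, ymax)
}

y <- c(-0.05, 0.1, 0.2, 0.5, 0.8, 0.3)
w <- c(0.2, 1, 1, 1, 0.5, 0.2)
ylu <- sketch_ylu(y, w)
print(ylu)
```

Here half of the points have w = 1, so only those three values define the range; the negative value and the 0.8 peak carry lower weights and are excluded.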

Examples

library(phenofit)  # provides check_input(), plot_input() and the CA_NS6 dataset
data("CA_NS6")
d = CA_NS6
# head(d)

nptperyear = 23
INPUT <- check_input(d$t, d$y, d$w, QC_flag = d$QC_flag,
     nptperyear = nptperyear, south = FALSE, 
     maxgap = nptperyear/4, alpha = 0.02, wmin = 0.2)
plot_input(INPUT)