Getting Started

Installation

The package can be installed from the Julia REPL in package mode (press ] to enter it) using

(v1.7) pkg> add https://github.com/JakeGrainger/HistoricalStormTraceSimulation.jl

The package has two main interface functions, sampletraces and dataframes2storms, described below.

Sample Traces

HistoricalStormTraceSimulation.sampletraces — Function

sampletraces(new_summaries, historical_summaries, historical_traces; samplemethod=1:50, rescalemethod)

Sample new traces given summaries based on modifications of historical traces.

Arguments

new_summaries: Vector of summaries to generate traces for.
history: Storm history information of type StormHistory. Best constructed using the dataframes2storms function.
samplemethod: Method for sampling from closest points. Passing 1:m will sample uniformly from the closest m points. Defaults to 1:50. Could also be a Distribution. Note that if a Distribution is used, then it should be discrete, and should be defined on 1:n where n is the number of historical storms.
rescalemethod: Tuple of methods for rescaling (one for each column of the trace). The type of each entry should be a subtype of RescaleMethod.
summarymetric: A metric for determining closeness of storm summaries (must be a subtype of Metric). Default is Euclidean(). Take care when dealing with directions: in this case, use PeriodicEuclidean or WeightedPeriodicEuclidean with appropriate period choices.
interpolation_method: Method for performing interpolation. LinearInterpolation is the default, but CubicSplineInterpolation may be preferable in some contexts (though it is much slower).
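
As an illustration of the samplemethod argument, the following is a minimal sketch of drawing uniformly from the closest m historical summaries. It is not the package's implementation: the helper names euclid and sample_closest, the toy data, and the use of plain vectors for summaries are all assumptions made for this example.

```julia
using Random

# Euclidean distance between two summary vectors (stand-in for summarymetric).
euclid(a, b) = sqrt(sum(abs2, a .- b))

# Hypothetical helper, not part of the package API.
# Returns the index of a historical summary whose closeness rank is drawn
# from `ranks` (for example 1:m for a uniform draw from the closest m).
function sample_closest(rng, target, historical, ranks)
    dists = [euclid(target, h) for h in historical]
    order = sortperm(dists)      # indices sorted from closest to furthest
    rank = rand(rng, ranks)      # pick a rank, e.g. uniformly from 1:m
    return order[rank]
end

# Toy data: five historical summaries and one new summary.
historical = [[1.0, 10.0], [2.0, 11.0], [8.0, 3.0], [1.5, 9.5], [9.0, 2.0]]
target = [1.2, 10.1]

rng = MersenneTwister(1)
chosen = sample_closest(rng, target, historical, 1:3)
```

Here the three closest summaries are those at indices 1, 4 and 2, so chosen is always one of them.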

RescaleMethods

RescaleIdentity(): The identity (no rescale).
RescaleMean(): Rescale the mean to match provided mean.
RescaleMaxChangeMin(): Rescale the maximum to match provided maximum, using linear scaling and changing the minimum.
RescaleMaxPreserveMin(): Rescale the maximum to match provided maximum, using linear scaling but preserving the minimum.
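
To make the difference between the two maximum-based rescalings concrete, here is one plausible reading of them as plain functions. This is an assumption about what the descriptions mean, not the package's code: the function names rescale_max_change_min and rescale_max_preserve_min are hypothetical, and the package's own methods may differ in detail.

```julia
# Assumed idea behind RescaleMaxChangeMin: multiply the whole trace by a
# constant so that its maximum equals `newmax`. The minimum is scaled too.
rescale_max_change_min(x, newmax) = x .* (newmax / maximum(x))

# Assumed idea behind RescaleMaxPreserveMin: stretch the values above the
# minimum so that the maximum equals `newmax` while the minimum is unchanged.
# Note: a constant trace (maximum == minimum) would divide by zero here.
function rescale_max_preserve_min(x, newmax)
    lo, hi = extrema(x)
    return lo .+ (x .- lo) .* ((newmax - lo) / (hi - lo))
end

trace = [1.0, 2.0, 4.0]
a = rescale_max_change_min(trace, 8.0)     # minimum moves from 1.0 to 2.0
b = rescale_max_preserve_min(trace, 8.0)   # minimum stays at 1.0
```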

This is the main function used to generate traces from historical storms. The package also provides an additional metric, a weighted periodic Euclidean distance:

HistoricalStormTraceSimulation.WeightedPeriodicEuclidean — Type

WeightedPeriodicEuclidean(p,w)

Create a weighted Euclidean metric on a rectangular periodic domain (i.e., a torus or a cylinder). Periods per dimension are contained in the vector p, and weights in w:

\[\sqrt{\sum_i w_i \left(\min\left(\operatorname{mod}(|x_i - y_i|, p_i),\; p_i - \operatorname{mod}(|x_i - y_i|, p_i)\right)\right)^2}.\]

Based on the PeriodicEuclidean from Distances.jl.
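
The formula can be checked by hand with a direct transcription into Julia. The sketch below is self-contained and does not use the package or Distances.jl; the function name weighted_periodic_euclidean and the example values (a direction in degrees with period 360) are assumptions made for illustration.

```julia
# Hypothetical helper that transcribes the formula above term by term.
# It is not the package's implementation.
function weighted_periodic_euclidean(x, y, p, w)
    total = 0.0
    for i in eachindex(x, y, p, w)
        d = mod(abs(x[i] - y[i]), p[i])   # separation wrapped into [0, p_i)
        d = min(d, p[i] - d)              # take the shorter way around
        total += w[i] * d^2               # weight applies to the squared term
    end
    return sqrt(total)
end

# Directions of 350 and 10 degrees are 20 degrees apart, not 340.
d1 = weighted_periodic_euclidean([350.0], [10.0], [360.0], [1.0])

# Two dimensions with different periods and weights.
d2 = weighted_periodic_euclidean([350.0, 1.0], [10.0, 4.0], [360.0, 100.0], [1.0, 4.0])
```

With these inputs d1 is 20.0, whereas an ordinary Euclidean distance would give 340.0, which is why a periodic metric matters for directional summaries.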

Converting Data to Internal Types

This package uses its own types to represent the concepts it deals with, in particular storm traces and storm summaries. Such data usually comes from the output of other software, typically as data frames with named columns. The dataframes2storms function converts these data frames to the format the package expects.

HistoricalStormTraceSimulation.dataframes2storms — Function

dataframes2storms(event_data, event_start_end, input_data, simulated_data)

Convert data frames containing storm parameters and data to traces and summaries for use in the package.

Variables are reordered so that their names match across data frames. Pass the outputs to the sampletraces function.

Arguments:

event_data - DataFrame containing summaries of historical storms.
event_start_end - DataFrame containing start and end indices of events in input_data.
input_data - DataFrame containing historical time series.
simulated_data - DataFrame containing simulated storm summaries.

Outputs:

new_summaries - Vector of summary vectors.
history - StormHistory object.
summary_names - Names of the summary variables, in order. The trace variables follow the same order but have one fewer variable, because time is stored separately.

The arguments event_data and simulated_data are the storm summaries, input_data is the complete time series (including the non-extreme time periods), and event_start_end is a data frame in which column 1 contains the start index of each storm and column 2 contains the end index. Note that these indices must be 1-indexed, not 0-indexed!
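
The indexing convention can be illustrated with plain arrays. This sketch does not call dataframes2storms: the variable names and toy values are assumptions, and it assumes the end index is inclusive, which the text above does not state explicitly.

```julia
# Toy time series with 10 samples: 10.0, 20.0, ..., 100.0.
series = collect(10.0:10.0:100.0)

# One row per storm: column 1 is the start index, column 2 the end index,
# both 1-indexed as Julia expects.
start_end = [2 4; 7 9]

# Extract each storm's trace (assumes the end index is inclusive).
traces = [series[s:e] for (s, e) in eachrow(start_end)]

# If the indices were produced by a 0-indexed tool with inclusive ends,
# shift both columns by one before using them.
zero_based = [1 3; 6 8]
shifted = zero_based .+ 1
```

After the shift, the 0-indexed pairs describe the same storms as start_end.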

Note that it is important that naming conventions are consistent across data frames for this function to work.