Generate a new synthetic RDD whose rows are built by sampling each feature column iid from the input feature vectors.
The number of iid samples to generate.
The input sample size. Input is periodically sampled and the sample is used to generate iid output data. Defaults to 10000.
The output sample size. Each input sample is used to generate this number of output samples. Defaults to 10000.
An RDD of FeatureSeq where each 'column' in the feature sequence is statistically independent of the others, but shares the marginal distribution of the corresponding input column.
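As a rough illustration of the column-wise sampling described above, here is a minimal in-memory sketch. It is not the library's implementation: `IidSketch` and `iidSynthetic` are hypothetical names, plain `Vector[Vector[Double]]` stands in for an RDD of feature vectors, and the periodic input sampling and per-sample output size are left out. For each output row, every column value is taken from an independently chosen input row, which decouples the columns while preserving each column's empirical marginal distribution.

```scala
import scala.util.Random

// Hypothetical helper, not part of the documented API: draws `n` synthetic
// rows from an in-memory sample of equal-length feature vectors.
object IidSketch {
  def iidSynthetic(
      sample: Vector[Vector[Double]],
      n: Int,
      rng: Random): Vector[Vector[Double]] = {
    require(sample.nonEmpty, "sample must be non-empty")
    val width = sample.head.length
    require(sample.forall(_.length == width), "all rows must have the same length")
    Vector.fill(n) {
      // Each column value comes from an independently chosen input row, so
      // the columns are statistically independent of one another while each
      // keeps the empirical marginal of the corresponding input column.
      Vector.tabulate(width)(j => sample(rng.nextInt(sample.length))(j))
    }
  }
}
```

Calling `IidSketch.iidSynthetic(input, 5, new Random(42))` on a three-row, two-column input would return five rows whose first values all come from the input's first column and whose second values all come from its second column, paired at random.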
Interface for enriched iid feature sampling methods on sequence-like collections of feature vectors. A feature vector is a sequential collection of Double values. Those values may be sampled iid to generate new synthetic feature vectors with the same marginal distributions as the input features. This interface does not assume a particular underlying feature vector representation; multiple representations may be supported.
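One common way to add such methods without fixing the vector representation is a type class combined with an implicit class. The sketch below assumes that pattern purely for illustration; `FeatureLike`, `IidEnrichSketch`, `IidOps`, and `iidSample` are hypothetical names, not the documented interface, and ordinary Scala collections are used in place of RDDs.

```scala
import scala.util.Random

// Hypothetical type class: describes how to read some representation V
// as a sequence of Double values.
trait FeatureLike[V] {
  def length(v: V): Int
  def value(v: V, j: Int): Double
}

object FeatureLike {
  // Two example representations; others could be added the same way.
  implicit val arrayFeature: FeatureLike[Array[Double]] =
    new FeatureLike[Array[Double]] {
      def length(v: Array[Double]): Int = v.length
      def value(v: Array[Double], j: Int): Double = v(j)
    }
  implicit val vectorFeature: FeatureLike[Vector[Double]] =
    new FeatureLike[Vector[Double]] {
      def length(v: Vector[Double]): Int = v.length
      def value(v: Vector[Double], j: Int): Double = v(j)
    }
}

object IidEnrichSketch {
  // Enriches any Seq[V] with iidSample, provided a FeatureLike[V] exists.
  implicit class IidOps[V](val rows: Seq[V]) {
    def iidSample(n: Int, rng: Random)(implicit f: FeatureLike[V]): Seq[Vector[Double]] = {
      require(rows.nonEmpty, "rows must be non-empty")
      val indexed = rows.toIndexedSeq
      val width = f.length(indexed.head)
      // Every column value is read from an independently chosen input row.
      Seq.fill(n)(
        Vector.tabulate(width)(j => f.value(indexed(rng.nextInt(indexed.length)), j))
      )
    }
  }
}
```

After `import IidEnrichSketch._`, both `Seq(Array(1.0, 10.0), Array(2.0, 20.0))` and the equivalent `Seq` of `Vector[Double]` gain an `iidSample` method, which is the sense in which the methods are "enriched" onto existing collection types.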