java.lang.Object

org.hipparchus.stat.descriptive.AbstractStorelessUnivariateStatistic

org.hipparchus.stat.descriptive.rank.RandomPercentile

All Implemented Interfaces: Serializable, DoubleConsumer, AggregatableStatistic<RandomPercentile>, StorelessUnivariateStatistic, UnivariateStatistic, MathArrays.Function

public class RandomPercentile extends AbstractStorelessUnivariateStatistic implements StorelessUnivariateStatistic, AggregatableStatistic<RandomPercentile>, Serializable

A StorelessUnivariateStatistic estimating percentiles using the RANDOM Algorithm.

Storage requirements for the RANDOM algorithm depend on the desired accuracy of quantile estimates. Quantile estimate accuracy is defined as follows.

Let \(X\) be the set of all data values consumed from the stream and let \(q\) be a quantile (measured between 0 and 1) to be estimated. If

- \(\epsilon\) is the configured accuracy
- \(\hat{q}\) is a RandomPercentile estimate for \(q\) (what is returned by getResult(), or by getResult(double) with \(100q\) as actual parameter)
- \(rank(\hat{q}) = |\{x \in X : x < \hat{q}\}|\) is the actual rank of \(\hat{q}\) in the full data stream
- \(n = |X|\) is the number of observations

then we can expect \((q - \epsilon)n < rank(\hat{q}) < (q + \epsilon)n\).
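The accuracy definition above can be checked directly on a small in-memory data set. The following is a minimal sketch using only the Java standard library; the class and method names (RankBoundSketch, rank, withinBound) and the sample estimates are hypothetical and are not part of the RandomPercentile API.

```java
final class RankBoundSketch {

    /** Counts the values strictly less than v, i.e. rank(v) = |{x in X : x < v}|. */
    static long rank(double[] data, double v) {
        long count = 0;
        for (double x : data) {
            if (x < v) {
                count++;
            }
        }
        return count;
    }

    /** Returns true if (q - eps) * n < rank(estimate) < (q + eps) * n. */
    static boolean withinBound(double[] data, double q, double eps, double estimate) {
        double n = data.length;
        double r = rank(data, estimate);
        return (q - eps) * n < r && r < (q + eps) * n;
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1; // 1, 2, ..., 1000
        }
        // Hypothetical estimates of the median (q = 0.5) with eps = 0.01.
        System.out.println(withinBound(data, 0.5, 0.01, 500.5)); // true: rank is 500
        System.out.println(withinBound(data, 0.5, 0.01, 520.0)); // false: rank is 519
    }
}
```

With \(n = 1000\), \(q = 0.5\) and \(\epsilon = 0.01\), any estimate whose rank lies strictly between 490 and 510 satisfies the bound.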

The algorithm maintains \(\left\lceil {log_{2}(1/\epsilon)}\right\rceil + 1\) buffers of size \(\left\lceil {1/\epsilon \sqrt{log_2(1/\epsilon)}}\right\rceil\). When epsilon is set to the default value of \(10^{-4}\), this makes 15 buffers of size 36,453.
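The two formulas above can be evaluated with plain java.lang.Math to reproduce the figures quoted for the default epsilon. This is a sketch of the arithmetic only; the class BufferSizing and its methods are hypothetical names, not library code.

```java
final class BufferSizing {

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Number of buffers: ceil(log2(1 / epsilon)) + 1. */
    static long bufferCount(double epsilon) {
        checkEpsilon(epsilon);
        return (long) Math.ceil(log2(1.0 / epsilon)) + 1;
    }

    /** Size of each buffer: ceil((1 / epsilon) * sqrt(log2(1 / epsilon))). */
    static long bufferSize(double epsilon) {
        checkEpsilon(epsilon);
        return (long) Math.ceil(Math.sqrt(log2(1.0 / epsilon)) / epsilon);
    }

    private static void checkEpsilon(double epsilon) {
        if (!(epsilon > 0 && epsilon < 1)) {
            throw new IllegalArgumentException("epsilon must be in (0, 1): " + epsilon);
        }
    }

    public static void main(String[] args) {
        System.out.println(bufferCount(1e-4)); // 15
        System.out.println(bufferSize(1e-4));  // 36453
    }
}
```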

The algorithm uses the buffers to maintain samples of data from the stream. Until all buffers are full, the entire sample is stored in the buffers. If one of the getResult methods is called when all data are available in memory and there is room to make a copy of the data (meaning the combined set of buffers is less than half full), the getResult method delegates to a Percentile instance to compute and return the exact value for the desired quantile. For default epsilon, this means exact values will be returned whenever fewer than \(\left\lceil {15 \times 36453 / 2} \right\rceil = 273,398\) values have been consumed from the data stream.
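The threshold quoted above is simply half of the total buffer capacity, rounded up. A short sketch of that arithmetic follows; ExactThreshold is a hypothetical helper, and the buffer count and size are passed in as plain numbers taken from the text.

```java
final class ExactThreshold {

    /** Returns ceil(bufferCount * bufferSize / 2) using integer arithmetic. */
    static long exactThreshold(long bufferCount, long bufferSize) {
        long capacity = bufferCount * bufferSize; // total values the buffers can hold
        return (capacity + 1) / 2;                // ceiling of capacity / 2 for capacity >= 0
    }

    public static void main(String[] args) {
        // Default epsilon of 1e-4: 15 buffers of size 36,453.
        System.out.println(exactThreshold(15, 36453)); // 273398
    }
}
```

For the default configuration the capacity is \(15 \times 36453 = 546795\) values, so results stay exact while fewer than 273,398 values have been consumed.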

When buffers become full, the algorithm merges buffers so that they effectively represent a larger set of values than they can hold. Subsequently, data values are sampled from the stream to fill buffers freed by merge operations. Both the merging and the sampling require random selection, which is done using a RandomGenerator. To get repeatable results for large data streams, users should provide RandomGenerator instances with fixed seeds. RandomPercentile itself does not reseed or otherwise initialize the RandomGenerator provided to it. By default, it uses a Well19937c generator with the default seed.
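To illustrate why a fixed seed gives repeatable results, the sketch below merges two equal-size buffers by keeping, for each slot, one of the two entries chosen at random. It is an assumption-laden illustration only: java.util.Random stands in for a Hipparchus RandomGenerator, and the slot-wise merge rule and the name SeededMergeSketch are hypothetical, not a description of the library's internal merge.

```java
import java.util.Arrays;
import java.util.Random;

final class SeededMergeSketch {

    /** Merges two equal-size buffers, picking each slot from a or b at random. */
    static double[] merge(double[] a, double[] b, Random rng) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("buffers must have the same size");
        }
        double[] merged = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            merged[i] = rng.nextBoolean() ? a[i] : b[i];
        }
        return merged;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};
        double[] b = {5, 6, 7, 8};
        // Two generators created with the same seed make the same random choices.
        double[] first = merge(a, b, new Random(42L));
        double[] second = merge(a, b, new Random(42L));
        System.out.println(Arrays.equals(first, second)); // true
    }
}
```

Because the merged buffer depends on the generator's output, two runs over the same stream agree only when their generators start from the same state.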

Note: This implementation is not thread-safe.

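Field Summary

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final double | DEFAULT_EPSILON | Default quantile estimation error setting |

Constructor Summary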

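| Constructor | Description |
| --- | --- |
| RandomPercentile() | Constructs a RandomPercentile with the default quantile estimation error (DEFAULT_EPSILON), using the default PRNG as its source of random data. |
| RandomPercentile(double epsilon) | Constructs a RandomPercentile with quantile estimation error epsilon, using the default PRNG as its source of random data. |
| RandomPercentile(double epsilon, RandomGenerator randomGenerator) | Constructs a RandomPercentile with quantile estimation error epsilon, using randomGenerator as its source of random data. |
| RandomPercentile(RandomGenerator randomGenerator) | Constructs a RandomPercentile with the default estimation error, using randomGenerator as its source of random data. |
| RandomPercentile(RandomPercentile original) | Copy constructor; creates a new RandomPercentile identical to original. |

Method Summary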

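| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | aggregate(RandomPercentile other) | Aggregates the provided instance into this instance. |
| void | clear() | Clears the internal state of the statistic. |
| RandomPercentile | copy() | Returns a copy of the statistic with the same internal state. |
| double | evaluate(double[] values, int begin, int length) | Returns an estimate of the median, computed using the designated array segment as input data. |
| double | evaluate(double percentile, double[] values) | Returns an estimate of the given percentile over the given array. |
| double | evaluate(double percentile, double[] values, int begin, int length) | Returns an estimate of the given percentile, computed using the designated array segment as input data. |
| double | getAggregateN(Collection<RandomPercentile> aggregates) | Returns the total number of values that have been consumed by the aggregates. |
| double | getAggregateQuantileRank(double value, Collection<RandomPercentile> aggregates) | Returns the estimated quantile position of value in the aggregated dataset. |
| double | getAggregateRank(double value, Collection<RandomPercentile> aggregates) | Computes the estimated rank of value in the aggregated dataset. |
| long | getN() | Returns the number of values that have been added. |
| double | getQuantileRank(double value) | Returns the estimated quantile position of value in the dataset. |
| double | getRank(double value) | Gets the estimated rank of value, i.e. the number of consumed values strictly less than value. |
| double | getResult() | Returns an estimate of the median. |
| double | getResult(double percentile) | Returns an estimate of the given percentile. |
| void | increment(double d) | Updates the internal state of the statistic to reflect the addition of the new value. |
| static long | maxValuesRetained(double epsilon) | Returns the maximum number of double values that a RandomPercentile instance created with the given epsilon value will retain in memory. |
| double | reduce(double percentile, Collection<RandomPercentile> aggregates) | Computes the given percentile by combining the data from the collection of aggregates. |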

Methods inherited from class org.hipparchus.stat.descriptive.AbstractStorelessUnivariateStatistic
equals, hashCode, toString

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.hipparchus.stat.descriptive.AggregatableStatistic
aggregate, aggregate

Methods inherited from interface java.util.function.DoubleConsumer
andThen

Methods inherited from interface org.hipparchus.stat.descriptive.StorelessUnivariateStatistic
accept, incrementAll, incrementAll

Methods inherited from interface org.hipparchus.stat.descriptive.UnivariateStatistic
evaluate

Field Details
- DEFAULT_EPSILON
  
  public static final double DEFAULT_EPSILON
  
  Default quantile estimation error setting
  
  See Also:
  
  Constant Field Values
Constructor Details
- RandomPercentile
  
  public RandomPercentile(double epsilon, RandomGenerator randomGenerator)
  
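  Constructs a RandomPercentile with quantile estimation error epsilon, using randomGenerator as its source of random data.
  
  Parameters:
  
  epsilon - bound on quantile estimation error (see class javadoc)
  
  randomGenerator - PRNG used in sampling and merge operations
  
  Throws:
  
  MathIllegalArgumentException - if epsilon is out of range (maxValuesRetained(double) requires epsilon to lie in the interval (0, 1))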
- RandomPercentile
  
  public RandomPercentile(RandomGenerator randomGenerator)
  
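  Constructs a RandomPercentile with the default estimation error, using randomGenerator as its source of random data.
  
  Parameters:
  
  randomGenerator - PRNG used in sampling and merge operations
  
  Throws:
  
  MathIllegalArgumentException - if epsilon is out of range (not expected here, since this constructor uses DEFAULT_EPSILON)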
- RandomPercentile
  
  public RandomPercentile(double epsilon)
  
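  Constructs a RandomPercentile with quantile estimation error epsilon, using the default PRNG as its source of random data.
  
  Parameters:
  
  epsilon - bound on quantile estimation error (see class javadoc)
  
  Throws:
  
  MathIllegalArgumentException - if epsilon is out of range (maxValuesRetained(double) requires epsilon to lie in the interval (0, 1))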
- RandomPercentile
  
  public RandomPercentile()
  
  Constructs a RandomPercentile with the default quantile estimation error (DEFAULT_EPSILON), using the default PRNG as its source of random data.
- RandomPercentile
  
  public RandomPercentile(RandomPercentile original)
  
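  Copy constructor; creates a new RandomPercentile identical to original. Note: the RandomGenerator used by the new instance is referenced, not copied, so the new instance shares a generator with the original.
  
  Parameters:
  
  original - the RandomPercentile instance to copy
Method Details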
- getN
  
  public long getN()
  
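  Description copied from interface: StorelessUnivariateStatistic
  
  Returns the number of values that have been added.
  
  Specified by:
  
  getN in interface StorelessUnivariateStatistic
  
  Returns:
  
  the number of values.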
- evaluate
  
  public double evaluate(double percentile, double[] values, int begin, int length) throws MathIllegalArgumentException
  
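  Returns an estimate of the given percentile, computed using the designated array segment as input data.
  
  Parameters:
  
  percentile - desired percentile (scaled 0 - 100)
  
  values - source of input data
  
  begin - position of the first element of the values array to include
  
  length - number of array elements to include
  
  Returns:
  
  estimated percentile
  
  Throws:
  
  MathIllegalArgumentException - if percentile is out of the range [0, 100]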
- evaluate
  
  public double evaluate(double[] values, int begin, int length)
  
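  Returns an estimate of the median, computed using the designated array segment as input data.
  
  Specified by:
  
  evaluate in interface MathArrays.Function
  
  Specified by:
  
  evaluate in interface StorelessUnivariateStatistic
  
  Specified by:
  
  evaluate in interface UnivariateStatistic
  
  Parameters:
  
  values - source of input data
  
  begin - position of the first element of the values array to include
  
  length - number of array elements to include
  
  Returns:
  
  estimated median
  
  Throws:
  
  MathIllegalArgumentException - if the array segment designated by begin and length is not valid
  
  See Also: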
  
  UnivariateStatistic.evaluate(double[], int, int)
- evaluate
  
  public double evaluate(double percentile, double[] values)
  
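  Returns an estimate of the given percentile over the given array.
  
  Parameters:
  
  percentile - desired percentile (scaled 0 - 100)
  
  values - source of input data
  
  Returns:
  
  estimated percentile
  
  Throws:
  
  MathIllegalArgumentException - if percentile is out of the range [0, 100]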
- copy
  
  public RandomPercentile copy()
  
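  Description copied from class: AbstractStorelessUnivariateStatistic
  
  Returns a copy of the statistic with the same internal state.
  
  Specified by:
  
  copy in interface StorelessUnivariateStatistic
  
  Specified by:
  
  copy in interface UnivariateStatistic
  
  Specified by:
  
  copy in class AbstractStorelessUnivariateStatistic
  
  Returns:
  
  a copy of the statistic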
- clear
  
  public void clear()
  
  Description copied from class: AbstractStorelessUnivariateStatistic
  
  Clears the internal state of the statistic.
  
  Specified by:
  
  clear in interface StorelessUnivariateStatistic
  
  Specified by:
  
  clear in class AbstractStorelessUnivariateStatistic
- getResult
  
  public double getResult()
  
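  Returns an estimate of the median.
  
  Specified by:
  
  getResult in interface StorelessUnivariateStatistic
  
  Specified by:
  
  getResult in class AbstractStorelessUnivariateStatistic
  
  Returns:
  
  the value of the statistic, or Double.NaN if it has been cleared or just instantiated.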
- getResult
  
  public double getResult(double percentile)
  
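  Returns an estimate of the given percentile.
  
  Parameters:
  
  percentile - desired percentile (scaled 0 - 100)
  
  Returns:
  
  estimated percentile
  
  Throws:
  
  MathIllegalArgumentException - if percentile is out of the range [0, 100]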
- getRank
  
  public double getRank(double value)
  
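  Gets the estimated rank of value, i.e. \( |\{x \in X : x < value\}| \), where \(X\) is the set of values that have been consumed from the stream.
  
  Parameters:
  
  value - value whose overall rank is sought
  
  Returns:
  
  estimated number of sample values that are strictly less than value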
- getQuantileRank
  
  public double getQuantileRank(double value)
  
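  Returns the estimated quantile position of value in the dataset. Specifically, what is returned is an estimate of \( |\{x \in X : x < value\}| / |X| \), where \(X\) is the set of values that have been consumed from the stream.
  
  Parameters:
  
  value - value whose quantile rank is sought.
  
  Returns:
  
  estimated proportion of sample values that are strictly less than value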
- increment
  
  public void increment(double d)
  
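  Description copied from class: AbstractStorelessUnivariateStatistic
  
  Updates the internal state of the statistic to reflect the addition of the new value.
  
  Specified by:
  
  increment in interface StorelessUnivariateStatistic
  
  Specified by:
  
  increment in class AbstractStorelessUnivariateStatistic
  
  Parameters:
  
  d - the new value.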
- reduce
  
  public double reduce(double percentile, Collection<RandomPercentile> aggregates)
  
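  Computes the given percentile by combining the data from the collection of aggregates. The result describes the combined sample of all data added to any of the aggregates.
  
  Parameters:
  
  percentile - desired percentile (scaled 0 - 100)
  
  aggregates - RandomPercentile instances to combine data from
  
  Returns:
  
  estimate of the given percentile using combined data from the aggregates
  
  Throws:
  
  MathIllegalArgumentException - if percentile is out of the range [0, 100]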
- getAggregateRank
  
  public double getAggregateRank(double value, Collection<RandomPercentile> aggregates)
  
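  Computes the estimated rank of value in the combined dataset of the aggregates. Sums the values from getRank(double).
  
  Parameters:
  
  value - value whose rank is sought
  
  aggregates - collection to aggregate rank over
  
  Returns:
  
  estimated number of elements in the combined dataset that are less than value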
- getAggregateQuantileRank
  
  public double getAggregateQuantileRank(double value, Collection<RandomPercentile> aggregates)
  
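  Returns the estimated quantile position of value in the combined dataset of the aggregates. Specifically, what is returned is an estimate of \( |\{x \in X : x < value\}| / |X| \), where \(X\) is the set of values that have been consumed from all of the data streams feeding the aggregates.
  
  Parameters:
  
  value - value whose quantile rank is sought.
  
  aggregates - collection of RandomPercentile instances being combined
  
  Returns:
  
  estimated proportion of combined sample values that are strictly less than value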
- getAggregateN
  
  public double getAggregateN(Collection<RandomPercentile> aggregates)
  
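  Returns the total number of values that have been consumed by the aggregates.
  
  Parameters:
  
  aggregates - collection of RandomPercentile instances whose combined sample size is sought
  
  Returns:
  
  total number of values that have been consumed by the aggregates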
- aggregate
  
  public void aggregate(RandomPercentile other) throws NullArgumentException
  
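  Aggregates the provided instance into this instance.
  
  other must have the same buffer size as this instance. If the combined data size exceeds the maximum storage configured for this instance, buffers are merged to create capacity. If all that is needed is computation of aggregate results, reduce(double, Collection) is faster, may be more accurate, and does not require the buffer sizes to be the same.
  
  Specified by:
  
  aggregate in interface AggregatableStatistic<RandomPercentile>
  
  Parameters:
  
  other - the instance to aggregate into this instance
  
  Throws:
  
  NullArgumentException - if the input is null
  
  IllegalArgumentException - if other has a different buffer size than this instance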
- maxValuesRetained
  
  public static long maxValuesRetained(double epsilon)
  
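  Returns the maximum number of double values that a RandomPercentile instance created with the given epsilon value will retain in memory.
  
  If the number of values that have been consumed from the stream is less than 1/2 of this value, reported statistics are exact.
  
  Parameters:
  
  epsilon - bound on the relative quantile error (see class javadoc)
  
  Returns:
  
  upper bound on the total number of primitive double values retained in memory
  
  Throws:
  
  MathIllegalArgumentException - if epsilon is not in the interval (0, 1)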
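
The relationship between the aggregate methods above (a combined rank is the sum of per-stream ranks, and a quantile rank is that sum divided by the combined count) can be sketched on exact in-memory data. This is an illustration with the Java standard library only; AggregateRankSketch and its methods are hypothetical names, and plain arrays stand in for RandomPercentile instances, so the ranks here are exact rather than estimated.

```java
import java.util.Arrays;
import java.util.List;

final class AggregateRankSketch {

    /** Number of values in data strictly less than value. */
    static long rank(double[] data, double value) {
        long count = 0;
        for (double x : data) {
            if (x < value) {
                count++;
            }
        }
        return count;
    }

    /** Rank of value in the combined data: the sum of the per-part ranks. */
    static long aggregateRank(List<double[]> parts, double value) {
        long total = 0;
        for (double[] part : parts) {
            total += rank(part, value);
        }
        return total;
    }

    /** Total number of values across all parts. */
    static long aggregateN(List<double[]> parts) {
        long n = 0;
        for (double[] part : parts) {
            n += part.length;
        }
        return n;
    }

    /** Proportion of combined values strictly less than value. */
    static double aggregateQuantileRank(List<double[]> parts, double value) {
        return (double) aggregateRank(parts, value) / aggregateN(parts);
    }

    public static void main(String[] args) {
        List<double[]> parts = Arrays.asList(
                new double[] {1, 2, 3, 4},
                new double[] {2, 4, 6, 8});
        System.out.println(aggregateRank(parts, 4.0));         // 4 (1, 2, 3 and 2)
        System.out.println(aggregateN(parts));                 // 8
        System.out.println(aggregateQuantileRank(parts, 4.0)); // 0.5
    }
}
```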
