Chapter 1: Introduction and Background

Justin Dunnam

Quantiles are commonly used to summarize the relative location of data within a set according to their magnitude. Quantiles, also known as percentiles, are useful in quality management for summarizing data, setting quality goals, and grading performance, because they show the position of data points relative to the rest of the data set. Comparing populations is an important problem in statistics and is commonly studied by comparing the means of the groups. When comparing populations in a medical context, it is essential to consider not just the means but also other aspects of the distribution, such as percentiles, which can provide deeper insights into patient responses. For example, if a statistical test on time to pain relief indicates that a new drug has a smaller third quartile than the standard treatment, it suggests that patients on the new drug tend to obtain relief sooner: 75% of them obtain relief within a shorter time than is needed for 75% of patients on the standard treatment.

Estimates of quantiles are useful for assessing the percentage of a population that falls below or above a threshold value. For example, a $100(1-\alpha)$% upper confidence limit for the $100p$th percentile of a population is called a $(p,1-\alpha)$ upper tolerance limit. Typically $\alpha$ takes values in $\{0.01,0.05,0.10\}$ and $p$ in $\{0.80,0.90,0.95,0.99\}$. See the article by [owen1968survey], the book by [krishnamoorthy2009statistical] and [young2010tolerance]. One-sided confidence limits for quantiles are referred to as one-sided tolerance limits. For example, if $L(X;p,1-\alpha)$ denotes the $(p,1-\alpha)$ lower tolerance limit for a population, then we infer that at least $100p$% of the population is greater than $L(X;p,1-\alpha)$ with confidence $1-\alpha$. For two-sample problems, several authors have noted that comparison of a particular percentile of two populations arises in many practical problems; for example, see [cox1985testing], [albers1984approximate] and [rudolfer1985large]. Comparison of upper quantiles of two groups arises in a pathological tremors study ([albers1984approximate]), medical diagnosis ([rudolfer1985large]), and the wood industry ([aplin1986moisture] and [huang2006confidence]). [malekzadeh2023simultaneous] have discussed applications of simultaneous confidence intervals (CIs) for the differences of quantiles of several normal distributions in a colorectal cancer study of vitamin D supplement treatment. As highlighted by [li2012comparison], the means of different distributions might be similar, but their tails can differ significantly. Thus, in addition to the standard comparisons of means among groups, comparing other population parameters such as the first quartile (25th percentile) and the third quartile (75th percentile) of groups is also important.
For example, if sample data indicate that the 75th percentile of the duration of treatment A is smaller than that of treatment B, then we infer that 75% of patients on treatment A complete their treatment within a shorter time than is needed for 75% of patients on treatment B. Despite the importance and practical applications of this problem, little work has been done on inference for comparing two or more quantiles.
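The $(p,1-\alpha)$ upper tolerance limit described above can be illustrated for a single normal sample. For normal data the limit has the form $\bar{x}+ks$, where $k\sqrt{n}$ is the $1-\alpha$ quantile of a noncentral $t$ distribution with $n-1$ df and noncentrality parameter $z_{p}\sqrt{n}$. The following is a minimal sketch only: the function names are hypothetical, and the noncentral $t$ quantile is approximated by simulation because the Python standard library has no noncentral $t$ routine.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def tolerance_factor_mc(n, p, conf, draws=200_000, seed=1):
    """Monte Carlo estimate of k such that P(T <= k*sqrt(n)) = conf,
    where T is noncentral t with n-1 df and noncentrality z_p*sqrt(n).
    (Illustrative helper, not taken from the source text.)"""
    rng = random.Random(seed)
    df = n - 1
    delta = NormalDist().inv_cdf(p) * math.sqrt(n)
    t_draws = []
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        # Gamma(shape=df/2, scale=2) is a chi-square variate with df degrees of freedom.
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        t_draws.append((z + delta) / math.sqrt(chi2 / df))
    t_draws.sort()
    idx = min(draws - 1, int(conf * draws))
    return t_draws[idx] / math.sqrt(n)


def upper_tolerance_limit(sample, p=0.90, conf=0.95):
    """Approximate (p, conf) upper tolerance limit for a normal sample."""
    k = tolerance_factor_mc(len(sample), p, conf)
    return mean(sample) + k * stdev(sample)
```

For a list of observations `x`, `upper_tolerance_limit(x, p=0.90, conf=0.95)` returns an approximate 95% upper confidence limit for the 90th percentile. The simulation error shrinks as `draws` increases.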

If a test for equality of quantiles rejects the null hypothesis, it becomes important to identify which differences led to the rejection. In such cases, simultaneous confidence intervals are constructed for the pairwise differences, and any interval that does not include zero indicates a significant difference between the corresponding quantiles. Conversely, if the test does not reject the null hypothesis, the focus shifts to estimating the common quantile shared by all populations under study, a problem that falls within the broader scope of meta-analysis, which concerns the estimation of a common parameter across multiple populations.

Meta-analysis is a statistical technique used to combine results from multiple studies addressing the same hypothesis. It is particularly valuable when individual studies yield inconclusive results or when full data sets are unavailable due to privacy or proprietary constraints. By aggregating evidence, meta-analysis improves statistical power while controlling the type I error rate, making it essential in many fields, such as genetics and systems biology. In hypothesis testing, the results of studies are often summarized as test statistics and associated p-values. When combining multiple p-values from independent studies, the goal is to enhance the overall statistical power. There are various methods for combining independent p-values, with contributions dating back to the 1930s. Some of the most commonly used methods include Fisher’s method, Stouffer’s method and Tippett’s method. All these methods combine the p-values of the individual independent tests so that the combined p-values follow a distribution that does not depend on any parameter.

1. Review of Literature

The problem of estimating/testing the difference between percentiles of two independent normal distributions has received some attention in the literature. [bristol1990distribution] has provided a nonparametric approach to compare two quantiles. [cox1985testing] have proposed a large sample test for the equality of two normal percentiles. [guo2005comparison] have proposed a fiducial test and CIs for the difference between two normal quantiles. An exact method of computing CIs for the ratio of normal percentiles is proposed in [huang2006confidence]. However, their CIs are valid only when the variances are equal; they are not in closed form, and an iterative method is required to find them. Recently, [krishnamoorthy2024conf_sap] have proposed approximate closed-form CIs for the ratio of percentiles for the normal, exponential and Weibull cases. Their CIs for the ratio of percentiles involving these distributions are accurate and easy to compute.

The paper by [li2012comparison] seems to be the first to consider the problem of testing equality of quantiles of several normal populations. These authors have provided a test based on the generalized variable (fiducial) approach. This test seems to be satisfactory, but appears to be conservative. [abdollahnezhad2018testing] developed the likelihood ratio test (LRT) and a modified version of the LRT. This test is simple and easy to use. [malekzadeh2023simultaneous] have proposed simultaneous CIs for quantile differences of several normal populations. Some of their methods are numerically involved, but the other methods, based on the bootstrap approach and the fiducial approach, are conceptually simple and quite satisfactory.

As noted earlier, if a test for equality of quantiles of several normal populations does not reject the null hypothesis, then it is desired to estimate the common quantile of all the populations under study. Estimation of such a common parameter of several populations is a special research topic in the area of meta-analysis, where results from independent studies, often summarized as test statistics and associated p-values, are combined to enhance the overall statistical power. There are various methods for combining independent p-values, with contributions dating back to the 1930s. Fisher’s ([fisher1930statistical]) method combines log-transformed p-values, so that the combined statistic follows a $\chi^{2}$ distribution with degrees of freedom (df) equal to twice the number of independent samples. Stouffer’s ([stouffer1949american]) approach combines p-values using the sum of the z-scores derived from the individual p-values. [liptak1958combination] has enhanced Stouffer’s approach by proposing a weighted sum of the z-scores. Tippett’s ([tippett1931methods]) method takes the minimum p-value from a set of independent tests. [koziol1978combining] have obtained a combined test based on independent chi-squared scores. Recent studies by [krishnamoorthy2024combining] and [krishnamoorthy2024confidence] indicated that there is no clear-cut winner among the combined tests proposed in the literature. In general, Liptak’s test based on a weighted sum of z-scores and the combined test based on the chi-squared scores exhibit better power properties than other combined tests.
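The combining rules named above can be sketched in a few lines. The code below is an illustration under stated assumptions, not an implementation from the cited papers: the function names are hypothetical, all p-values are assumed to be independent and strictly between 0 and 1, and only the Python standard library is used. With $k$ p-values, Fisher's statistic $-2\sum\ln p_{i}$ is referred to a $\chi^{2}$ distribution with $2k$ df, whose upper tail has a closed form because the df is even.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    k = df // 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= (x / 2.0) / j
        total += term
    return math.exp(-x / 2.0) * total


def fisher_combined(pvals):
    """Fisher: -2 * sum(log p_i) is chi-square with 2k df under the null."""
    stat = -2.0 * sum(math.log(p) for p in pvals)
    return chi2_sf_even_df(stat, 2 * len(pvals))


def stouffer_combined(pvals, weights=None):
    """Stouffer: sum of z-scores; with weights this is Liptak's weighted version."""
    if weights is None:
        weights = [1.0] * len(pvals)
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in pvals]
    stat = sum(w * z for w, z in zip(weights, z_scores))
    stat /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(stat)


def tippett_combined(pvals):
    """Tippett: based on the minimum of k independent uniform p-values."""
    return 1.0 - (1.0 - min(pvals)) ** len(pvals)
```

Each function returns a single combined p-value. A common choice of weights in `stouffer_combined` is the square root of each sample size, although that choice is an assumption here and not one stated in the text.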

Even though several papers have addressed the problem of testing a common parameter of interest, only a few have considered interval estimation of a common parameter in various setups. [fairweather1972method] and [jordan1996exact] have provided CIs for a common mean of several normal populations based on pivotal-based approaches. Tian and her co-authors have proposed CIs for a common correlation coefficient, a common coefficient of variation (CV) and a common mean of several lognormal populations; see [tian2005inferences], [tian2007inferences] and [tian2008confidence]. Most of the CIs proposed in the literature for a common parameter of interest are approximate and not simple to compute. Recently, [krishnamoorthy2024combining] have proposed a numerical method for inverting a combined test to find a CI for a parameter of interest. They applied the approach to find CIs for a common mean of several normal populations, a common coefficient of variation of normal populations and a common correlation coefficient of several bivariate normal populations. To the best of our knowledge, the problems of testing or interval estimation of a common quantile of several populations have never been addressed in the literature.

2. Research Problems

In many practical situations, it is desired to compare the quantiles of two populations. Comparison of quantiles can be made using a hypothesis test or a CI for the difference or the ratio of population quantiles. In some clinical studies, it is of interest to compare the quantiles of several experimental results or populations. If a test for comparison of several quantiles indicates that the population quantiles are not significantly different, then a problem of interest is to estimate or test the common quantile of all the populations under study. In this dissertation, we address the following research problems.

(1) Hypothesis tests and CIs for comparing quantiles of two normal populations.

(2) Developing some improved tests for comparing quantiles of several normal populations. Construction of simultaneous CIs for quantile differences of several normal populations.

(3) Construction of CIs for a common quantile of several normal populations.

3. Organization of the Dissertation

Two-Sample Problem: In Chapter 2, we address the problems of interval estimation and hypothesis testing in one- and two-sample problems. In the one-sample problem, a well-known CI for a quantile of a normal distribution is the classical CI based on the noncentral $t$ (NCT) distribution. [chakraborti2007confidence] have proposed a CI for a normal quantile which is based on a minimum variance unbiased estimate (MVUE) of the population quantile. Some authors presumed that the Chakraborti-Li CI is different from or better than the classical NCT CI, and proposed methods of computing critical values to find the Chakraborti-Li CI; see the articles by [liu2013simultaneous], [zhang2018confidence], [shieh2020comparison], and [malekzadeh2023simultaneous]. We show in this chapter that the classical CI and the Chakraborti-Li CI are the same. Furthermore, we propose a simple approximate CI for a normal quantile based on a normal approximation to the NCT distribution. This approximate CI is straightforward to compute and is quite comparable with the classical NCT CI.
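For reference, the classical NCT interval mentioned above can be written out as follows. The notation is assumed here for illustration and is not taken from the later chapters: $\bar{X}$ and $S$ denote the mean and standard deviation of a sample of size $n$ from $N(\mu,\sigma^{2})$, $z_{p}$ is the $p$th quantile of the standard normal distribution, $q_{p}=\mu+z_{p}\sigma$ is the population quantile, and $t_{m;\gamma}(\delta)$ is the $\gamma$ quantile of the noncentral $t$ distribution with $m$ df and noncentrality parameter $\delta$. Because $\sqrt{n}(\mu-\bar{X})/\sigma$ is standard normal and independent of $S$,

$$
\frac{\sqrt{n}\,(q_{p}-\bar{X})}{S}
=\frac{\sqrt{n}\,(\mu-\bar{X})/\sigma+z_{p}\sqrt{n}}{S/\sigma}
\sim t_{n-1}(z_{p}\sqrt{n}),
$$

so that a $100(1-\alpha)$% CI for $q_{p}$ is

$$
\left(\bar{X}+t_{n-1;\alpha/2}(z_{p}\sqrt{n})\,\frac{S}{\sqrt{n}},\;\;
\bar{X}+t_{n-1;1-\alpha/2}(z_{p}\sqrt{n})\,\frac{S}{\sqrt{n}}\right).
$$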

We also use the fiducial approach to find CIs for the difference/ratio of two normal percentiles. We describe the fiducial distribution for a quantile based on the NCT distribution and another one based on the normal approximation to the NCT distribution. Using these fiducial distributions, we develop simple closed-form fiducial CIs for a ratio/difference of quantiles of two normal populations. The proposed approach can be readily extended to find CIs for a ratio of two lognormal quantiles or for a ratio of two gamma quantiles. Coverage, precision and comparison studies are carried out using Monte Carlo simulation. Three examples, involving normal, gamma and lognormal distributions, are worked out in the example section of Chapter 2.

Multi-Sample Problem: In Chapter 3, we consider the problem of testing the equality of quantiles of several normal populations. We first describe the generalized variable test (GVT) by [li2012comparison]. We have enhanced the GVT by deriving theoretical expressions for some quantities. These closed-form expressions are easy to compute and thereby avoid the additional simulation used in [li2012comparison]. Then we describe the modified likelihood ratio test (MLRT) proposed in [abdollahnezhad2018testing] and a new MLRT. We also outline a parametric bootstrap (PB) approach for testing the equality of the normal quantiles. To further investigate the nature of differences among populations when the null hypothesis is rejected, it is often necessary to conduct pairwise comparisons of quantiles. Here we propose Bonferroni CIs based on an accurate two-sample CI given in Chapter 2. We evaluate and compare the tests in terms of type I error rates and powers. Furthermore, we show that the pairwise fiducial CIs and parametric bootstrap CIs based on the classical noncentral pivotal quantity for a quantile and those proposed in [malekzadeh2023simultaneous] are essentially the same. These pairwise CIs are evaluated and compared using Monte Carlo simulation. We illustrate the tests and simultaneous CIs using two examples with real data.

Common Quantile: Even though several papers have addressed the problem of combining independent tests for various purposes, only a few have considered interval estimation of a common parameter of interest in various setups. It should be noted here that a CI for a parameter provides more information than the result of a hypothesis test. In Chapter 4, we consider the problem of estimating a common quantile of several normal distributions. In particular, we address interval estimation of a common quantile of several normal populations by inverting a combined test for the common quantile. We outline a one-sample test for a quantile based on the NCT distribution and an approximate test based on a normal approximation to the NCT distribution. We describe some combined tests obtained by combining the individual tests. In particular, we describe Fisher’s test, the inverse normal test and the inverse $\chi^{2}$ test. An algorithm is provided to obtain CIs by inverting the combined tests. In addition, we propose a fiducial approach to find a simple closed-form CI for a common quantile. We also present simulation studies of coverage probabilities and interval precision, followed by two practical examples illustrating the proposed methods.
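The idea of obtaining a CI by inverting a combined test can be sketched as follows. This is an illustration only and is not the algorithm of the dissertation: in place of the NCT-based one-sample test, the sketch swaps in a simple large-sample (Wald-type) approximation, $\hat{q}=\bar{x}+z_{p}s$ with standard error $s\sqrt{1/n+z_{p}^{2}/(2(n-1))}$, so that only the Python standard library is needed. One-sided p-values from each sample are combined by Fisher's method, and each confidence limit is the candidate value at which the combined p-value equals $\alpha/2$, found by bisection. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def _norm_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def _fisher_p(pvals):
    """Fisher combined p-value (closed-form chi-square tail, df = 2k)."""
    half = -sum(math.log(p) for p in pvals)  # half of the Fisher statistic
    term, total = 1.0, 1.0
    for j in range(1, len(pvals)):
        term *= half / j
        total += term
    return math.exp(-half) * total


def common_quantile_ci(samples, p=0.90, alpha=0.05):
    """Approximate CI for a common p-th quantile of several normal samples,
    obtained by inverting a Fisher combined test (Wald-type approximation)."""
    z_p = NormalDist().inv_cdf(p)
    est = []
    for x in samples:
        n, s = len(x), stdev(x)
        se = s * math.sqrt(1.0 / n + z_p ** 2 / (2.0 * (n - 1)))
        est.append((mean(x) + z_p * s, se))

    def combined_p(q0, upper_tail):
        pvals = []
        for q_hat, se in est:
            z = (q_hat - q0) / se
            p_i = _norm_sf(z) if upper_tail else _norm_sf(-z)
            pvals.append(max(p_i, 1e-300))  # guard against log(0)
        return _fisher_p(pvals)

    max_se = max(se for _, se in est)
    lo = min(q for q, _ in est) - 8.0 * max_se
    hi = max(q for q, _ in est) + 8.0 * max_se

    def solve(upper_tail):
        # The upper-tail combined p-value increases with q0 and the
        # lower-tail one decreases, so bisection finds where it equals alpha/2.
        a, b = lo, hi
        for _ in range(100):
            mid = (a + b) / 2.0
            if (combined_p(mid, upper_tail) < alpha / 2.0) == upper_tail:
                a = mid
            else:
                b = mid
        return (a + b) / 2.0

    return solve(True), solve(False)
```

Calling `common_quantile_ci([sample_1, sample_2], p=0.90, alpha=0.05)` returns an approximate lower and upper limit. Replacing the Wald-type p-values with exact NCT-based p-values would leave the inversion step unchanged.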

In Chapter 5, we provide some concluding remarks and outline future research on multiple comparison of population quantiles. This problem arises when a test for equality of quantiles rejects the null hypothesis. In such situations, it is of interest to find the quantiles that are significantly different.