U.S. Department of Energy
Office of Scientific and Technical Information

Distributions of the Mean for Lognormally Distributed Data - 19427

Conference · OSTI ID: 23005326
  1. Neptune and Company, Inc. (United States)
In environmental sciences, a distribution of the mean is often needed for right-skewed data. For example, risk assessments often rely on distributions of the mean to obtain upper confidence limits (UCLs) for analyte concentrations. In probabilistic performance assessment (PPA) models, parameters are given distributions rather than deterministic values, and these are usually distributions of the mean rather than population distributions, to account for the large spatial and temporal domains to which the parameters are applied. How this distribution of the mean is developed depends on the characteristics of the parameter's full distribution and on how the parameter is used. In this presentation, I will discuss the implications of a right skew in distribution development.

Parameters may vary over orders of magnitude and/or may be right-skewed. The material property of saturated hydraulic conductivity, Ks, is a prime example of such a parameter. Right-skewed parameters are often assumed to follow lognormal distributions. If X follows a lognormal distribution, then log(X) follows a normal distribution. For some right-skewed parameters, practitioners go back and forth between the two forms of the data depending on their needs. For example, the untransformed X may be preferred in a model or equation, while the parameter values are easier to visualize on a logarithmic scale. However, distribution development for the average value of the parameter will differ depending on whether X or log(X) is modeled. This difference is due to Jensen's Inequality, a theorem which states that the expected value of a concave function of a variable is less than or equal to the same function of the expected value of the variable; the logarithm is a concave function. In other words, the mean of log(X) is always less than or equal to the log of the mean of X, and therefore the mean estimated from a distribution based on log(X), once back-transformed to the original scale, is less than or equal to the mean estimated from a distribution based on X. If a parameter is naturally considered on a log scale by subject matter experts, then distributions developed for the mean of log(X) may be more consistent with the use of the parameter in the literature than distributions developed for X.

The statistical decisions of distribution development will be discussed within the context of right-skewed, lognormal data. These issues include method-of-moments versus maximum likelihood estimation of the distribution parameters and the use of bootstrapping to aid in the visualization of a distribution of the mean. The details surrounding Jensen's Inequality will be explained and visualized within the context of distribution development of a right-skewed parameter such as saturated hydraulic conductivity. (authors)
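The effect of Jensen's Inequality on a distribution of the mean can be illustrated with a short numerical sketch. The Python code below is not the author's method; it is a minimal example under stated assumptions. The simulated Ks values, the lognormal parameters (mu = -9.0, sigma = 1.5), the sample size, and the function names `lognormal_mean_estimates` and `bootstrap_means` are all hypothetical choices made for illustration. It compares the sample mean of X (which is also the method-of-moments estimate of the lognormal mean), the back-transformed mean of log(X), and the maximum likelihood estimate exp(mu + sigma^2/2), then bootstraps the mean on both scales.

```python
import numpy as np


def lognormal_mean_estimates(x):
    """Return three estimates related to the mean of lognormal-like data."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    mu_hat = logx.mean()           # maximum likelihood estimate of mu
    sigma2_hat = logx.var(ddof=0)  # maximum likelihood estimate of sigma^2
    return {
        # Sample mean of X; the method-of-moments lognormal fit reproduces it.
        "sample_mean": x.mean(),
        # Back-transformed mean of log(X), i.e. the geometric mean of X.
        "geometric_mean": np.exp(mu_hat),
        # Lognormal maximum likelihood estimate of E[X].
        "mle_mean": np.exp(mu_hat + sigma2_hat / 2.0),
    }


def bootstrap_means(x, n_boot=2000, seed=0):
    """Bootstrap the mean of X and the back-transformed mean of log(X)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Each row of idx is one resample of the data, drawn with replacement.
    idx = rng.integers(0, x.size, size=(n_boot, x.size))
    resamples = x[idx]
    arith = resamples.mean(axis=1)
    geom = np.exp(np.log(resamples).mean(axis=1))
    return arith, geom


# Hypothetical data: 50 simulated Ks values (assumed parameters, not from the source).
rng = np.random.default_rng(42)
ks = rng.lognormal(mean=-9.0, sigma=1.5, size=50)

estimates = lognormal_mean_estimates(ks)
arith_means, geom_means = bootstrap_means(ks)

# A simple percentile-bootstrap 95% upper confidence limit on each scale.
ucl95_arith = np.percentile(arith_means, 95)
ucl95_geom = np.percentile(geom_means, 95)

for name, value in estimates.items():
    print(f"{name}: {value:.3e}")
print(f"95% UCL, mean of X: {ucl95_arith:.3e}")
print(f"95% UCL, exp(mean of log X): {ucl95_geom:.3e}")
```

In every bootstrap resample the back-transformed mean of log(X) is no larger than the mean of X, so the distribution of the mean developed on the log scale sits at or below the one developed on the original scale. The percentile bootstrap used here is only one of several ways to form a UCL and is chosen for simplicity.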
Research Organization:
WM Symposia, Inc., PO Box 27646, 85285-7646 Tempe, AZ (United States)
Report Number(s):
INIS-US--21-WM-19427
Country of Publication:
United States
Language:
English