Variational Non-Bayesian Inference of the Probability Density Function: Organization and Notation

19 Apr 2024

This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.


(1) U Jin Choi, Department of mathematical science, Korea Advanced Institute of Science and Technology;

(2) Kyung Soo Rim, Department of mathematics, Sogang University.

2. Organization and notation.

This article is organized as follows: In the next section, we formulate the problem by defining the energy and partition function for the hidden PDF in a generalized Wiener algebra. We observe that the nonlinearity of energy functions poses challenges in maximizing likelihood functions. To overcome this issue, we adopt the KL-divergence between two distribution functions.
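For orientation, the standard textbook forms of these objects are recalled below. This is only a sketch under assumptions: the symbols E, Z, p, and q are illustrative, and the precise definitions, sign conventions, and domain of integration used in this paper are those given in the next section, which may differ.

```latex
p(x) = \frac{e^{-E(x)}}{Z}, \qquad
Z = \int e^{-E(x)}\,dx, \qquad
D_{\mathrm{KL}}(p \,\|\, q) = \int p(x)\,\log\frac{p(x)}{q(x)}\,dx \;\ge\; 0
```

In this standard setting, the KL-divergence vanishes if and only if p = q almost everywhere, which is why a KL-divergence of 0 identifies the hidden PDF.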

In Section 4, we embed energy functions into a generalized Wiener algebra, thereby transforming the nonlinearity of the energy function into a system of equations in the function space and simultaneously converting the min-max problem into a problem of solving equations. We show the Fréchet differentiability of the KL-divergence with respect to coefficients in the generalized Wiener algebra. From this property, we then obtain the existence and uniqueness of a solution whose energy function yields a KL-divergence of 0.

In Section 5, we characterize the coefficients of the energy function by relating them to the Fourier coefficients in the (classical) Wiener algebra. This serves as an example of the generalized Wiener algebra.

In the concluding Section 6, we consider a random sample that consists of realizations of a stationary ergodic process and formulate a system of polynomial series with the coefficients of the energy function as variables. This system consists of infinitely many equations in infinitely many variables. By the convergence property of the partial sums of an energy function established in Section 4, the truncated system, which consists of a finite number of equations with partial sums, offers an avenue for approximating the coefficients. The solutions of the truncated system serve as estimators for the coefficients of the energy function, and the hidden PDF can in turn be approximated using these estimators. Furthermore, we provide a numerical example that illustrates the approximation of the PDF, as well as the calculation of the mean and variance from the PDF, using a random sample generated from a bivariate normal distribution.
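As a point of reference for the numerical example, the sketch below generates a random sample from a bivariate normal distribution and computes plain sample moments. It does not implement the energy-function estimator of this paper; it only produces the kind of baseline against which a mean and variance computed from an approximated PDF could be compared. The parameter values, the seed, the sample size, and the function name `sample_moments` are hypothetical choices made for illustration.

```python
import numpy as np


def sample_moments(sample):
    """Return the sample mean vector and the unbiased sample covariance matrix.

    `sample` is an array of shape (n, d) whose rows are observations.
    """
    mean = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)  # rows are observations, columns are variables
    return mean, cov


# Hypothetical parameters of the bivariate normal distribution (not taken from the paper).
true_mean = np.array([0.0, 0.0])
true_cov = np.array([[1.0, 0.5],
                     [0.5, 2.0]])

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
sample = rng.multivariate_normal(true_mean, true_cov, size=20000)

est_mean, est_cov = sample_moments(sample)
print("sample mean:", est_mean)
print("sample covariance:")
print(est_cov)
```

With a sample of this size, the sample moments should lie close to the chosen parameters, so they give a simple sanity check for any PDF-based estimate of the same quantities.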

In this paper, we consistently employ the following general notations (Table 1). Some notations will be elaborated upon in more detail when they are introduced.