A good representation is one in which the data has a distribution that is easy to model

Basic Mathematics

The Problem Generative Models Solve

Given two sets of data $z$ and $x$, where $z$ follows a known, simple prior distribution $\pi(z)$ (usually a Gaussian) and $x$ follows a complex distribution $p(x)$, namely the distribution represented by the training data, we want to find a transformation function $f$ that establishes a mapping from $z$ to $x$, so that every point sampled from $\pi(z)$ corresponds to a (new) sample point under $p(x)$.

$$p_g(x) = \int_z p(x|z)\,p(z)\,dz$$

where $p(x|z)$ is the probability of $x$ given $z$.
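
A minimal sketch of what this integral means numerically (not from the original text): estimate $p_g(x)$ by Monte Carlo, drawing $z \sim \pi(z)$ and averaging $p(x|z)$. Here a toy Gaussian "decoder" $p(x|z) = \mathcal{N}(x;\, f(z), \sigma^2 I)$ and a hypothetical linear map $f$ are assumed purely for illustration.

```python
import numpy as np

def monte_carlo_marginal(x, f, n_samples=10_000, sigma=0.1, latent_dim=2):
    """Estimate p_g(x) = E_{z ~ N(0, I)}[p(x|z)] by Monte Carlo.

    Assumes (for illustration only) a Gaussian decoder
    p(x|z) = N(x; f(z), sigma^2 I); `f` is any deterministic map
    from latent space to data space.
    """
    z = np.random.randn(n_samples, latent_dim)   # z ~ pi(z) = N(0, I)
    x_mean = f(z)                                 # decoder mean f(z), shape (n, d)
    d = x.shape[-1]
    sq_dist = np.sum((x - x_mean) ** 2, axis=-1)
    log_px_given_z = -0.5 * (sq_dist / sigma**2 + d * np.log(2 * np.pi * sigma**2))
    # log-mean-exp for numerical stability
    m = log_px_given_z.max()
    return np.exp(m) * np.mean(np.exp(log_px_given_z - m))

# Toy usage with a hypothetical linear "generator" f(z) = z @ W + b
W = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]])
b = np.array([0.1, -0.2, 0.3])
f = lambda z: z @ W + b
print(monte_carlo_marginal(np.array([0.1, 0.0, 0.4]), f))
```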

Transforming a Probability Distribution: A Worked Example

Suppose the random variable $X$ has probability density $f_X(x)$.

Find the probability density of the random variable $Y = 2X + 8$.
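
A sketch of the standard change-of-variables solution (assuming only that $X$ has density $f_X$): since $y = 2x + 8$ is strictly increasing, invert it and multiply by the absolute derivative of the inverse,

$$
x = \frac{y-8}{2}, \qquad \left|\frac{dx}{dy}\right| = \frac{1}{2}, \qquad
f_Y(y) = f_X\!\left(\frac{y-8}{2}\right)\cdot\frac{1}{2}.
$$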

Jacobian Matrix

$$
\left[
\begin{array}{ccc}
\frac{\partial f_1 }{\partial x_1 } & \cdots & \frac{\partial f_1 }{\partial x_n } \\
\vdots\quad & \ddots & \vdots\quad \\
\frac{\partial f_n }{\partial x_1 } & \cdots & \frac{\partial f_n }{\partial x_n } \\
\end{array}
\right]
$$

$$J_{ij}=\frac{\partial f_i }{\partial x_j }$$

Determinant
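
The determinant enters through the multivariate change-of-variables formula (a standard result, stated here for context): if $x = f(z)$ with $f$ invertible and differentiable, then

$$
p_X(x) = p_Z\big(f^{-1}(x)\big)\,\left|\det\frac{\partial f^{-1}(x)}{\partial x}\right|
       = p_Z(z)\,\left|\det J_f(z)\right|^{-1}.
$$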

Coupling Layer
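
A coupling layer splits the input into two parts, leaves one part unchanged, and transforms the other part conditioned on the first. The sketch below (my own minimal numpy version, with an arbitrary placeholder function `m` standing in for the learned network) follows the additive coupling used in NICE: its Jacobian is triangular with ones on the diagonal, so the log-determinant is zero and inversion is trivial.

```python
import numpy as np

class AdditiveCoupling:
    """NICE-style additive coupling: y1 = x1, y2 = x2 + m(x1).

    `m` can be any function (e.g. a small MLP); it never needs to be inverted.
    The Jacobian is triangular with unit diagonal, so log|det J| = 0
    (volume preserving).
    """
    def __init__(self, m):
        self.m = m

    def forward(self, x):
        d = x.shape[-1] // 2
        x1, x2 = x[..., :d], x[..., d:]
        y2 = x2 + self.m(x1)
        return np.concatenate([x1, y2], axis=-1)

    def inverse(self, y):
        d = y.shape[-1] // 2
        y1, y2 = y[..., :d], y[..., d:]
        x2 = y2 - self.m(y1)
        return np.concatenate([y1, x2], axis=-1)

# Toy check with a hypothetical "network": m(x1) = 2 * tanh(x1)
layer = AdditiveCoupling(m=lambda h: 2.0 * np.tanh(h))
x = np.random.randn(4, 6)
assert np.allclose(layer.inverse(layer.forward(x)), x)
```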

NICE

We propose a deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model.

For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transformed data conform to a factorized distribution, i.e., resulting in independent latent variables. We parametrize this transformation so that computing the determinant of the Jacobian and inverse Jacobian is trivial, yet we maintain the ability to learn complex non-linear transformations, via a composition of simple building blocks, each based on a deep neural network. The training criterion is simply the exact log-likelihood, which is tractable.

Unbiased ancestral sampling is also easy. We show that this approach yields good generative models on four image datasets and can be used for inpainting.

RealNVP

Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task.

We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful, stably invertible, and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact and efficient sampling, exact and efficient inference of latent variables, and an interpretable latent space.

We demonstrate its ability to model natural images on four datasets through sampling, log-likelihood evaluation, and latent variable manipulations.

“The advantage of Real NVP compared to MAF and IAF is that it can both generate data and estimate densities with one forward pass only, whereas MAF would need D passes to generate data and IAF would need D passes to estimate densities.”
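
For reference, a minimal sketch (my own, with placeholder scale and translation functions `s` and `t` standing in for the learned networks) of the affine coupling used by Real NVP: $y_2 = x_2 \odot \exp(s(x_1)) + t(x_1)$, whose triangular Jacobian gives $\log|\det J| = \sum_j s(x_1)_j$. Both directions need only one pass through `s` and `t`, which is the single-pass property quoted above.

```python
import numpy as np

def affine_coupling_forward(x, s, t):
    """Real-NVP-style affine coupling: y1 = x1, y2 = x2 * exp(s(x1)) + t(x1).

    Returns (y, log_det), where log_det sums s(x1) over the transformed dims.
    """
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    log_scale = s(x1)
    y2 = x2 * np.exp(log_scale) + t(x1)
    y = np.concatenate([x1, y2], axis=-1)
    return y, log_scale.sum(axis=-1)

def affine_coupling_inverse(y, s, t):
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    x2 = (y2 - t(y1)) * np.exp(-s(y1))
    return np.concatenate([y1, x2], axis=-1)

# Toy check with hypothetical scale/translation "nets"
s = lambda h: 0.5 * np.tanh(h)
t = lambda h: h ** 2
x = np.random.randn(3, 4)
y, log_det = affine_coupling_forward(x, s, t)
assert np.allclose(affine_coupling_inverse(y, s, t), x)
```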

Glow

Flow-based generative models (Dinh et al., 2014) are conceptually attractive due to tractability of the exact log-likelihood, tractability of exact latent-variable inference, and parallelizability of both training and synthesis. 

In this paper we propose Glow, a simple type of generative flow using an invertible $1 \times 1$ convolution. Using our method we demonstrate a significant improvement in log-likelihood on standard benchmarks. Perhaps most strikingly, we demonstrate that a generative model optimized towards the plain log-likelihood objective is capable of efficient realistic-looking synthesis and manipulation of large images.
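
An invertible $1 \times 1$ convolution is simply a learned $c \times c$ matrix applied across the channel dimension at every spatial position, contributing $h \cdot w \cdot \log|\det W|$ to the log-determinant. The sketch below is a minimal numpy illustration of that idea (not the paper's LU-decomposed implementation); the random-rotation initialization is just a convenient way to get an invertible matrix.

```python
import numpy as np

def invertible_1x1_conv(x, W):
    """Apply a 1x1 convolution with weight matrix W (c x c) to x of shape (h, w, c).

    Every spatial position is multiplied by the same invertible matrix W,
    so log|det J| = h * w * log|det W|.
    """
    h, w, c = x.shape
    y = x.reshape(-1, c) @ W.T
    log_det = h * w * np.log(np.abs(np.linalg.det(W)))
    return y.reshape(h, w, c), log_det

def invertible_1x1_conv_inverse(y, W):
    h, w, c = y.shape
    x = y.reshape(-1, c) @ np.linalg.inv(W).T
    return x.reshape(h, w, c)

# Toy check: a random rotation matrix is a convenient invertible choice for W
c = 3
W, _ = np.linalg.qr(np.random.randn(c, c))
x = np.random.randn(8, 8, c)
y, log_det = invertible_1x1_conv(x, W)
assert np.allclose(invertible_1x1_conv_inverse(y, W), x)
```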

Normalizing Flows for Probabilistic Modeling and Inference

Normalizing flows provide a general mechanism for defining expressive probability distributions, only requiring the specification of a (usually simple) base distribution and a series of bijective transformations.
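
Concretely (a standard statement, added here for context): if $x = f_K \circ \cdots \circ f_1(z)$ with each $f_k$ bijective, $z_0 = z$ and $z_k = f_k(z_{k-1})$, then

$$
\log p_X(x) = \log p_Z(z) - \sum_{k=1}^{K} \log\left|\det\frac{\partial f_k(z_{k-1})}{\partial z_{k-1}}\right|.
$$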

There has been much recent work on normalizing flows, ranging from improving their expressive power to expanding their application. We believe the field has now matured and is in need of a unified perspective.

In this review, we attempt to provide such a perspective by describing flows through the lens of probabilistic modeling and inference.

We place special emphasis on the fundamental principles of flow design, and discuss foundational topics such as expressive power and computational trade-offs. We also broaden the conceptual framing of flows by relating them to more general probability transformations. Lastly, we summarize the use of flows for tasks such as generative modeling, approximate inference, and supervised learning.

Related Papers

NICE: Non-linear Independent Components Estimation, Dinh et al., 2014

Variational Inference with Normalizing Flows, Rezende and Mohamed, 2015

Density estimation using Real NVP, Dinh et al., May 2016

Improved Variational Inference with Inverse Autoregressive Flow, Kingma et al., June 2016

Masked Autoregressive Flow for Density Estimation, Papamakarios et al., May 2017

Glow: Generative Flow with Invertible 1x1 Convolutions, Kingma and Dhariwal, July 2018

Normalizing Flows for Probabilistic Modeling and Inference, Papamakarios et al., 2019

References