Optimal estimation

The core expression of optimal estimation in ARTS follows from Rodgers [27]. He formulates the core expression of a measurement as

\[\vec{y} = F\left(\vec{x}\right) + \epsilon,\]

where \(\vec{y}\) is a measurement vector (i.e., measurement_vec in the ARTS workspace), \(\vec{x}\) is the state of the model (i.e., model_state_vec in the ARTS workspace), \(F\) is the model (i.e., ARTS itself), and \(\epsilon\) is some measurement error that cannot be modelled.

The forward model result from this is just

\[\vec{y}_f = F\left(\vec{x}\right).\]

The goal of the optimal estimation is to find an \(\vec{x}\) that minimizes some cost-function, which in ARTS is defined as

\[\chi^2 = \chi^2_y + \chi^2_x,\]

where

\[ \begin{align}\begin{aligned}\chi^2_y = \frac{1}{m} \left(\vec{y}-\vec{y}_f\right)^\top \mathbf{S}_\epsilon^{-1} \left(\vec{y}-\vec{y}_f\right).\\\chi^2_x = \frac{1}{m} \left(\vec{x}-\vec{x}_a\right)^\top \mathbf{S}_a^{-1} \left(\vec{x}-\vec{x}_a\right),\end{aligned}\end{align} \]

where \(m\) is the number of measurements, \(\vec{x}_a\) is an a priori of \(\vec{x}\), \(\mathbf{S}_\epsilon\) is the covariance of the measurement noise, and \(\mathbf{S}_a\) is the covariance of the model state apriori.

Linearization

The optimal estimation methods in ARTS only work if it is possible to linearize \(F\left(\vec{x}\right)\) around \(\vec{x}\). This is equivalent to stating that

\[\vec{y} = \vec{y}_f + \mathbf{J} \left(\vec{x} - \vec{x}_a\right)\]

where we can take the partial derivative to find that

\[\mathbf{J} = \frac{\partial \vec{y}}{\partial \vec{x}}\]

is the Jacobian matrix (i.e., measurement_jac in the ARTS workspace).

One approach to minimize \(\vec{x}\) is to use a Gauss-Newton approach to update the state of the atmosphere. This might look like

\[\vec{x}_{i+1} = \vec{x}_a + \mathbf{S}_a \mathbf{J}^\top\left(\mathbf{J}\mathbf{S}_a \mathbf{J} ^\top+\mathbf{S}_\epsilon\right)^{-1} \left[\vec{y}-\vec{y}_f+ \mathbf{J}\left(\vec{x}_i-\vec{x}_a\right)\right],\]

\[\vec{y}_f = F\left(\vec{x}_i\right)\]

for \(i\) starting at \(a\) or \(0\) and increasing the count until the cost functions above are not decreasing anymore, or too slowly to warrant more iterations.

Transforming the Jacobian matrix

It is sometimes desired to transform the Jacobian matrix away from the native units that are available in the model. Instead of retrieving volume-mixing-ratio, it is sometimes more stable to retrieve the relative change in the volume-mixing-ratio, or perhaps even the logarithm of the volume-mixing-ratio, to avoid negative values creeping in to what is otherwise an intensive and numerical exercise.

There are three ways perform transformations on the Jacobian matrix in ARTS. You can change the measurement unit (e.g., from spectral radiance to brightness temperature), you can change the model state unit (e.g., from volume-mixing-ratio to relative humidity), or you can map the Jacobian matrix to another vector space (e.g., from Cartesian coordinates to spherical coordinates).

The first type of transformations are performed by an SpectralRadianceTransformOperator in ARTS. This is a local operation that is almost trivial to undo. Generally, you should just use one of the provided enumeration values of SpectralRadianceUnitType to set up these types of transformations. The enumeration class also describes how the transformations are done. This will not be repeated here.

The second and third types of transformations are a bit more complicated because they transform the model state vector. The optimal estimation methods are sometimes iterative, and they must be able to update the state of the model between iterations. These two types of transformations are therefore required to be able to transform the model state vector from native to non-native units, and vice versa, as well as to transform the Jacobian matrix from native to non-native units (but only in one-way, this operation does not need to be reversible).

From a mathematical point of view, these two behave very similar. But from a practical point of view they are quite different. Transforming the model state vector requires just the model state vector of the native units and the Jacobian matrix that corresponds to just that model state parameter. Mapping the model state vector to non-native units may require knowing more than just a single model state parameter.

Core mapping/transformation expression

If we define the native units of \(\vec{x}\) as \(\vec{t}\) so that

\[\vec{x} = f\left(\vec{t}\right),\]

there must be a reversible functions so that

\[\vec{t} = f^{-1}\left(\vec{x}\right)\]

for any transformation or mapping to work. It must also be possible to take the partial derivative of \(\vec{t}\) with regards to \(\vec{x}\).

If we put this in the form of the linearized forward simulation,

\[\vec{y}_f = \mathbf{J} \vec{x} = \mathbf{J} f\left(\vec{t}\right).\]

Here \(\mathbf{J}\) is still the partial derivative with regards to \(\vec{x}\). However, all partial derivatives will have been computed in terms of \(\vec{t}\), since this is the native unit. If we introduce

\[\mathbf{J}' = \frac{\partial \vec{y}}{\partial \vec{t}},\]

it is clear we can write

\[\mathbf{J} = \mathbf{J}' \frac{\partial}{\partial \vec{x}} f^{-1}\left(\vec{x}\right).\]

This step right here is what we consider the transformation of the Jacobian matrix. To make use of this style of transformation, we must provide matching \(f\) and \(f^{-1}\), as well as a way to compute the partial derivative of \(f^{-1}\) with regards to \(\vec{x}\).

We provide several such solutions built-in to ARTS as listed below but it is possible to specify these directly from python by simply providing the three operators above.

Relative retrievals

This is a model state vector transformation. By relative retrievals, we mean that the value itself is not retrieved, but instead its ratio is retrieved.

In this scenario:

\[\vec{x} = \vec{t} \oslash \vec{t}_0,\]

\[\vec{t} = \vec{x} \odot \vec{t}_0,\]

\[\mathbf{J} = \mathbf{J}' \odot \vec{t}_0,\]

where \(\oslash\) and \(\odot\) are element-wise division and multiplication, respectively. \(\vec{t}_0\) is simply the a priori value of \(\vec{t}\).

Note

The first iteration of a retrieval setup is going to be \(\vec{x} = \vec{1}\).

Logarithmic retrievals

This is a model state vector transformation. By logarithmic retrievals, we mean that the value itself is not retrieved, but instead its logarithm is retrieved.

In this scenario:

\[\vec{x} = \log\left(\vec{t}\right),\]

\[\vec{t} = \exp\left(\vec{x}\right),\]

\[\mathbf{J} = \mathbf{J}' \odot \exp\left(\vec{x}\right),\]

where the exponential and logarithmic operations are element-wise.

Logarithmic relative retrievals

This is a model state vector transformation. By logarithmic relative retrievals, we mean that the value itself is not retrieved, but instead the logarithm of its relative value is retrieved.

In this scenario:

\[\vec{x} = \log\left(\vec{t} \oslash \vec{t}_0\right),\]

\[\vec{t} = \exp\left(\vec{x}\right) \odot \vec{t}_0,\]

\[\mathbf{J} = \mathbf{J}' \odot \exp\left(\vec{x}\right) \odot \vec{t}_0,\]

where the operations are still element-wise on the product that is created.

Note

The first iteration of a retrieval setup is going to have \(\vec{x} = \vec{0}\).

Relative humidity retrievals

This is a model state vector transformation. By relative humidity retrievals, we mean that the value itself is not retrieved, but instead the its conversion to relative humidity is retrieved.

In this scenario:

\[\vec{x} = \vec{t} \odot \vec{p} \oslash p_{\textrm{sat}}\left(\vec{T}\right),\]

\[\vec{t} = \vec{x} \odot p_{\textrm{sat}}\left(\vec{T}\right) \oslash \vec{p},\]

\[\mathbf{J} = \mathbf{J}' \odot p_{\textrm{sat}}\left(\vec{T}\right) \oslash \vec{p},\]

where \(\vec{p}\) is the pressure at the position of \(\vec{t}\), \(\vec{T}\) is the temperature at the position of \(\vec{t}\), and \(p_{\textrm{sat}}\) is a user-provided method to compute the element-wise saturation pressure.

Tip

There is a flag that can be provided to this transformation that turns negative relative humidities off.

Note

Be aware that the implementation in ARTS is general, and that while you can choose to treat temperature as, e.g., relative humidity… please don’t. It makes sense only for some species.

Absolute field retrievals

This is a model state vector mapping. By absolute field retrievals, we mean that the value itself is not retrieved, but instead the absolute value of the field is retrieved.

In this scenario:

\[\begin{split}\begin{array}{rcl} \vec{x} &=& \sqrt{\vec{t}_u \odot \vec{t}_u + \vec{t}_v \odot \vec{t}_v + \vec{t}_w \odot \vec{t}_w},\\ \vec{\theta} &=& \arcsin\left(\vec{t}_w \oslash \vec{x}\right),\\ \vec{\phi} &=& \arctan\left(\vec{t}_v \oslash \vec{t}_u\right),\\ \vec{t}_u &=& \vec{x} \odot \cos\left(\vec{\theta}\right) \odot \cos\left(\vec{\phi}\right),\\ \vec{t}_v &=& \vec{x} \odot \cos\left(\vec{\theta}\right) \odot \sin\left(\vec{\phi}\right),\\ \vec{t}_w &=& \vec{x} \odot \sin\left(\vec{\theta}\right),\\ \end{array}\end{split}\]

\[\mathbf{J} = \mathbf{J}_u' \odot \vec{t}_u \oslash \vec{x} + \mathbf{J}_v' \odot \vec{t}_v \oslash \vec{x} + \mathbf{J}_w' \odot \vec{t}_w \oslash \vec{x},\]

where the subscripts \(u\), \(v\), and \(w\) indicate the three components of the vector north, east, and up, respectively, and \(\mathbf{J}_u'\), \(\mathbf{J}_v'\), and \(\mathbf{J}_w'\) are the Jacobian matrices of these three components in the native units of the model state vector.

Note

Neither \(\vec{\theta}\) nor \(\vec{\phi}\) are part of the model state vector, but are instead derived from the model state vector and are used to map the Cartesian coordinates to spherical coordinates. They are fixed during the retrieval process and are not updated between iterations.