
    @inherit_check_shapes
    def quad_term(self, common: SGPR.CommonTensors) -> tf.Tensor:
        """
        Computes a lower bound on the quadratic term in the log
        marginal likelihood of conjugate GPR. The bound is based on
        an auxiliary vector, v. For :math:`Q ≺ K` and :math:`r=y - Kv`

        .. math::
            -0.5 * (rᵀQ⁻¹r + 2yᵀv - vᵀ K v ) <= -0.5 * yᵀK⁻¹y <= -0.5 * (2yᵀv - vᵀKv).

        Equality holds if :math:`r=0`, i.e. :math:`v = K⁻¹y`.

        If `self.aux_vec` is trainable, gradients are computed with
        respect to :math:`v` as well and :math:`v` can be optimized
        using gradient-based methods.

        Otherwise, :math:`v` is updated with the method of conjugate
        gradients (CG). CG is run until :math:`0.5 * rᵀQ⁻¹r <= ϵ`,
        which ensures that the maximum bias due to this term is not
        more than :math:`ϵ`. Here :math:`ϵ` is the CG tolerance.
        """
        x, y = self.data
        # `err` (targets minus the prior mean) plays the role of y in the docstring.
        err = y - self.mean_function(x)
        sigma_sq = self.likelihood.variance
        # Kernel matrix with the likelihood noise variance added.
        K = add_noise_cov(self.kernel.K(x), sigma_sq)

        A = common.A
        LB = common.LB
        preconditioner = NystromPreconditioner(A, LB, sigma_sq)
        err_t = tf.transpose(err)

        v_init = self.aux_vec
        if not v_init.trainable:
            # Non-trainable v: refine it with CG, starting from the stored value.
            v = cglb_conjugate_gradient(
                K,
                err_t,
                v_init,
                preconditioner,
                self._cg_tolerance,
                self._max_cg_iters,
                self._restart_cg_iters,
            )
        else:
            # Trainable v: use it as is; the optimizer updates it through gradients.
            v = v_init

        Kv = v @ K
        r = err_t - Kv
        # `error_bound` is used as rᵀQ⁻¹r from the docstring bound.
        _, error_bound = preconditioner(r)
        # lb = yᵀv - 0.5 * vᵀKv, since v * r = v * y - v * Kv.
        lb = tf.reduce_sum(v * (r + 0.5 * Kv))
        ub = lb + 0.5 * error_bound

        if not v_init.trainable:
            # Store the CG solution so the next call starts from it.
            v_init.assign(v)

        return -ub

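The two-sided bound stated in the `quad_term` docstring can be checked numerically. The following is a minimal NumPy sketch, not GPflow code. The helpers `rbf`, `quad_term_bounds` and `plain_cg` are hypothetical names introduced only for this illustration. Q is built here as a Nyström approximation plus noise, which is an assumption about what the preconditioner represents. `plain_cg` is textbook unpreconditioned CG standing in for `cglb_conjugate_gradient`; only its stopping rule, `0.5 * rᵀQ⁻¹r <= ϵ`, is taken from the docstring.

```python
import numpy as np


def rbf(a, b, lengthscale=0.5):
    """Squared-exponential kernel between 1-D input vectors a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def quad_term_bounds(K, Q, y, v):
    """Return (lower, exact, upper) for -0.5 * yᵀK⁻¹y given an auxiliary vector v."""
    Kv = K @ v
    r = y - Kv
    upper = -0.5 * (2.0 * (y @ v) - v @ Kv)
    lower = upper - 0.5 * (r @ np.linalg.solve(Q, r))
    exact = -0.5 * (y @ np.linalg.solve(K, y))
    return lower, exact, upper


def plain_cg(K, Q, y, eps, max_iters=1000):
    """Textbook CG on K v = y, stopped once 0.5 * rᵀQ⁻¹r <= eps."""
    v = np.zeros_like(y)
    r = y - K @ v
    p = r.copy()
    for _ in range(max_iters):
        if 0.5 * (r @ np.linalg.solve(Q, r)) <= eps:
            break
        Kp = K @ p
        alpha = (r @ r) / (p @ Kp)
        v = v + alpha * p
        r_new = r - alpha * Kp
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return v


# Synthetic 1-D regression problem (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, sigma_sq, eps = 40, 0.1, 1e-3
x = np.sort(rng.uniform(0.0, 10.0, size=n))
z = np.linspace(0.0, 10.0, 8)  # inducing inputs
y = np.sin(x) + np.sqrt(sigma_sq) * rng.normal(size=n)

K = rbf(x, x) + sigma_sq * np.eye(n)
Kuu = rbf(z, z) + 1e-8 * np.eye(len(z))  # small jitter for stability
Kuf = rbf(z, x)
# Nyström approximation plus noise; K - Q is positive semi-definite.
Q = Kuf.T @ np.linalg.solve(Kuu, Kuf) + sigma_sq * np.eye(n)

candidates = {
    "zeros": np.zeros(n),
    "random": 0.1 * rng.normal(size=n),
    "cg": plain_cg(K, Q, y, eps),
    "exact": np.linalg.solve(K, y),
}
results = {name: quad_term_bounds(K, Q, y, v) for name, v in candidates.items()}
for name, (lower, exact, upper) in results.items():
    print(f"{name:>6}: lower={lower:.6f} exact={exact:.6f} upper={upper:.6f}")
```

In this sketch `lower` corresponds to the value `quad_term` returns (`-ub` in the method). The gap `upper - lower` equals `0.5 * rᵀQ⁻¹r`, so stopping CG at tolerance `eps` caps the gap at `eps`, and it vanishes when `v = K⁻¹y`.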
    @inherit_check_shapes
    def predict_f(