DPD acceptance matrix

DPD acceptance matrix#

Introduction#

To speed up the fit, we can precompute the part of the negative log likelihood (NLL) function that normalizes the intensity function. Normally, you normalize the intensity function numerically with Monte-Carlo integration over a phase space sample (with or without detector efficiency). When we only vary scaling factors (couplings) in the amplitude model (i.e., masses and widths fixed), it is possible to precompute the normalization, as the normalization factor changes when we vary these non-linear parameters.

Statistically, this approach can be derived as follows. Given total sample \(\vec{x}\) consisting of \(N\) independent observations of a set, the likelihood function can be written as the product of the Probability Density Functions (PDFs) \(f\) of each single observation:

\[L(\vec{x};\vec{\theta}) = \prod_{\mathfrak{i}=1}^N f\left(x_i;\vec{\theta}\right)\,.\]

Here, \(\vec{\theta}\) represents the set of \(m\) unknown parameters that the PDF depends on. In an amplitude analysis, the PDFs are the normalized intensity functions for each partial wave:

\[f(x_i;\vec{\theta}) = \frac{I(x_i;\vec{\theta})}{\int I(\vec{x};\vec{\theta})\,\mathrm{d}\vec{x}}\,.\]

The variables \(\vec{x}_i\) now represent the Mandelstam variables and the decay and production angles of each event \(i\). With this expression for \(f\), we can rewrite the NLL function as

\[\begin{split} \begin{array}{rcl} \mathrm{NLL}(x_i;\vec{\theta}) &=& -\sum\limits^n_{i=1} \log \left(\frac{I(x_i;\vec{\theta})}{\int I(\vec{x};\vec{\theta}) d\vec{x}}\right) \\ &=& - \underbrace{\sum\limits^n_{i=1} \log(I(x_i;\vec{\theta}))}_{\text{data}} + \underbrace{n\cdot \log\left(\int I(\vec{x};\vec{\theta}) d\vec{x}\right)}_{\text{normalization}} \,. \end{array} \end{split}\]

Since the second term integrates over all phase-space points, it is independent of \(x_i\) and can be approximated using Monte Carlo integration:

\[ \int I(\vec{x} \,; \vec{\theta}) \, d\vec{x} = c^{\dagger} \cdot \frac{1}{N_{\mathrm{MC}}} \sum_{j=1}^{N_{\mathrm{MC}}} I(\vec{x}_j^{\mathrm{MC}} \,; \vec{\theta}) \cdot c \cdot \left( \int d\vec{x} \right) \,. \]

The term \(\frac{1}{N_{\mathrm{MC}}} \sum_{j=1}^{N_{\mathrm{MC}}} I(\vec{x}_j^{\mathrm{MC}};\vec{\theta})\) represents the so-called acceptance matrix, a hermitian matrix, which we label \(X\) from now on. The acceptance matrix can be used obtain integrated intensity via the bilinear relation,

\[ \int I(\vec{x}) \, \mathrm{d}\vec{x}=c^{\dagger} X c \,. \]

Construction of the acceptance matrix#

The acceptance matrix can be constructed by probing with four different value combinations for the \(c_ij\) leading to the four equations:

\[\begin{split} \begin{array}{rcccc} A_{ij}^{+} &=& \int I(c_i=1, c_j=1) \, \mathrm{d}\vec{x} &=& \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} X_{ii} & X_{ij} \\ X_{ji} & X_{jj} \\ \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\ A_{ij}^{-} &=& \int I(c_i=1, c_j=-1) \, \mathrm{d}\vec{x} &=& \begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} X_{ii} & X_{ij} \\ X_{ji}& X_{jj} \\ \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} \\ B_{ij}^{+} &=& \int I(c_i=1, c_j=i) \, \mathrm{d}\vec{x} &=& \begin{bmatrix} 1 & -i \end{bmatrix} \begin{bmatrix} X_{ii} & X_{ij} \\ X_{ji}& X_{jj} \\ \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix} \\ B_{ij}^{-} &=& \int I(c_i=1, c_j=-i) \, \mathrm{d}\vec{x} &=& \begin{bmatrix} 1 & i \end{bmatrix} \begin{bmatrix} X_{ii} & X_{ij} \\ X_{ji}& X_{jj} \\ \end{bmatrix} \begin{bmatrix} 1 \\ -i \end{bmatrix} \end{array} \end{split}\]

The \(A_{ij}\) and \(B_{ij}\) are the elements of the sub-intensity matrix, the results of the integration over the phase space when setting \(c_i\) and \(c_j\) to the respective values and the rest of to couplings to zero.\ The elements of \(X\) can finally be expressed as:

\[ X_{ij} = \frac{A_{ij}^+-A_{ij}^-}{4} + i\cdot\frac{B_{ij}^+-B_{ij}^-}{4} \,. \]

As you can see in the expression above, two equations are needed for the real part of \(X_{ij}\) and two for the imaginary part.

Speed up the construction of \(X\)#

Using the four equations described above is not computationally efficient, we have to compute the sub-intensity matrix for all \(4m^2\) combinations of \(c_i\) and \(c_j\), with \(m\) the number of couplings. However, we can use the hermitian property of the acceptance matrix to reduce the required number of equations to construct the matrix. A hermitian matrix is mirror symmetric with respect to its main diagonal, up to the complex conjugation of all entries. Therefore one only has to calculate the upper triangle of \(X\) and then fill the lower triangle with its complex-conjugate transpose. As probes the vectors with \(c_i=1,c_j=1\) are used to obtain the real part, while \(c_i=i,c_j=1\) is used to obtain the imaginary part of the elements of \(X_{ij}\). The diagonal of \(X\) is real-valued and is equal to the branching fractions (that is, the diagonal of the sub-intensity matrix with \(c_i=c_j=1\)).

With this, we can compute the acceptance matrix as

\[ X_{ij} = \frac{A_{c_i=1,c_j=1}+A_{c_i=i,c_j=1}-(1 + i) (\text{diagonal}[i] + \text{diagonal}[j])}{2} \,. \]

Prepare model and data#

Define reaction#

Define amplitude model#

Show code cell source Hide code cell source

def prepare_for_phsp(model: AmplitudeModel) -> AmplitudeModel:
    p1, p2, p3 = (FourMomentumSymbol(f"p{i}", shape=[]) for i in (1, 2, 3))
    s1, s2, s3 = sp.symbols("sigma1:4", nonnegative=True)
    mass_definitions = {
        s1: InvariantMassSquared(p2 + p3),
        s2: InvariantMassSquared(p1 + p3),
        s3: InvariantMassSquared(p1 + p2),
        sp.Symbol("m0", nonnegative=True): InvariantMass(p1 + p2 + p3),
        sp.Symbol("m1", nonnegative=True): InvariantMass(p1),
        sp.Symbol("m2", nonnegative=True): InvariantMass(p2),
        sp.Symbol("m3", nonnegative=True): InvariantMass(p3),
    }
    mass_definitions = {k: sp.sympify(v) for k, v in mass_definitions.items()}
    angle_and_mandelstam_definitions = {
        symbol: expr.xreplace(mass_definitions)
        for symbol, expr in model.variables.items()
    }
    angle_and_mandelstam_definitions.update(mass_definitions)
    polarized_intensity = set_initial_state_polarization(
        model.intensity,
        spin_projections=(-1, +1),
    )
    new_parameter_defaults = {
        symbol: v
        for symbol, v in model.parameter_defaults.items()
        if not re.match(r"m[0-3]", symbol.name)
    }
    return attrs.evolve(
        model,
        intensity=polarized_intensity,
        variables=angle_and_mandelstam_definitions,
        parameter_defaults=new_parameter_defaults,
    )


def set_initial_state_polarization(
    intensity: PoolSum, spin_projections: Iterable[sp.Rational | float]
) -> PoolSum:
    helicity_symbol, _ = intensity.indices[0]
    helicity_values = tuple(sp.Rational(i) for i in spin_projections)
    new_indices = (
        (helicity_symbol, helicity_values),
        *intensity.indices[1:],
    )
    return PoolSum(intensity.expression, *new_indices)


@unevaluated
class InvariantMassSquared(sp.Expr):
    momentum: sp.Basic
    _latex_repr_ = R"M^2\left({momentum}\right)"

    def evaluate(self) -> sp.Expr:
        p = self.momentum
        p_xyz = ThreeMomentum(p)
        return Energy(p) ** 2 - EuclideanNorm(p_xyz) ** 2

decay = to_three_body_decay(reaction.transitions, min_ls=False)
builder = DalitzPlotDecompositionBuilder(decay, min_ls=False)
for chain in builder.decay.chains:
    builder.dynamics_choices.register_builder(chain, formulate_breit_wigner_with_ff)
model = builder.formulate(reference_subsystem=2, cleanup_summations=True)
model = prepare_for_phsp(model)
model.intensity.cleanup()

\[\displaystyle \sum_{\lambda_{0}\in\left\{-1,1\right\}} \sum_{\lambda_{2}=-1/2}^{1/2} \sum_{\lambda_{3}=-1/2}^{1/2}{\left|{\sum_{\lambda_0^{\prime}=-1}^{1} \sum_{\lambda_2^{\prime}=-1/2}^{1/2} \sum_{\lambda_3^{\prime}=-1/2}^{1/2}{A^{2}_{\lambda_0^{\prime}, 0, \lambda_2^{\prime}, \lambda_3^{\prime}} d^{\frac{1}{2}}_{\lambda_2^{\prime},\lambda_{2}}\left(\zeta^2_{2(2)}\right) d^{\frac{1}{2}}_{\lambda_3^{\prime},\lambda_{3}}\left(\zeta^3_{2(2)}\right) d^{1}_{\lambda_{0},\lambda_0^{\prime}}\left(\zeta^0_{2(2)}\right) + A^{3}_{\lambda_0^{\prime}, 0, \lambda_2^{\prime}, \lambda_3^{\prime}} d^{\frac{1}{2}}_{\lambda_2^{\prime},\lambda_{2}}\left(\zeta^2_{3(2)}\right) d^{\frac{1}{2}}_{\lambda_3^{\prime},\lambda_{3}}\left(\zeta^3_{3(2)}\right) d^{1}_{\lambda_{0},\lambda_0^{\prime}}\left(\zeta^0_{3(2)}\right)}}\right|^{2}}\]

Generate phase space sample#

dpd_transformer = SympyDataTransformer.from_sympy(model.variables, backend="jax")

phsp

{'\\zeta^0_{2(2)}': Array(0, dtype=int64, weak_type=True),
 '\\zeta^0_{3(2)}': Array([-2.53357417, -2.43246297, -2.75588118, ..., -2.88331429, -2.45355527, -2.89980823], dtype=float64),
 '\\zeta^2_{2(2)}': Array(0, dtype=int64, weak_type=True),
 '\\zeta^2_{3(2)}': Array([0.48645084, 0.58458208, 0.26202874, ..., 0.34503341, 0.77149637, 0.18183345], dtype=float64),
 '\\zeta^3_{2(2)}': Array(0, dtype=int64, weak_type=True),
 '\\zeta^3_{3(2)}': Array([0.78587981, 0.67615773, 1.17998783, ..., 0.56729844, 0.50850767, 1.20546428], dtype=float64),
 'm0': Array([3.0969, 3.0969, 3.0969, ..., 3.0969, 3.0969, 3.0969], dtype=float64),
 'm1': Array([0.497611, 0.497611, 0.497611, ..., 0.497611, 0.497611, 0.497611], dtype=float64),
 'm2': Array([1.18937, 1.18937, 1.18937, ..., 1.18937, 1.18937, 1.18937], dtype=float64),
 'm3': Array([0.93827209, 0.93827209, 0.93827209, ..., 0.93827209, 0.93827209, 0.93827209], dtype=float64),
 'sigma1': Array([5.9276109 , 5.8019359 , 6.08202612, ..., 6.50903172, 5.84419671, 6.24721317], dtype=float64),
 'sigma2': Array([2.51882444, 2.67122   , 2.23124228, ..., 2.27933047, 2.81951794, 2.13741812], dtype=float64),
 'sigma3': Array([3.68692649, 3.66020593, 3.82009343, ..., 3.34499964, 3.46964718, 3.74873053], dtype=float64),
 'theta_12': Array([2.00195379, 1.79913544, 2.46359496, ..., 2.51073213, 1.63263745, 2.69699582], dtype=float64),
 'theta_31': Array([1.46225035, 1.45618234, 1.61596876, ..., 0.86704215, 1.26372834, 1.48250258], dtype=float64)}

Prepare numerical function#

coupling_parameters = {
    symbol: value
    for symbol, value in model.parameter_defaults.items()
    if "production" in str(symbol)
}
fixed_parameters = {
    symbol: value
    for symbol, value in model.parameter_defaults.items()
    if symbol not in coupling_parameters
}
len(coupling_parameters)

full_intensity_expr = cached.unfold(model)
full_intensity_expr = cached.xreplace(full_intensity_expr, fixed_parameters)
intensity_func = cached.lambdify(
    full_intensity_expr,
    parameters=coupling_parameters,
    backend="jax",
)

Compute acceptance matrix#

Brute-force computation#

%%time
A_plus = compute_intensity_matrix(intensity_func, phsp, ci=1, cj=+1)
B_plus = compute_intensity_matrix(intensity_func, phsp, ci=1, cj=+1j)
A_minus = compute_intensity_matrix(intensity_func, phsp, ci=1, cj=-1)
B_minus = compute_intensity_matrix(intensity_func, phsp, ci=1, cj=-1j)

CPU times: user 24.2 s, sys: 1.75 s, total: 26 s
Wall time: 5.57 s

X_temp = (A_plus - A_minus) / 4 - 1j / 4 * (B_plus - B_minus)
diagonal = np.tril(A_plus) - np.tril(A_plus, k=-1)
X_brute_force = diagonal + np.tril(X_temp, k=-1) + np.triu(X_temp, k=1)
X_brute_force.round(7)

array([[ 9.18200e-04+0.0000e+00j,  1.72000e-05+1.7800e-04j,  0.00000e+00+0.0000e+00j,  2.80000e-06+3.1000e-06j],
       [ 1.72000e-05-1.7800e-04j,  1.95700e-03+0.0000e+00j,  0.00000e+00+0.0000e+00j, -8.19900e-04-3.1568e-03j],
       [ 0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j],
       [ 2.80000e-06-3.1000e-06j, -8.19900e-04+3.1568e-03j,  0.00000e+00+0.0000e+00j,  9.51062e-02+0.0000e+00j]])

np.testing.assert_almost_equal(X_brute_force, X_brute_force.T.conj())

Smarter computation#

Functions for smarter computation Hide code cell source

def compute_acceptance_matrix(
    intensity_func: ParametrizedFunction,
    phsp: DataSample,
) -> np.ndarray:
    a_matrix = compute_intensity_matrix_upper_triangle(intensity_func, phsp, ci=1, cj=1)
    b_matrix = compute_intensity_matrix_upper_triangle(
        intensity_func, phsp, ci=1j, cj=1
    )
    diagonal = compute_intensity_matrix_diagonal(intensity_func, phsp)
    x_matrix = np.zeros(a_matrix.shape, dtype=complex)
    for i, j in itertools.product(range(a_matrix.shape[0]), range(a_matrix.shape[1])):
        if i < j:
            x_matrix[i, j] = (
                a_matrix[i, j]
                + 1j * b_matrix[i, j]
                - (1 + 1j) * (diagonal[i] + diagonal[j])
            ) / 2
    conjugate_upper_triangle = x_matrix.conjugate().T
    x_matrix += np.diag(diagonal) + conjugate_upper_triangle
    return x_matrix


def compute_intensity_matrix_upper_triangle(
    func: ParametrizedFunction, phsp: DataSample, ci: complex, cj: complex
) -> np.ndarray:
    original_parameters = func.parameters.copy()
    coupling_parameters = [p for p in func.parameters if "production" in p]
    n = len(coupling_parameters)
    sub_intensity_matrix = np.zeros((n, n))
    for (i, c1), (j, c2) in itertools.product(
        enumerate(coupling_parameters),
        enumerate(coupling_parameters),
    ):
        if i >= j:
            continue
        new_parameters = dict.fromkeys(coupling_parameters, 0)
        new_parameters[c1] = ci
        new_parameters[c2] = cj

        func.update_parameters(new_parameters)
        sub_intensities = func(phsp)
        func.update_parameters(original_parameters)

        sub_intensity_matrix[i, j] = integrate_intensity(sub_intensities)

    return sub_intensity_matrix


def compute_intensity_matrix_diagonal(
    func: ParametrizedFunction, phsp: DataSample
) -> np.ndarray:
    original_parameters = func.parameters.copy()
    coupling_parameters = [p for p in func.parameters if "production" in p]
    n = len(coupling_parameters)
    sub_intensity_array = np.zeros(n)

    for i, c1 in enumerate(coupling_parameters):
        new_parameters = dict.fromkeys(coupling_parameters, 0)
        new_parameters[c1] = 1
        func.update_parameters(new_parameters)
        sub_intensities = func(phsp)
        func.update_parameters(original_parameters)

        sub_intensity_array[i] = integrate_intensity(sub_intensities)

    return sub_intensity_array

%%time
X = compute_acceptance_matrix(intensity_func, phsp)

CPU times: user 4.27 s, sys: 475 ms, total: 4.74 s
Wall time: 636 ms

X.round(7)

array([[ 9.18200e-04+0.0000e+00j,  1.72000e-05+1.7800e-04j,  0.00000e+00+0.0000e+00j,  2.80000e-06+3.1000e-06j],
       [ 1.72000e-05-1.7800e-04j,  1.95700e-03+0.0000e+00j,  0.00000e+00+0.0000e+00j, -8.19900e-04-3.1568e-03j],
       [ 0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j,  0.00000e+00+0.0000e+00j],
       [ 2.80000e-06-3.1000e-06j, -8.19900e-04+3.1568e-03j,  0.00000e+00+0.0000e+00j,  9.51062e-02+0.0000e+00j]])

np.testing.assert_almost_equal(X, X.T.conj())
np.testing.assert_almost_equal(X, X_brute_force)

Test bilinear form#

The the bilinear form with the acceptance matrix is to be used within for the optimization of the coupling parameters \(c_{ij}\). The bilinear form should lead to the same result when calculating the sub-intensity directly by Monte-Carlo integration. In the following this expected behavior is tested for different \(c\).

Result taking every element of \(c\) as \(2+1i\) or 0:

\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1775)^{-}, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{N(2060)^+, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 0}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 1}\)	bilinear	MC	relative difference
`0`	`0`	`0`	`0`	`0`	`0`	`0`
`0`	`0`	`0`	`-2+1j`	`0.47553`	`0.47553`	`1.1674e-16`
`0`	`0`	`-2+1j`	`0`	`0`	`0`	`0`
`0`	`0`	`-2+1j`	`-2+1j`	`0.47553`	`0.47553`	`1.1674e-16`
`0`	`-2+1j`	`0`	`0`	`0.0097851`	`0.0097851`	`0`
`0`	`-2+1j`	`0`	`-2+1j`	`0.47712`	`0.47712`	`0`
`0`	`-2+1j`	`-2+1j`	`0`	`0.0097851`	`0.0097851`	`0`
`0`	`-2+1j`	`-2+1j`	`-2+1j`	`0.47712`	`0.47712`	`0`
`-2+1j`	`0`	`0`	`0`	`0.004591`	`0.004591`	`1.8893e-16`
`-2+1j`	`0`	`0`	`-2+1j`	`0.48015`	`0.48015`	`0`
`-2+1j`	`0`	`-2+1j`	`0`	`0.004591`	`0.004591`	`1.8893e-16`
`-2+1j`	`0`	`-2+1j`	`-2+1j`	`0.48015`	`0.48015`	`0`
`-2+1j`	`-2+1j`	`0`	`0`	`0.014549`	`0.014549`	`1.1924e-16`
`-2+1j`	`-2+1j`	`0`	`-2+1j`	`0.48191`	`0.48191`	`0`
`-2+1j`	`-2+1j`	`-2+1j`	`0`	`0.014549`	`0.014549`	`1.1924e-16`
`-2+1j`	`-2+1j`	`-2+1j`	`-2+1j`	`0.48191`	`0.48191`	`0`

Now new values for the elements of \(c\) are generated by varying the magnitude of the \(c_i\) around one according the a log-normal distribution and the complex phase according to a uniform distribution. Note that this time each \(c_i\) has a different phase and magnitude.

\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1775)^{-}, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{N(2060)^+, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 0}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 1}\)	bilinear	MC	relative difference
`-1.066+2.538j`	`-2.29+1.398j`	`2.272+1.8j`	`0.353+2.724j`	`0.68991`	`0.68991`	`-1.6092e-16`
`-1.583+1.978j`	`-2.342-0.502j`	`0.326+2.533j`	`-1.229-2.437j`	`0.74913`	`0.74913`	`1.482e-16`
`-2.534-0.456j`	`-1.871-1.854j`	`1.355+2.487j`	`1.816+2.409j`	`0.88976+5.5511e-17j`	`0.88976`	`-1.2478e-16+6.2389e-17j`
`1.172+2.735j`	`1.606+2.225j`	`0.952+2.337j`	`1.899-1.593j`	`0.56611+5.5511e-17j`	`0.56611`	`0+9.8058e-17j`

\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1775)^{-}, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{N(2060)^+, 1, 2}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 0}\)	\(\mathcal{H}^\mathrm{LS,production}_{\overline{\Sigma}(1750)^{-}, 1, 1}\)	bilinear	MC	relative difference
`0`	`1`	`0`	`0+1j`	`0.10338`	`0.10338`	`-1.3424e-16`
`0`	`1`	`0`	`-0-1j`	`0.09075`	`0.09075`	`0`
`1`	`0+1j`	`0`	`0`	`0.0025192`	`0.0025192`	`-1.7215e-16`
`1`	`-0-1j`	`0`	`0`	`0.0032312`	`0.0032312`	`0`

Note: the last two comparisons where each element of \(c\) vector has a different complex part or phase