int_optimizer module

itrails.int_optimizer.backtrack_viterbi(omega, prev)

Reconstructs the optimal Viterbi path by backtracking through the pointer matrix obtained from the viterbi function.

Parameters:

omega (numpy array.) – Omega matrix containing the log probabilities for each time step and state.
prev (numpy array.) – Backtracking pointer matrix from the Viterbi algorithm.

Returns:

Optimal Viterbi path as an array of state indices.

Return type:

numpy array.
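
The backtracking step described above can be sketched as follows. This is a minimal illustration, not the package's code: it assumes that omega and prev both have shape (T, n_states) and that prev[t, j] holds the best predecessor of state j at time step t. The name backtrack_sketch and the toy matrices are hypothetical.

```python
import numpy as np

def backtrack_sketch(omega, prev):
    """Follow back-pointers from the best final state to recover the path."""
    T = omega.shape[0]
    path = np.zeros(T, dtype=int)
    # Start from the state with the highest log probability at the last step.
    path[T - 1] = np.argmax(omega[T - 1])
    # Walk backwards; prev[t, j] is assumed to be the best predecessor of j at time t.
    for t in range(T - 1, 0, -1):
        path[t - 1] = prev[t, path[t]]
    return path

# Toy inputs (hypothetical values for a 2-state model and 3 time steps).
omega = np.log(np.array([[0.45, 0.1], [0.0315, 0.108], [0.03888, 0.01296]]))
prev = np.array([[0, 0], [0, 0], [1, 1]])
path = backtrack_sketch(omega, prev)
```

The first row of prev is never read, because the first time step has no predecessor.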

itrails.int_optimizer.backward(a, b, V, order)

Performs the backward algorithm for Hidden Markov Models by computing the beta values in log space.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Beta matrix with log probabilities for each time step and hidden state.

Return type:

numpy array.
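
A log-space backward recursion might look like the sketch below. It is a simplified illustration under stated assumptions: every observation indexes a column of b directly, so the order argument is left out, and all probabilities are strictly positive. The names logsumexp_rows and backward_sketch are hypothetical and not part of the module.

```python
import numpy as np

def logsumexp_rows(x):
    """Numerically stable log(sum(exp(x))) along the last axis."""
    m = np.max(x, axis=-1, keepdims=True)
    return (m + np.log(np.sum(np.exp(x - m), axis=-1, keepdims=True))).squeeze(-1)

def backward_sketch(a, b, V):
    """Return the (T, n_states) matrix of log beta values."""
    T, n = len(V), a.shape[0]
    log_a, log_b = np.log(a), np.log(b)
    beta = np.zeros((T, n))  # last row is log(1) = 0
    for t in range(T - 2, -1, -1):
        # beta[t, i] = logsumexp_j(log a[i, j] + log b[j, V[t+1]] + beta[t+1, j])
        beta[t] = logsumexp_rows(log_a + log_b[:, V[t + 1]] + beta[t + 1])
    return beta

# Toy 2-state model (hypothetical values).
a = np.array([[0.7, 0.3], [0.4, 0.6]])
b = np.array([[0.9, 0.1], [0.2, 0.8]])
V = np.array([0, 1, 0])
beta = backward_sketch(a, b, V)
```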

itrails.int_optimizer.forward(a, b, pi, V, order)

Executes the forward algorithm for Hidden Markov Models allowing for missing data by computing the log-scaled alpha values.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Alpha matrix with log probabilities for each time step and hidden state.

Return type:

numpy array.
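
One way to picture a forward pass that tolerates missing data is sketched below. The semantics of order are an assumption made for this example only: order[v] is taken to be the list of emission columns compatible with observed symbol v, and the emission probability of an ambiguous symbol is the sum over those columns. The actual convention used by the module may differ, and forward_sketch is a hypothetical name.

```python
import numpy as np

def forward_sketch(a, b, pi, V, order):
    """Return the (T, n_states) matrix of log alpha values."""
    T, n = len(V), a.shape[0]

    def emit(v):
        # Assumed semantics: sum the emission columns listed in order[v].
        return b[:, order[v]].sum(axis=1)

    alpha = np.zeros((T, n))
    alpha[0] = np.log(pi * emit(V[0]))
    for t in range(1, T):
        last = alpha[t - 1]
        m = last.max()
        # log(sum_i exp(alpha[t-1, i]) * a[i, j]), factoring out the maximum for stability.
        alpha[t] = m + np.log(np.exp(last - m) @ a) + np.log(emit(V[t]))
    return alpha

# Toy 2-state model; symbol 2 stands for a missing observation (hypothetical).
a = np.array([[0.7, 0.3], [0.4, 0.6]])
b = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
order = [[0], [1], [0, 1]]
V = np.array([0, 2, 1])
alpha = forward_sketch(a, b, pi, V, order)
```

With this convention a fully missing site contributes an emission probability of 1 for every hidden state, so only the transition step changes alpha at that position.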

itrails.int_optimizer.forward_loglik(a, b, pi, V, order)

Computes the log-likelihood for a given observed state sequence by running the forward algorithm and applying log-sum-exp for numerical stability.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Log-likelihood value.

Return type:

float.
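
The log-sum-exp step can be illustrated in isolation. The sketch below assumes an alpha matrix of log values has already been computed and reduces its last row to a single log-likelihood; loglik_from_alpha is a hypothetical helper, not a function of this module.

```python
import numpy as np

def loglik_from_alpha(alpha):
    """Log-likelihood as a stable log-sum-exp over the last row of log alpha."""
    last = alpha[-1]
    m = np.max(last)
    # Subtracting the maximum keeps exp() from underflowing to zero.
    return float(m + np.log(np.sum(np.exp(last - m))))

# Log alpha values for a toy 2-state model (hypothetical values).
alpha = np.log(np.array([[0.45, 0.1], [0.0355, 0.156], [0.078525, 0.02085]]))
ll = loglik_from_alpha(alpha)

# Values this small underflow under a naive log(sum(exp(...))).
ll_tiny = loglik_from_alpha(np.array([[-1000.0, -1001.0]]))
```

The second call shows why the trick matters: exp(-1000) is zero in double precision, so the naive formula would return minus infinity.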

itrails.int_optimizer.forward_loglik_par(a, b, pi, V, order)

Computes the log-likelihood in parallel by converting the provided order into a List and then calling forward_loglik.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states represented as integer indices.
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Log-likelihood value.

Return type:

float.

itrails.int_optimizer.loglik_wrapper(a, b, pi, V_lst)

Sequential log-likelihood wrapper that builds an order list using get_idx_state, iteratively computes the log-likelihood for each observed state vector in V_lst, and returns the total sum.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).

Returns:

Sum of log-likelihood values over all observed sequences in V_lst.

Return type:

float.

itrails.int_optimizer.loglik_wrapper_new_method(a, b, pi, V_lst)

Sequential log-likelihood wrapper that builds an order list using get_idx_state_new_method, computes the log-likelihood for each observed state vector in V_lst sequentially, and returns the total sum.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors as numpy arrays of integer indices.

Returns:

Total log-likelihood value summed over all observed sequences in V_lst.

Return type:

float.

itrails.int_optimizer.loglik_wrapper_par(a, b, pi, V_lst)

Parallel log-likelihood wrapper that builds an order list using get_idx_state, then computes the log-likelihood for each observed state vector in V_lst in parallel using joblib, and returns the sum of the results.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities for the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors, where each vector is a numpy array of integer indices.

Returns:

Sum of log-likelihood values over all observed sequences in V_lst.

Return type:

float.
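
The pattern of mapping a per-sequence log-likelihood over V_lst and summing the results can be sketched as below. To keep the example dependent only on NumPy and the standard library, it uses concurrent.futures.ThreadPoolExecutor as a stand-in for joblib, and a rescaled forward pass as a stand-in for forward_loglik; both substitutions, and the names loglik_sketch and loglik_sum_parallel, are assumptions of this example. The order list is omitted.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def loglik_sketch(a, b, pi, V):
    """Forward log-likelihood with per-step rescaling (stand-in for forward_loglik)."""
    alpha = pi * b[:, V[0]]
    total = 0.0
    for t in range(1, len(V)):
        scale = alpha.sum()
        total += np.log(scale)  # accumulate the log of each scaling factor
        alpha = (alpha / scale) @ a * b[:, V[t]]
    return float(total + np.log(alpha.sum()))

def loglik_sum_parallel(a, b, pi, V_lst, workers=2):
    """Evaluate every sequence in a worker pool and add up the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda V: loglik_sketch(a, b, pi, V), V_lst))

# Toy 2-state model and two observed sequences (hypothetical values).
a = np.array([[0.7, 0.3], [0.4, 0.6]])
b = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
V_lst = [np.array([0, 1, 0]), np.array([1, 1])]

par = loglik_sum_parallel(a, b, pi, V_lst)
seq = sum(loglik_sketch(a, b, pi, V) for V in V_lst)
```

Because the sequences are independent given the model parameters, the parallel and sequential sums agree up to floating-point rounding.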

itrails.int_optimizer.loglik_wrapper_par_new_method(a, b, pi, V_lst)

Parallel log-likelihood wrapper using joblib that builds an order list via get_idx_state_new_method, computes the log-likelihood for each observed state vector in V_lst in parallel, and returns the sum of the computed values.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities for the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors as numpy arrays of integer indices.

Returns:

Total log-likelihood value summed over all observed sequences in V_lst.

Return type:

float.

itrails.int_optimizer.optimization_wrapper_introgression(arg_lst, optimized_params, case, d, V_lst, res_name, info)

itrails.int_optimizer.optimizer_introgression(optim_variables, optim_list, bounds, fixed_params, V_lst, res_name, case, n_iter, method='Nelder-Mead', header=True, tmp_path='./')

Optimization function.

Parameters:

optim_params (dictionary) –
Dictionary containing the initial values for the parameters to be optimized, and their optimization bounds. The structure of the dictionary should be as follows:

dct['variable'] = [initial_value, lower_bound, upper_bound]

The dictionary must contain either 8 (t_1, t_2, t_upper, t_m, N_AB, N_ABC, r, m), 10 (t_A, t_B, t_C, t_2, t_upper, t_m, N_AB, N_ABC, r, m), or 11 (t_A, t_B, t_C, t_2, t_upper, t_out, t_m, N_AB, N_ABC, r, m) entries, in that specific order.
fixed_params (dictionary) – Dictionary containing the values for the fixed parameters. The dictionary must contain entries n_int_AB and n_int_ABC (in no particular order).
V_lst (list of numpy arrays) – List of arrays of integers corresponding to the observed states.
res_name (str) – Location and name of the file where the results should be saved (in CSV format).
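
The dictionary layout described above can be illustrated with the 8-entry case. All numeric values below are hypothetical placeholders chosen only to show the structure; they are not recommended starting points. The example also shows how such a dictionary splits into a vector of initial values and a list of bounds, which is the form a bounded optimizer typically expects.

```python
# Hypothetical initial values and bounds: [initial_value, lower_bound, upper_bound].
optim_params = {
    "t_1": [240000, 24000, 2400000],
    "t_2": [40000, 4000, 400000],
    "t_upper": [800000, 80000, 8000000],
    "t_m": [100000, 10000, 1000000],
    "N_AB": [50000, 5000, 500000],
    "N_ABC": [50000, 5000, 500000],
    "r": [1e-8, 1e-9, 1e-7],
    "m": [0.1, 0.0, 1.0],
}

# Fixed parameters; the two required keys may appear in any order.
fixed_params = {"n_int_AB": 3, "n_int_ABC": 3}

# Dictionaries preserve insertion order, so the parameter order is kept.
names = list(optim_params)
x0 = [v[0] for v in optim_params.values()]
bounds = [(v[1], v[2]) for v in optim_params.values()]

# Sanity check: every initial value lies within its bounds.
in_bounds = all(lo <= init <= hi for init, (lo, hi) in zip(x0, bounds))
```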

itrails.int_optimizer.post_prob(a, b, pi, V, order)

Computes the posterior probabilities of the hidden states for a given observed sequence by combining the forward and backward algorithms and normalizing the result.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Posterior probability matrix with probabilities for each time step and hidden state.

Return type:

numpy array.
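
Combining the forward and backward passes and normalizing per time step can be sketched as follows. The example is a simplified illustration: observations index the columns of b directly, so order is omitted, and all probabilities are assumed strictly positive. The names _lse and post_prob_sketch are hypothetical.

```python
import numpy as np

def _lse(x, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True)), axis=axis)

def post_prob_sketch(a, b, pi, V):
    """Return the (T, n_states) matrix of posterior state probabilities."""
    T, n = len(V), a.shape[0]
    log_a, log_b = np.log(a), np.log(b)
    alpha = np.zeros((T, n))
    beta = np.zeros((T, n))
    alpha[0] = np.log(pi) + log_b[:, V[0]]
    for t in range(1, T):
        # Sum over the previous state i (axis 0).
        alpha[t] = _lse(alpha[t - 1][:, None] + log_a, axis=0) + log_b[:, V[t]]
    for t in range(T - 2, -1, -1):
        # Sum over the next state j (axis 1).
        beta[t] = _lse(log_a + log_b[:, V[t + 1]] + beta[t + 1], axis=1)
    log_post = alpha + beta
    # Normalize each time step so that its probabilities sum to 1.
    log_post -= _lse(log_post, axis=1)[:, None]
    return np.exp(log_post)

# Toy 2-state model (hypothetical values).
a = np.array([[0.7, 0.3], [0.4, 0.6]])
b = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
post = post_prob_sketch(a, b, pi, np.array([0, 1, 0]))
```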

itrails.int_optimizer.post_prob_wrapper(a, b, pi, V_lst)

Wrapper function that computes the posterior probabilities for a list of observed state sequences by building an order list using get_idx_state and iteratively applying post_prob to each sequence.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).

Returns:

List of posterior probability matrices corresponding to each observed sequence in V_lst.

Return type:

list.

itrails.int_optimizer.viterbi(a, b, pi, V, order)

Performs the forward (dynamic programming) pass of the Viterbi algorithm in log space and returns both the omega matrix and the backtracking pointer matrix. The path itself is reconstructed from these with backtrack_viterbi.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Tuple containing the omega matrix and the backtracking pointer matrix.

Return type:

tuple(numpy array, numpy array).
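
The dynamic programming pass can be sketched as below. This is an illustration under simplifying assumptions rather than the module's code: observations index the columns of b directly (order is omitted), probabilities are strictly positive, and the pointer matrix stores the best predecessor of each state. The name viterbi_sketch is hypothetical.

```python
import numpy as np

def viterbi_sketch(a, b, pi, V):
    """Return (omega, prev): best log probabilities and back-pointers."""
    T, n = len(V), a.shape[0]
    log_a, log_b = np.log(a), np.log(b)
    omega = np.zeros((T, n))
    prev = np.zeros((T, n), dtype=int)
    omega[0] = np.log(pi) + log_b[:, V[0]]
    for t in range(1, T):
        # scores[i, j]: best log probability of being in i at t-1 and moving to j.
        scores = omega[t - 1][:, None] + log_a
        prev[t] = np.argmax(scores, axis=0)
        omega[t] = np.max(scores, axis=0) + log_b[:, V[t]]
    return omega, prev

# Toy 2-state model (hypothetical values).
a = np.array([[0.7, 0.3], [0.4, 0.6]])
b = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
omega, prev = viterbi_sketch(a, b, pi, np.array([0, 1, 0]))
```

The only difference from the forward algorithm is that the sum over predecessor states is replaced by a maximum, with the arg max recorded for backtracking.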

itrails.int_optimizer.viterbi_old(a, b, pi, V, order)

Computes the Viterbi path using an iterative approach in log space by performing dynamic programming over the given observed sequence.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states.
order (list.) – List of indices mapping observed states to emission probabilities.

Returns:

Viterbi path as an array of state indices.

Return type:

numpy array.

itrails.int_optimizer.viterbi_wrapper(a, b, pi, V_lst)

Wrapper for the Viterbi algorithm that builds an order list using get_idx_state, applies the viterbi and backtrack_viterbi functions to each observed state vector in V_lst, and returns a list of Viterbi paths.

Parameters:

a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).

Returns:

List of Viterbi paths corresponding to each observed sequence in V_lst.

Return type:

list.

itrails.int_optimizer.write_list(lst, res_name)

Appends the elements of a list as a comma-separated line to a CSV file with the given file name.

Parameters:

lst (list.) – List of values to append.
res_name (str.) – File name (or path) to which the list should be appended.

Returns:

None.
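
Appending a list as one comma-separated line can be done with the standard csv module, as in the sketch below. This shows the general technique only; write_list_sketch is a hypothetical name, the column names and values are placeholders, and the module's own implementation may format values differently.

```python
import csv
import os
import tempfile

def write_list_sketch(lst, res_name):
    """Append the elements of lst as a single CSV row to res_name."""
    # Mode "a" appends; newline="" lets the csv module control line endings.
    with open(res_name, "a", newline="") as f:
        csv.writer(f).writerow(lst)

# Write a header row and one result row to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "results.csv")
write_list_sketch(["n_eval", "t_1", "loglik"], path)
write_list_sketch([0, 240000, -1234.5], path)

with open(path) as f:
    lines = f.read().splitlines()
```

Opening the file in append mode on every call means each optimizer iteration can be logged as it completes, so partial results survive an interrupted run.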