optimizer module
- itrails.optimizer.backtrack_viterbi(omega, prev)
Reconstructs the optimal Viterbi path by backtracking through the pointer matrix obtained from the viterbi function.
- Parameters:
omega (numpy array.) – Omega matrix containing the log probabilities for each time step and state.
prev (numpy array.) – Backtracking pointer matrix from the Viterbi algorithm.
- Returns:
Optimal Viterbi path as an array of state indices.
- Return type:
numpy array.
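The backtracking step can be illustrated with a minimal NumPy sketch. It is not the iTRAILS implementation: the function name `backtrack` and the assumed matrix layout (`omega[t, k]` holds the best log probability of a path ending in state `k` at step `t`, and `prev[t, k]` holds the state occupied at step `t - 1` on that path) are assumptions made for illustration.

```python
import numpy as np

def backtrack(omega, prev):
    """Follow back-pointers from the best final state to recover the path.

    Assumed layout (not confirmed by the source): row 0 of prev is unused.
    """
    T = omega.shape[0]
    path = np.zeros(T, dtype=int)
    path[-1] = np.argmax(omega[-1])        # most probable final state
    for t in range(T - 1, 0, -1):
        path[t - 1] = prev[t, path[t]]     # step back one position
    return path

# Hand-made toy matrices for three time steps and two hidden states.
omega = np.log(np.array([[0.6, 0.4], [0.3, 0.2], [0.05, 0.15]]))
prev = np.array([[0, 0], [0, 0], [1, 0]])
path = backtrack(omega, prev)
```

Only the last row of `omega` is consulted here; every earlier state is read from `prev`.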
- itrails.optimizer.backward(a, b, V, order)
Performs the backward algorithm for Hidden Markov Models by computing the beta values in log space.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Beta matrix with log probabilities for each time step and hidden state.
- Return type:
numpy array.
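A log-space backward recursion might look like the following sketch. It is a generic textbook version rather than the library's code, and it assumes that `order[v]` lists the columns of `b` compatible with observation `v`, so that a missing observation sums over several columns. The names `log_emissions` and `backward_log` are hypothetical.

```python
import numpy as np

def log_emissions(b, V, order):
    # Assumption: order[v] holds the columns of b compatible with observation v.
    return np.log(np.array([b[:, order[v]].sum(axis=1) for v in V]))

def backward_log(a, b, V, order):
    log_a = np.log(a)
    log_e = log_emissions(b, V, order)     # shape (T, K)
    T, K = log_e.shape
    beta = np.zeros((T, K))                # log(1) = 0 at the last step
    for t in range(T - 2, -1, -1):
        # scores[i, j] = log a[i, j] + log e[t+1, j] + beta[t+1, j]
        scores = log_a + log_e[t + 1] + beta[t + 1]
        m = scores.max(axis=1, keepdims=True)
        beta[t] = m[:, 0] + np.log(np.exp(scores - m).sum(axis=1))
    return beta

a = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([[0.7, 0.3], [0.1, 0.9]])
order = [np.array([0]), np.array([1]), np.array([0, 1])]  # index 2 = missing
V = np.array([0, 2, 1])
beta = backward_log(a, b, V, order)
```

Subtracting the row maximum before exponentiating keeps the sum away from underflow on long sequences.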
- itrails.optimizer.forward(a, b, pi, V, order)
Executes the forward algorithm for Hidden Markov Models allowing for missing data by computing the log-scaled alpha values.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Alpha matrix with log probabilities for each time step and hidden state.
- Return type:
numpy array.
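As a rough illustration of a log-space forward pass that tolerates missing data, consider the sketch below. It is not the iTRAILS code. It assumes that `order[v]` lists the columns of `b` compatible with observation `v`, so a missing observation contributes an emission probability of 1. The name `forward_log` is hypothetical.

```python
import numpy as np

def forward_log(a, b, pi, V, order):
    log_a = np.log(a)
    # Assumption: order[v] holds the columns of b compatible with observation v.
    log_e = np.log(np.array([b[:, order[v]].sum(axis=1) for v in V]))
    T, K = log_e.shape
    alpha = np.empty((T, K))
    alpha[0] = np.log(pi) + log_e[0]
    for t in range(1, T):
        # scores[i, j] = alpha[t-1, i] + log a[i, j]
        scores = alpha[t - 1][:, None] + log_a
        m = scores.max(axis=0)
        alpha[t] = log_e[t] + m + np.log(np.exp(scores - m).sum(axis=0))
    return alpha

a = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([[0.7, 0.3], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
order = [np.array([0]), np.array([1]), np.array([0, 1])]  # index 2 = missing
V = np.array([0, 2, 1])
alpha = forward_log(a, b, pi, V, order)
```

In this toy input the second observation is the missing symbol, so that step only propagates the transition probabilities.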
- itrails.optimizer.forward_loglik(a, b, pi, V, order)
Computes the log-likelihood for a given observed state sequence by running the forward algorithm and applying log-sum-exp for numerical stability.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Log-likelihood value.
- Return type:
float.
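The role of log-sum-exp can be seen in a small sketch. The values in `last_alpha` are made up to resemble the final row of a log alpha matrix for a long sequence, and the helper name `logsumexp` is an assumption, not a function exported by the module.

```python
import numpy as np

def logsumexp(x):
    # Shift by the maximum so the largest term becomes exp(0) = 1.
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

# Hypothetical final-row log alphas, far too small to exponentiate directly.
last_alpha = np.array([-1050.0, -1051.0, -1049.5])

with np.errstate(divide="ignore"):
    naive = np.log(np.sum(np.exp(last_alpha)))   # every term underflows to 0

stable = logsumexp(last_alpha)
```

The naive computation returns negative infinity, while the shifted version returns a finite log-likelihood.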
- itrails.optimizer.forward_loglik_par(a, b, pi, V, order)
Computes the log-likelihood in parallel by converting the provided order into a List and then calling forward_loglik.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states represented as integer indices.
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Log-likelihood value.
- Return type:
float.
- itrails.optimizer.loglik_wrapper(a, b, pi, V_lst)
Sequential log-likelihood wrapper that builds an order list using get_idx_state, iteratively computes the log-likelihood for each observed state vector in V_lst, and returns the total sum.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).
- Returns:
Sum of log-likelihood values over all observed sequences in V_lst.
- Return type:
float.
- itrails.optimizer.loglik_wrapper_new_method(a, b, pi, V_lst)
Sequential log-likelihood wrapper that builds an order list using get_idx_state_new_method, computes the log-likelihood for each observed state vector in V_lst sequentially, and returns the total sum.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors as numpy arrays of integer indices.
- Returns:
Total log-likelihood value summed over all observed sequences in V_lst.
- Return type:
float.
- itrails.optimizer.loglik_wrapper_par(a, b, pi, V_lst)
Parallel log-likelihood wrapper that builds an order list using get_idx_state, then computes the log-likelihood for each observed state vector in V_lst in parallel using joblib, and returns the sum of the results.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities for the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors, where each vector is a numpy array of integer indices.
- Returns:
Sum of log-likelihood values over all observed sequences in V_lst.
- Return type:
float.
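The pattern of computing one log-likelihood per sequence in parallel and summing the results is sketched below. To stay free of third-party dependencies, this example swaps joblib for the standard library's `concurrent.futures.ThreadPoolExecutor`. The function `seq_loglik` is a hypothetical stand-in for `forward_loglik` that uses a scaled forward pass and ignores `order`.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def seq_loglik(a, b, pi, V):
    # Compact scaled forward pass; a stand-in for forward_loglik in this sketch.
    alpha = pi * b[:, V[0]]
    loglik = 0.0
    for v in V[1:]:
        c = alpha.sum()
        loglik += np.log(c)                 # accumulate the scaling factors
        alpha = ((alpha / c) @ a) * b[:, v]
    return loglik + np.log(alpha.sum())

def total_loglik_parallel(a, b, pi, V_lst, workers=2):
    # Sequences are independent, so each one can be scored by a separate worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda V: seq_loglik(a, b, pi, V), V_lst)
        return sum(parts)

a = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([[0.7, 0.3], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
V_lst = [np.array([0, 1]), np.array([1, 1, 0]), np.array([0, 0, 0, 1])]

total_par = total_loglik_parallel(a, b, pi, V_lst)
total_seq = sum(seq_loglik(a, b, pi, V) for V in V_lst)
```

Because the sequences do not share state, the parallel and sequential totals agree up to floating-point rounding.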
- itrails.optimizer.loglik_wrapper_par_new_method(a, b, pi, V_lst)
Parallel log-likelihood wrapper using joblib that builds an order list via get_idx_state_new_method, computes the log-likelihood for each observed state vector in V_lst in parallel, and returns the sum of the computed values.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities for the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors as numpy arrays of integer indices.
- Returns:
Total log-likelihood value summed over all observed sequences in V_lst.
- Return type:
float.
- itrails.optimizer.optimization_wrapper(arg_lst, optimized_params, case, d, V_lst, res_name, info)
Objective function for the optimizer. Each call updates a copy of the fixed parameter dictionary d with the current values in arg_lst, then computes additional derived parameters according to case, which determines how the time parameters are combined. It calculates the transition, emission, and initial state probabilities via trans_emiss_calc and evaluates the log-likelihood of the observed data V_lst with either the parallel or the sequential log-likelihood wrapper, depending on the available CPUs. Finally, it logs the evaluation to the optimization history CSV file, updates the best model if the current log-likelihood improves on the previous best, increments the evaluation count, and returns the negative log-likelihood (to be minimized).
- Parameters:
arg_lst (numpy array.) – Array of parameter values to be optimized that will update the fixed parameter dictionary.
optimized_params (list[str].) – List of parameter names (keys in d) that are subject to optimization.
case (frozenset.) – A frozenset specifying the combination of time parameters provided, e.g. frozenset(["t_A", "t_B", "t_C"]) or frozenset(["t_1", "t_A"]).
d (dict.) – Dictionary of fixed parameter values.
V_lst (list[np.ndarray].) – List of observed state arrays, where each array is a numpy array of integer indices representing observed states.
res_name (str.) – Directory path where result files (e.g., best_model.yaml and optimization_history.csv) will be saved.
info (dict.) – Dictionary containing optimization metadata, including “Nfeval” (the number of evaluations so far) and “time” (the start time of the optimization run).
- Returns:
Negative log-likelihood value (to be minimized by the optimizer).
- Return type:
float.
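The general shape of such an objective function is sketched below in plain Python. Everything specific in it is hypothetical: the parameter names `mu` and `sigma`, the toy score standing in for the HMM log-likelihood, and the `best` and `best_params` keys, which replace the file logging described above.

```python
import math

def objective(arg_lst, optimized_params, d, info):
    params = dict(d)                                # copy; the fixed dict stays untouched
    params.update(zip(optimized_params, arg_lst))   # overwrite the optimized entries
    # Hypothetical stand-in for the HMM log-likelihood: peaks at mu = 2.0.
    loglik = -0.5 * (params["mu"] - 2.0) ** 2 / params["sigma"] ** 2
    info["Nfeval"] += 1                             # count this evaluation
    if loglik > info["best"]:                       # keep the best model seen so far
        info["best"] = loglik
        info["best_params"] = params
    return -loglik                                  # minimizers expect a cost

d = {"mu": 0.0, "sigma": 1.0}
info = {"Nfeval": 0, "best": -math.inf, "best_params": None}
val_far = objective([0.0], ["mu"], d, info)
val_near = objective([1.5], ["mu"], d, info)
```

Returning the negated log-likelihood lets a general-purpose minimizer perform maximum-likelihood estimation.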
- itrails.optimizer.optimizer(optim_variables, optim_list, bounds, fixed_params, V_lst, res_name, case, n_iter, method='Nelder-Mead', header=True)
Optimization function. Runs the minimizer (method defaults to 'Nelder-Mead') and updates the saved results on each iteration.
- Parameters:
optim_params (dict) – Dictionary containing the initial values for the parameters to be optimized, and their optimization bounds. The structure of the dictionary should be as follows: dct['variable'] = [initial_value, lower_bound, upper_bound]. The dictionary must contain either 6 entries (t_1, t_2, t_upper, N_AB, N_ABC, r) or 9 entries (t_A, t_B, t_C, t_2, t_upper, t_out, N_AB, N_ABC, r) in that specific order.
fixed_params (dict) – Dictionary containing the values for the fixed parameters. The dictionary must include the entries n_int_AB and n_int_ABC (in no particular order).
V_lst (list) – List of numpy arrays of integers corresponding to the observed states.
res_name (str) – File path and name where the results should be saved (in CSV format).
- Returns:
None. This function updates the results on each iteration of the minimizer.
- Return type:
None
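A bounded Nelder-Mead run over a dictionary in the `[initial_value, lower_bound, upper_bound]` layout could be set up as in this sketch. It calls `scipy.optimize.minimize` directly and assumes SciPy 1.7 or later, where Nelder-Mead accepts `bounds`. The parameter names `p` and `q` and the quadratic objective are hypothetical stand-ins, not the iTRAILS model.

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(x):
    # Hypothetical objective with its minimum at p = 2.0, q = -1.0.
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2

# One entry per parameter: [initial_value, lower_bound, upper_bound].
optim_params = {"p": [0.5, 0.0, 5.0], "q": [0.0, -3.0, 3.0]}
x0 = np.array([v[0] for v in optim_params.values()])
bounds = [(v[1], v[2]) for v in optim_params.values()]

res = minimize(
    neg_loglik,
    x0,
    method="Nelder-Mead",
    bounds=bounds,
    options={"maxiter": 500, "xatol": 1e-8, "fatol": 1e-8},
)
estimates = dict(zip(optim_params, res.x))
```

Nelder-Mead is derivative-free, which suits objectives such as an HMM log-likelihood whose gradient is not readily available.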
- itrails.optimizer.post_prob(a, b, pi, V, order)
Computes the posterior probabilities of the hidden states for a given observed sequence by combining the forward and backward algorithms and normalizing the result.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Posterior probability matrix with probabilities for each time step and hidden state.
- Return type:
numpy array.
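Combining the forward and backward passes into posterior probabilities can be sketched as follows. This is a generic forward-backward implementation under stated assumptions, not the library's code: `order[v]` is taken to list the columns of `b` compatible with observation `v`, and the names `lse` and `posterior` are hypothetical.

```python
import numpy as np

def lse(x, axis):
    # Log-sum-exp along one axis, shifted by the maximum for stability.
    m = x.max(axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.exp(x - m).sum(axis=axis))

def posterior(a, b, pi, V, order):
    log_a = np.log(a)
    log_e = np.log(np.array([b[:, order[v]].sum(axis=1) for v in V]))
    T, K = log_e.shape
    alpha = np.empty((T, K))
    beta = np.zeros((T, K))
    alpha[0] = np.log(pi) + log_e[0]
    for t in range(1, T):                       # forward pass
        alpha[t] = log_e[t] + lse(alpha[t - 1][:, None] + log_a, axis=0)
    for t in range(T - 2, -1, -1):              # backward pass
        beta[t] = lse(log_a + log_e[t + 1] + beta[t + 1], axis=1)
    log_post = alpha + beta
    log_post -= lse(log_post, axis=1)[:, None]  # normalise each time step
    return np.exp(log_post)

a = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([[0.7, 0.3], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
order = [np.array([0]), np.array([1]), np.array([0, 1])]  # index 2 = missing
V = np.array([0, 2, 1])
post = posterior(a, b, pi, V, order)
```

Each row of `post` sums to 1, so it can be read as a distribution over hidden states at that position.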
- itrails.optimizer.post_prob_wrapper(a, b, pi, V_lst)
Wrapper function that computes the posterior probabilities for a list of observed state sequences by building an order list using get_idx_state and iteratively applying post_prob to each sequence.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).
- Returns:
List of posterior probability matrices corresponding to each observed sequence in V_lst.
- Return type:
list.
- itrails.optimizer.viterbi(a, b, pi, V, order)
Computes the Viterbi path by performing the forward pass in log space with dynamic programming and returning both the omega matrix and the backtracking pointer matrix.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states (as integer indices).
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Tuple containing the omega matrix and the backtracking pointer matrix.
- Return type:
tuple(numpy array, numpy array).
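The dynamic-programming pass that fills the omega and pointer matrices can be sketched as below. This is a generic log-space Viterbi recursion, not the iTRAILS implementation. It assumes that `order[v]` lists the columns of `b` compatible with observation `v` and that `prev[t, j]` stores the best predecessor of state `j` at step `t`. The name `viterbi_log` is hypothetical.

```python
import numpy as np

def viterbi_log(a, b, pi, V, order):
    log_a = np.log(a)
    log_e = np.log(np.array([b[:, order[v]].sum(axis=1) for v in V]))
    T, K = log_e.shape
    omega = np.empty((T, K))
    prev = np.zeros((T, K), dtype=int)       # row 0 is unused
    omega[0] = np.log(pi) + log_e[0]
    for t in range(1, T):
        # scores[i, j]: best path ending in i at t-1, then moving to j
        scores = omega[t - 1][:, None] + log_a
        prev[t] = scores.argmax(axis=0)      # remember the best predecessor
        omega[t] = log_e[t] + scores.max(axis=0)
    return omega, prev

a = np.array([[0.9, 0.1], [0.2, 0.8]])
b = np.array([[0.7, 0.3], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
order = [np.array([0]), np.array([1]), np.array([0, 1])]  # index 2 = missing
V = np.array([0, 2, 1])
omega, prev = viterbi_log(a, b, pi, V, order)
```

Replacing the sum of the forward algorithm with a maximum is the only change needed to track the single best path instead of the total probability.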
- itrails.optimizer.viterbi_old(a, b, pi, V, order)
Computes the Viterbi path using an iterative approach in log space by performing dynamic programming over the given observed sequence.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V (numpy array.) – Vector of observed states.
order (list.) – List of indices mapping observed states to emission probabilities.
- Returns:
Viterbi path as an array of state indices.
- Return type:
numpy array.
- itrails.optimizer.viterbi_wrapper(a, b, pi, V_lst)
Wrapper for the Viterbi algorithm that builds an order list using get_idx_state, applies the viterbi and backtrack_viterbi functions to each observed state vector in V_lst, and returns a list of Viterbi paths.
- Parameters:
a (numpy array.) – Transition probability matrix.
b (numpy array.) – Emission probability matrix.
pi (numpy array.) – Vector of starting probabilities of the hidden states.
V_lst (list[np.ndarray].) – List of observed state vectors (as numpy arrays of integer indices).
- Returns:
List of Viterbi paths corresponding to each observed sequence in V_lst.
- Return type:
list.