STAVAG.keep_variant_genes

STAVAG.keep_variant_genes(coord_dict_raw, coord_dict_rand, n_dim, threshold=0.05, keys=None)

Filter genes whose observed importance exceeds a random baseline.

For each axis, this keeps the rows whose Importance value is greater than a high percentile of that axis's random importance distribution. The threshold parameter sets how high that percentile is.

Parameters:
  • coord_dict_raw (Dict[str, DataFrame]) – Dict mapping each axis to a DataFrame of observed importance values.

  • coord_dict_rand (Dict[str, ndarray]) – Dict mapping each axis to an array of importance values from the random baseline.

  • n_dim (int) – Number of coordinate dimensions.

  • threshold (float) – Significance level. For example, 0.05 targets the top 5% tail of the random distribution.

  • keys (Sequence[str] | None) – Optional explicit axis names.

Returns:

Filtered dict with the same structure as coord_dict_raw.

Return type:

Dict[str, DataFrame]
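
Example:

The filtering rule described above can be illustrated with a minimal sketch. This is not the STAVAG implementation: it assumes the cutoff is the (1 - threshold) quantile of each axis's random importances, and that, when keys is None, the first n_dim keys of coord_dict_raw are used. The function name keep_variant_genes_sketch, the axis names "x" and "y", the Gene column, and all values are hypothetical; only the Importance column name comes from the description above.

```python
from typing import Dict, Optional, Sequence

import numpy as np
import pandas as pd


def keep_variant_genes_sketch(
    coord_dict_raw: Dict[str, pd.DataFrame],
    coord_dict_rand: Dict[str, np.ndarray],
    n_dim: int,
    threshold: float = 0.05,
    keys: Optional[Sequence[str]] = None,
) -> Dict[str, pd.DataFrame]:
    """Hypothetical re-implementation of the documented filtering rule."""
    # Assumption: without explicit keys, use the first n_dim axes of coord_dict_raw.
    axes = list(keys) if keys is not None else list(coord_dict_raw)[:n_dim]
    filtered = {}
    for axis in axes:
        # Assumption: the cutoff is the (1 - threshold) quantile of the
        # random importances for this axis.
        rand_values = np.asarray(coord_dict_rand[axis]).ravel()
        cutoff = np.quantile(rand_values, 1.0 - threshold)
        df = coord_dict_raw[axis]
        # Keep only rows whose observed importance exceeds the cutoff.
        filtered[axis] = df[df["Importance"] > cutoff]
    return filtered


# Toy inputs (made-up values, for illustration only).
raw = {
    "x": pd.DataFrame({"Gene": ["g1", "g2", "g3"], "Importance": [0.90, 0.10, 0.50]}),
    "y": pd.DataFrame({"Gene": ["g1", "g2", "g3"], "Importance": [0.05, 0.80, 0.20]}),
}
rand = {"x": np.linspace(0.0, 0.4, 101), "y": np.linspace(0.0, 0.4, 101)}

kept = keep_variant_genes_sketch(raw, rand, n_dim=2, threshold=0.05)
```

With these toy inputs the random baseline spans 0 to 0.4, so the cutoff on each axis is about 0.38. Genes g1 and g3 pass on axis "x", and only g2 passes on axis "y". The returned dict has the same keys and DataFrame columns as the input, with fewer rows.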