lunapi.gpa

lunapi.gpa — Python interface to Luna’s GPA association-analysis commands.

Two underlying Luna commands are exposed:

--gpa-prep
    build a binary data matrix from tabular input files

--gpa
    run linear association models on that matrix

Both are invoked in-process via the lunapi0 C++ bindings (no subprocess).

Functions

`gpa_prep`(→ str)	Run `--gpa-prep` to build a binary GPA data matrix.
`gpa_manifest`(→ pandas.DataFrame)	Return the variable manifest for a `.dat` file as a DataFrame.
`gpa_run`(→ Dict[str, pandas.DataFrame])	Run GPA association analysis against a pre-built `.dat` file.
`gpa_dump`(→ pandas.DataFrame)	Dump the raw data matrix from a `.dat` file as a DataFrame.
`gpa_get_xy_partial`(xvar, yvar, zvars)	Return (ids, x_resid, y_resid) after regressing zvars out of both axes.
`gpa_get_xy`(xvar, yvar)	Return (ids, x_vals, y_vals) from the cached GPA analysis matrix.
`gpa_clear_cache`()	Release the cached GPA analysis matrix to free memory.

Module Contents

lunapi.gpa.gpa_prep(dat_path: str, specs: List[dict] | None = None, specs_path: str | None = None) → str

Run --gpa-prep to build a binary GPA data matrix.

Exactly one of specs or specs_path should be supplied.

Parameters:

dat_path (str) – Output path for the binary .dat file.
specs (list[dict] or None) – Structured input-file specification list. Each dict may contain file, group, vars, facs, fixed, mappings. The list is serialised to a temporary JSON file and passed as specs=<tmpfile> to --gpa-prep.
specs_path (str or None) – Path to an existing JSON specs file.

Returns:

Manifest text captured from stdout (tab-delimited, same columns as gpa_manifest() output). Empty if no manifest was produced.

Return type:

str

Raises:

RuntimeError – Propagated from Helper::halt() inside the Luna C++ library.
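
The handling of specs and specs_path described above can be sketched roughly as follows. This is an illustration, not the library's code: the helper name resolve_specs, the ValueError it raises, and the values in example_specs (file name, group label, variable and factor names, and the use of lists for vars and facs) are all assumptions. Only the dictionary keys and the temporary-JSON-file behaviour come from the description.

```python
import json
import os
import tempfile
from typing import List, Optional


def resolve_specs(specs: Optional[List[dict]] = None,
                  specs_path: Optional[str] = None) -> str:
    """Return the path of a JSON specs file (hypothetical helper).

    If a list of dicts is given, it is serialised to a temporary JSON file,
    mirroring how the documentation says `specs` is passed to --gpa-prep.
    """
    # Assumption: treat "exactly one should be supplied" as a hard check.
    if (specs is None) == (specs_path is None):
        raise ValueError("supply exactly one of specs or specs_path")
    if specs_path is not None:
        return specs_path
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
        json.dump(specs, fh)
        return fh.name


# Made-up specification: only the key names come from the documentation.
example_specs = [
    {"file": "psd.txt", "group": "PSD", "vars": ["PSD"], "facs": ["CH", "F"]},
]

tmp_path = resolve_specs(specs=example_specs)
with open(tmp_path) as fh:
    round_trip = json.load(fh)  # what a reader of the temp file would see
os.remove(tmp_path)
```

With lunapi installed, the same list would presumably be passed directly, as in gpa_prep("out.dat", specs=example_specs), where "out.dat" is a placeholder path.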

lunapi.gpa.gpa_manifest(dat_path: str) → pandas.DataFrame

Return the variable manifest for a .dat file as a DataFrame.

Runs --gpa manifest and parses the tab-delimited stdout.

Columns always include NV, VAR, NI, GRP, BASE, plus any factor columns present in the dataset (e.g. CH, F, SS).
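
To give a feel for the tab-delimited manifest text (the same layout gpa_prep() returns as a string), here is a small standard-library sketch. gpa_manifest() itself returns a pandas DataFrame. The function parse_manifest and every value in manifest_text are invented for illustration, and only the column names are taken from the description above.

```python
import csv
import io

# Invented manifest text: the column names are documented, the rows are not.
manifest_text = (
    "NV\tVAR\tNI\tGRP\tBASE\n"
    "1\tage\t100\tdemo\tage\n"
    "2\tmale\t100\tdemo\tmale\n"
)


def parse_manifest(text):
    """Parse tab-delimited manifest text into a list of row dicts.

    Empty text (no manifest produced) yields an empty list.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


rows = parse_manifest(manifest_text)
var_names = [row["VAR"] for row in rows]
```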

lunapi.gpa.gpa_run(dat_path, ...) → Dict[str, pandas.DataFrame]

Run GPA association analysis against a pre-built .dat file.

Parameters:

dat_path (str) – Binary data file created by gpa_prep().
X (str | list[str] | None) – Predictor variable names. A list is joined with commas and passed as a single argument, e.g. X=a,b,c.
Y (str | list[str] | None) – Outcome variable names, passed the same way (Y=a,b,c).
Z (str | list[str] | None) – Covariate variable names, passed the same way (Z=a,b,c).
Xg (str | list[str] | None) – Select predictors by variable group rather than by name.
Yg (str | list[str] | None) – Select outcomes by variable group rather than by name.
Zg (str | list[str] | None) – Select covariates by variable group rather than by name.
mode ("assoc" | "stats" | "comp") –
- "assoc" — linear association models (default)
- "stats" — descriptive statistics only
- "comp" — comparison-style enrichment tests
nreps (int) – Permutation replicates (0 = asymptotic p-values only).
fdr (bool) – Apply Benjamini-Hochberg FDR correction (default True; pass fdr=False to disable).
bonf (bool) – Add Bonferroni-corrected p-values to the output.
holm (bool) – Add Holm-corrected p-values to the output.
fdr_by (bool) – Add Benjamini-Yekutieli FDR-corrected p-values to the output.
adj_all_x (bool) – Adjust p-values across all X variables jointly instead of per-X.
x_factors (bool) – Append X-variable manifest columns (XBASE, XGROUP, XSTRAT) to output.
p (float | None) – Only return rows whose nominal p-value is below this threshold.
padj (float | None) – Only return rows whose adjusted p-value is below this threshold.
vars (str | None) – Variables to include (comma-separated).
xvars (str | None) – Variables to exclude (comma-separated).
grps (str | None) – Groups to include.
xgrps (str | None) – Groups to exclude.
facs (str | None) – Factors to include.
xfacs (str | None) – Factors to exclude.
faclvls (str | None) – Factor levels to include (CH/FZ|CZ syntax).
xfaclvls (str | None) – Factor levels to exclude (CH/FZ|CZ syntax).
n_prop (float | None) – Drop columns with more than this proportion of missing values.
n_req (int | None) – Drop columns with fewer than this many non-missing values.
knn (int | None) – k for kNN imputation of missing values.
winsor (float | None) – Winsorisation proportion applied before modelling.
subset (str | None) – Include only subjects positive for these variables (+VAR syntax).
inc_ids (str | None) – Comma-separated list of subject IDs to include.
ex_ids (str | None) – Comma-separated list of subject IDs to exclude.
verbose (bool)
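
The rule that lists are joined with commas and passed as a single NAME=a,b,c argument could look roughly like this. The helper format_var_args is hypothetical and is not part of lunapi. Skipping None values is an assumption about how unset parameters are treated. The variable names are placeholders.

```python
from typing import List, Optional, Union

VarSpec = Optional[Union[str, List[str]]]


def format_var_args(**selections: VarSpec) -> List[str]:
    """Turn keyword selections into NAME=value strings (hypothetical helper).

    Lists are joined with commas; None values are skipped (an assumption).
    """
    args = []
    for name, value in selections.items():
        if value is None:
            continue
        if not isinstance(value, str):
            value = ",".join(value)
        args.append(f"{name}={value}")
    return args


# Placeholder variable names, for illustration only.
args = format_var_args(X=["age", "male"], Y="PSD", Z=None)
```

Passing either a single string or a list therefore ends up as the same kind of argument, which is why X, Y and Z accept both.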

Returns:

Keys follow the "GPA: STRATA" convention, e.g.:
- "GPA: X,Y" – main association results
- "GPA: VAR" – descriptive statistics (mode="stats")
- "GPA: X" – comparison test results (mode="comp")

Return type:

dict[str, pd.DataFrame]
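
Because the returned dictionary is keyed by strata, a small lookup can make calling code less error-prone. The sketch below uses a plain stand-in dict so it runs without lunapi. The helper pick_table and the assumption that each mode maps to exactly the key listed above are illustrative, not guaranteed by the library.

```python
# Mode-to-key mapping taken from the "Returns" description above.
RESULT_KEYS = {"assoc": "GPA: X,Y", "stats": "GPA: VAR", "comp": "GPA: X"}


def pick_table(results, mode="assoc"):
    """Return the table expected for `mode`, with a readable error if absent."""
    key = RESULT_KEYS[mode]
    if key not in results:
        raise KeyError(f"no table {key!r}; available keys: {sorted(results)}")
    return results[key]


# Stand-in for the dict returned by gpa_run(); real values are DataFrames.
fake_results = {"GPA: X,Y": "association table placeholder"}
table = pick_table(fake_results, mode="assoc")
```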

lunapi.gpa.gpa_dump(dat_path: str, **filter_opts) → pandas.DataFrame

Dump the raw data matrix from a .dat file as a DataFrame.

Any keyword argument is forwarded as a Luna parameter string, e.g. X="male", lvars="PSD_CH_CZ_F_13.5".

lunapi.gpa.gpa_get_xy_partial(xvar: str, yvar: str, zvars: List[str])

Return (ids, x_resid, y_resid) after regressing zvars out of both axes.

Uses the same projection as the GPA linear model, Rz = I - Z(Z'Z)^{-1}Z', so the residual scatter exactly matches what went into the regression. Falls back to gpa_get_xy() when zvars is empty.

Raises RuntimeError if no matrix is cached (call gpa_run() first).
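
The projection Rz = I - Z(Z'Z)^{-1}Z' can be illustrated with NumPy on synthetic data. This is a generic sketch of residualisation, not the Luna implementation. The function residualize is hypothetical, and adding an intercept column to Z is an assumption about how the model is set up.

```python
import numpy as np


def residualize(x, y, Z, add_intercept=True):
    """Regress the columns of Z out of x and y via Rz = I - Z (Z'Z)^{-1} Z'."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(x), -1)
    if add_intercept:
        # Assumption: the covariate matrix includes a constant term.
        Z = np.column_stack([np.ones(len(x)), Z])
    # Solve (Z'Z) B = Z' instead of forming the inverse explicitly.
    Rz = np.eye(len(x)) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
    return Rz @ x, Rz @ y


# Synthetic data: x and y both depend on two covariates plus noise.
rng = np.random.default_rng(0)
z = rng.normal(size=(50, 2))
x = z @ np.array([1.0, -2.0]) + rng.normal(size=50)
y = z @ np.array([0.5, 0.5]) + rng.normal(size=50)

x_resid, y_resid = residualize(x, y, z)
```

After the projection, both residual vectors are orthogonal to every covariate column, so any remaining correlation between x_resid and y_resid is not attributable to the covariates.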

lunapi.gpa.gpa_get_xy(xvar: str, yvar: str)

Return (ids, x_vals, y_vals) from the cached GPA analysis matrix.

Filters to rows where both xvar and yvar are non-NaN — the exact same subjects used in the most recent gpa_run() call.

Raises RuntimeError if no matrix is cached (call gpa_run() first).
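
The pairwise non-NaN filtering described above amounts to the following, shown with plain Python lists and invented subject IDs. The helper complete_pairs is hypothetical; the real function reads from the cached matrix instead of taking the values as arguments.

```python
import math


def complete_pairs(ids, x_vals, y_vals):
    """Keep only subjects for which both x and y are non-NaN."""
    kept = [
        (i, x, y)
        for i, x, y in zip(ids, x_vals, y_vals)
        if not (math.isnan(x) or math.isnan(y))
    ]
    if not kept:
        return [], [], []
    out_ids, out_x, out_y = (list(col) for col in zip(*kept))
    return out_ids, out_x, out_y


# Invented example: s2 lacks x, s3 lacks y.
ids = ["s1", "s2", "s3", "s4"]
x_vals = [1.0, float("nan"), 3.0, 4.0]
y_vals = [0.5, 0.7, float("nan"), 0.9]

kept_ids, kept_x, kept_y = complete_pairs(ids, x_vals, y_vals)
```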

lunapi.gpa.gpa_clear_cache()

Release the cached GPA analysis matrix to free memory.
