lunapi.gpa
lunapi.gpa — Python interface to Luna’s GPA association-analysis commands.
Two underlying Luna commands are exposed:
- --gpa-prep
build a binary data matrix from tabular input files
- --gpa
run linear association models on that matrix
Both are invoked in-process via the lunapi0 C++ bindings (no subprocess).
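Functions
- gpa_prep()
Run --gpa-prep to build a binary GPA data matrix.
- gpa_manifest()
Return the variable manifest for a .dat file as a DataFrame.
- gpa_run()
Run GPA association analysis against a pre-built .dat file.
- gpa_dump()
Dump the raw data matrix from a .dat file as a DataFrame.
- gpa_get_xy_partial()
Return (ids, x_resid, y_resid) after regressing zvars out of both axes.
- gpa_get_xy()
Return (ids, x_vals, y_vals) from the cached GPA analysis matrix.
- Release the cached GPA analysis matrix to free memory.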
Module Contents
- lunapi.gpa.gpa_prep(dat_path: str, specs: List[dict] | None = None, specs_path: str | None = None) -> str
Run --gpa-prep to build a binary GPA data matrix.
Exactly one of specs or specs_path should be supplied.
- Parameters:
dat_path (str) – Output path for the binary .dat file.
specs (list[dict] or None) – Structured input-file specification list. Each dict may contain file, group, vars, facs, fixed, mappings. The list is serialised to a temporary JSON file and passed as specs=<tmpfile> to --gpa-prep.
specs_path (str or None) – Path to an existing JSON specs file.
- Returns:
Manifest text captured from stdout (tab-delimited, same columns as gpa_manifest() output). Empty if no manifest was produced.
- Return type:
str
- Raises:
RuntimeError – Propagated from Helper::halt() inside the Luna C++ library.
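The specs parameter is described as being serialised to a temporary JSON file and passed as specs=<tmpfile>. The following is a minimal sketch of that step using only the standard library. The helper name write_specs_tmp and the example values are hypothetical, and the actual lunapi implementation may differ.

```python
import json
import os
import tempfile
from typing import List


def write_specs_tmp(specs: List[dict]) -> str:
    """Write specs to a temporary JSON file and return its path (hypothetical helper)."""
    # delete=False keeps the file on disk after the 'with' block closes it.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
        json.dump(specs, fh)
        return fh.name


# Invented example values; only the key names come from the documentation above.
specs = [{"file": "psd.txt", "group": "PSD"}]

path = write_specs_tmp(specs)
arg = f"specs={path}"  # the argument form the text describes for --gpa-prep

# Read the file back to confirm that it holds the same structure.
with open(path) as fh:
    roundtrip = json.load(fh)
os.remove(path)  # clean up the temporary file
```

Callers who already keep a specs file on disk can skip this step and pass specs_path instead.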
- lunapi.gpa.gpa_manifest(dat_path: str) -> pandas.DataFrame
Return the variable manifest for a .dat file as a DataFrame.
Runs --gpa manifest and parses the tab-delimited stdout.
Columns always include NV, VAR, NI, GRP, BASE, plus any factor columns present in the dataset (e.g. CH, F, SS).
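Since gpa_prep() returns the same manifest as raw tab-delimited text, it can be useful to see how such text splits into the fixed columns and the dataset-specific factor columns. The sketch below uses only the standard library. The manifest rows are invented for illustration, and only the column names come from the description above.

```python
import csv
import io

# Invented manifest text; real output comes from gpa_prep() or --gpa manifest.
manifest_text = (
    "NV\tVAR\tNI\tGRP\tBASE\tCH\n"
    "1\tPSD_CH_CZ\t120\tPSD\tPSD\tCZ\n"
    "2\tPSD_CH_FZ\t118\tPSD\tPSD\tFZ\n"
)

reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
rows = list(reader)                # one dict per variable, values as strings
columns = reader.fieldnames or []  # fieldnames is None for empty input

# Columns the documentation says are always present.
required = ["NV", "VAR", "NI", "GRP", "BASE"]
factor_columns = [c for c in columns if c not in required]
print(factor_columns)  # ['CH']
```

To get a DataFrame instead, the same text can be passed to pandas.read_csv(io.StringIO(manifest_text), sep="\t").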
- lunapi.gpa.gpa_run(dat_path: str, X: str | List[str] | None = None, Y: str | List[str] | None = None, Z: str | List[str] | None = None, Xg: str | List[str] | None = None, Yg: str | List[str] | None = None, Zg: str | List[str] | None = None, mode: str = 'assoc', nreps: int = 0, fdr: bool = True, bonf: bool = False, holm: bool = False, fdr_by: bool = False, adj_all_x: bool = False, x_factors: bool = False, p: float | None = None, padj: float | None = None, vars: str | None = None, xvars: str | None = None, grps: str | None = None, xgrps: str | None = None, facs: str | None = None, xfacs: str | None = None, faclvls: str | None = None, xfaclvls: str | None = None, n_prop: float | None = None, n_req: int | None = None, knn: int | None = None, winsor: float | None = None, subset: str | None = None, inc_ids: str | None = None, ex_ids: str | None = None, verbose: bool = False) -> Dict[str, pandas.DataFrame]
Run GPA association analysis against a pre-built .dat file.
- Parameters:
dat_path (str) – Binary data file created by gpa_prep().
X, Y, Z (str | list[str] | None) – Predictor, outcome, and covariate variable names, respectively. Lists are joined with commas and passed as a single argument, e.g. X=a,b,c.
Xg, Yg, Zg (str | list[str] | None) – Group-based variable selection (predictor, outcome, and covariate groups, respectively).
mode ("assoc" | "stats" | "comp") – Analysis mode:
"assoc": linear association models (default)
"stats": descriptive statistics only
"comp": comparison-style enrichment tests
nreps (int) – Permutation replicates (0 = asymptotic p-values only).
fdr (bool) – Apply FDR (Benjamini-Hochberg) correction (default True; pass fdr=False to disable).
bonf, holm, fdr_by (bool) – Additional multiple-testing corrections to add to the output.
adj_all_x (bool) – Adjust p-values across all X variables jointly instead of per-X.
x_factors (bool) – Append X-variable manifest columns (XBASE, XGROUP, XSTRAT) to output.
p, padj (float | None) – Only return rows below this nominal (p) or adjusted (padj) significance threshold.
vars, xvars (str | None) – Explicit variable include (vars) and exclude (xvars) lists (comma-separated).
grps, xgrps (str | None) – Group include (grps) and exclude (xgrps) lists.
facs, xfacs (str | None) – Factor include (facs) and exclude (xfacs) lists.
faclvls, xfaclvls (str | None) – Factor-level include (faclvls) and exclude (xfaclvls) filters (CH/FZ|CZ syntax).
n_prop (float | None) – Drop columns with more than this proportion of missing values.
n_req (int | None) – Drop columns with fewer than this many non-missing values.
knn (int | None) – k for kNN imputation of missing values.
winsor (float | None) – Winsorisation proportion applied before modelling.
subset (str | None) – Include only subjects positive for these variables (+VAR syntax).
inc_ids, ex_ids (str | None) – Comma-separated subject ID include (inc_ids) and exclude (ex_ids) lists.
verbose (bool)
- Returns:
Keys follow the "GPA: STRATA" convention, e.g.:
"GPA: X,Y": main association results
"GPA: VAR": descriptive statistics (mode="stats")
"GPA: X": comparison test results (mode="comp")
- Return type:
dict[str, pd.DataFrame]
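The fdr and bonf options name standard multiple-testing corrections. As a reminder of what those adjustments compute, here is a minimal pure-Python sketch of Benjamini-Hochberg and Bonferroni adjusted p-values. It illustrates the textbook procedures only. It is not Luna's implementation, and the function names and p-values are made up for the example.

```python
from typing import List


def bh_adjust(pvals: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni_adjust(pvals: List[float]) -> List[float]:
    """Bonferroni adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


# Invented p-values for illustration.
pvals = [0.01, 0.04, 0.03, 0.20]
bh = bh_adjust(pvals)
bonf = bonferroni_adjust(pvals)
```

Benjamini-Hochberg controls the false discovery rate and is never more conservative than Bonferroni, which controls the family-wise error rate, so each value in bh is no larger than the matching value in bonf.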
- lunapi.gpa.gpa_dump(dat_path: str, **filter_opts) -> pandas.DataFrame
Dump the raw data matrix from a .dat file as a DataFrame.
Any keyword argument is forwarded as a Luna parameter string, e.g. X="male", lvars="PSD_CH_CZ_F_13.5".
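To make the keyword forwarding concrete, the sketch below renders keyword arguments as KEY=value tokens, which is the usual shape of Luna command parameters. The helper to_param_tokens is hypothetical, and how lunapi actually assembles the parameter string is an assumption here.

```python
from typing import List


def to_param_tokens(**filter_opts) -> List[str]:
    """Render keyword arguments as KEY=value tokens (hypothetical helper)."""
    # Keyword order is preserved, so tokens appear in the order they were passed.
    return [f"{key}={value}" for key, value in filter_opts.items()]


tokens = to_param_tokens(X="male", lvars="PSD_CH_CZ_F_13.5")
print(tokens)  # ['X=male', 'lvars=PSD_CH_CZ_F_13.5']
```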
- lunapi.gpa.gpa_get_xy_partial(xvar: str, yvar: str, zvars: List[str])
Return (ids, x_resid, y_resid) after regressing zvars out of both axes.
Uses the same Rz = I - Z(Z'Z)^{-1}Z' projection as the GPA linear model, so the residual scatter exactly matches what went into the regression. Falls back to gpa_get_xy() when zvars is empty.
Raises RuntimeError if no matrix is cached (call gpa_run() first).
- lunapi.gpa.gpa_get_xy(xvar: str, yvar: str)
Return (ids, x_vals, y_vals) from the cached GPA analysis matrix.
Filters to rows where both xvar and yvar are non-NaN, i.e. the exact same subjects used in the most recent gpa_run() call.
Raises RuntimeError if no matrix is cached (call gpa_run() first).
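As a rough end-to-end illustration, the functions documented on this page might be chained as follows. This is a sketch under assumptions: it needs lunapi installed and real input files to actually run, the wrapper name run_gpa_workflow is hypothetical, and the covariate name "age" is a placeholder to replace with names from your own manifest.

```python
def run_gpa_workflow(dat_path: str, specs_path: str):
    """Prep, inspect, analyse, then pull one X/Y pair (requires lunapi)."""
    # Imported lazily so that defining this function does not require lunapi.
    from lunapi.gpa import gpa_get_xy, gpa_manifest, gpa_prep, gpa_run

    # Build the binary matrix from an existing JSON specs file.
    gpa_prep(dat_path, specs_path=specs_path)

    # Inspect which variables and factor columns are available.
    manifest = gpa_manifest(dat_path)

    # Placeholder variable names; lists are joined with commas by gpa_run().
    results = gpa_run(
        dat_path,
        X="male",
        Y=["PSD_CH_CZ_F_13.5"],
        Z=["age"],
        nreps=1000,
    )
    assoc = results.get("GPA: X,Y")  # main association table, if present

    # Works because gpa_run() has just cached the analysis matrix.
    ids, x_vals, y_vals = gpa_get_xy("male", "PSD_CH_CZ_F_13.5")
    return manifest, assoc, (ids, x_vals, y_vals)
```

Calling gpa_get_xy() only after gpa_run() follows the documented requirement that a matrix be cached first; otherwise a RuntimeError is raised.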