Pyxis Lipidomics: Biological Interpretation Guide

Apr 20

1. Introduction

Pyxis lipidomics provides MS/MS-based lipid identification using our Large Spectral Model for AI-powered de novo prediction. This guide helps you interpret Pyxis lipid results for biological analysis — including what the outputs mean and where known limitations require caution.

2. Understanding Lipid Classification Levels (L1–L3)

Pyxis reports lipid identifications at three hierarchical levels, derived from LIPID MAPS:

L1 — Lipid class (e.g., PC, PE, TAG, SM)
L2 — Sum composition (e.g., PC 36:4) — head group + total carbons and double bonds
L3 — Molecular species with bond type (e.g., PE P-36:4)

"Uncategorized" lipids: Some identifications appear under an "Uncategorized" section in the Lipids tab. These are very likely lipids or lipid-like molecules that could not be automatically placed into the LIPID MAPS classification hierarchy. This can happen because LIPID MAPS is not fully comprehensive: some genuine lipids are absent from its database.

Tip: If you need to map Pyxis output to external databases, note that there is a many-to-one relationship: multiple analytes can map to a single L1/L2/L3 combination.
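The many-to-one relationship can be made explicit by grouping analytes on their (L1, L2, L3) combination before joining to an external database. The following is a minimal sketch only: the row layout, the analyte IDs, and the L2 values shown are illustrative assumptions, not the actual Pyxis export schema.

```python
from collections import defaultdict

# Hypothetical rows: (analyte_id, L1, L2, L3). All values are illustrative.
rows = [
    ("analyte_001", "PE", "PE 36:4", "PE P-36:4"),
    ("analyte_002", "PE", "PE 36:4", "PE P-36:4"),
    ("analyte_003", "PE", "PE 36:5", "PE O-36:5"),
]


def group_by_classification(rows):
    """Map each (L1, L2, L3) combination to the analytes that share it."""
    groups = defaultdict(list)
    for analyte_id, l1, l2, l3 in rows:
        groups[(l1, l2, l3)].append(analyte_id)
    return dict(groups)


groups = group_by_classification(rows)
for key, analytes in groups.items():
    print(key, "<-", analytes)
```

Any external mapping should then be keyed on the classification tuple rather than on individual analytes, so that several analytes sharing one combination are handled deliberately rather than silently duplicated.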

3. Scores

Each lipid ID is assigned a score: High, Medium, or Low.

The score is based on the maximum structural similarity score across all samples for a given identification.
The same thresholds apply to both retrieval (library-matched) and de novo (AI-predicted) identifications. For the most rigorous comparisons, however, compare like with like: retrieval-to-retrieval and de novo-to-de novo.
A high score reflects structural match quality, but does not resolve all ambiguities (see Section 4, Isomer Limitations).
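To illustrate how a per-identification score can be derived from per-sample values, here is a minimal sketch. The 0 to 1 similarity scale and the cutoffs (0.8 and 0.5) are placeholder assumptions chosen for illustration; they are not the thresholds Pyxis uses.

```python
def overall_similarity(per_sample: dict[str, float]) -> float:
    """Return the maximum structural similarity observed across samples."""
    return max(per_sample.values())


def score_label(similarity: float, high: float = 0.8, medium: float = 0.5) -> str:
    """Bin a similarity value into High / Medium / Low.

    The default cutoffs are hypothetical placeholders, not Pyxis thresholds.
    """
    if similarity >= high:
        return "High"
    if similarity >= medium:
        return "Medium"
    return "Low"


# One identification seen in three samples with different match quality.
per_sample = {"sample_A": 0.42, "sample_B": 0.91, "sample_C": 0.63}
label = score_label(overall_similarity(per_sample))
print(label)
```

Because the maximum is taken across samples, a single good spectrum is enough to yield a High label, even if the match is weaker in other samples.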

4. Isomer Limitations

4a. Structural isomers (acyl chain composition)

Isomers with the same sum composition but different individual acyl chains are collapsed to a single sum-composition name. For example:

PE(10:0_18:0) and PE(12:0_16:0) are both reported as PE 28:0

Pyxis does not resolve sn-position or individual chain-length composition.
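The collapse to sum composition is simple arithmetic: carbons and double bonds are summed over the chains. A minimal sketch, assuming names follow the Class(chain_chain) pattern used in the example above and contain only plain acyl chains (the function name is hypothetical):

```python
import re

# Matches names such as "PE(10:0_18:0)"; chains are separated by "_" or "/".
_SPECIES = re.compile(r"^(?P<cls>[A-Za-z]+)\((?P<chains>[^)]+)\)$")


def sum_composition(name: str) -> str:
    """Collapse a chain-resolved lipid name to its sum-composition name."""
    match = _SPECIES.match(name.strip())
    if match is None:
        raise ValueError(f"Unrecognized lipid name: {name!r}")
    carbons = 0
    double_bonds = 0
    for chain in re.split(r"[_/]", match.group("chains")):
        c, db = chain.split(":")
        carbons += int(c)
        double_bonds += int(db)
    return f"{match.group('cls')} {carbons}:{double_bonds}"


print(sum_composition("PE(10:0_18:0)"))  # PE 28:0
print(sum_composition("PE(12:0_16:0)"))  # PE 28:0
```

Both inputs yield the same output, which is why the two species cannot be told apart in the reported results.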

4b. Ether lipids — Plasmenyl (P-) vs. Plasmanyl (O-)

This is a key known limitation. Pyxis reports ether lipid bond type at L3 (e.g., PE P-36:4 vs. PE O-36:5), but cannot reliably distinguish between:

Plasmenyl (P-) — vinyl ether linkage (plasmalogen)
Plasmanyl (O-) — alkyl ether linkage

These species often share the same precursor mass and head group. The sn-1 neutral loss fragment that would distinguish them is frequently too weak to contribute meaningfully to the score.

Internal benchmarking shows that P- vs. O- assignment is essentially a coin flip, even for identifications scored High. We therefore recommend grouping both together as "ether" lipids in any downstream analysis, rather than treating the P-/O- distinction as meaningful.
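One way to apply this grouping before downstream analysis is to rewrite both forms to a common label. The sketch below assumes that the P- prefix implies one vinyl ether double bond not included in the stated double-bond count, which is consistent with the PE P-36:4 vs. PE O-36:5 pairing above. The "ether" label format and the function name are hypothetical.

```python
import re

# Matches L3 ether names such as "PE P-36:4" or "PE O-36:5".
_ETHER = re.compile(r"^(?P<cls>[A-Za-z]+) (?P<bond>[PO])-(?P<c>\d+):(?P<db>\d+)$")


def ether_group(name: str) -> str:
    """Return a shared label for isomeric P-/O- names; pass others through."""
    match = _ETHER.match(name.strip())
    if match is None:
        return name  # not an ether name in this format; leave unchanged
    double_bonds = int(match.group("db"))
    if match.group("bond") == "P":
        # Assumption: count the vinyl ether bond so P- and O- isomers align.
        double_bonds += 1
    return f"{match.group('cls')} ether {match.group('c')}:{double_bonds}"


print(ether_group("PE P-36:4"))  # PE ether 36:5
print(ether_group("PE O-36:5"))  # PE ether 36:5
```

Intensities or statistics can then be aggregated on the shared label, so that conclusions do not depend on which of the two assignments was reported.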

A note on chromatography: Plasmenyl (P-) and plasmanyl (O-) species can often be separated chromatographically due to fundamental differences in hydrophobicity. In principle, RT differences could help disambiguate them.

5. Pooled MS/MS Samples — Current Limitations

A common study design collects MS/MS on pooled samples (for identification) and MS1 on individual samples (for quantitation and group comparisons).

Current limitation: Pyxis provides identifications only for MS/MS files. There is currently no mechanism to:

Extract peak areas from MS1-only data
Backfill lipid IDs from pools onto individual samples

This means differential abundance analysis (e.g., generating volcano plots comparing disease vs. healthy) requires MS/MS data on samples from each condition — not just on a pooled QC sample.

Workaround for pool-only designs: If your MS/MS data is limited to pools, use the precursor m/z (M/Z Min and M/Z Max columns, or the M/Z column when available) and retention time (First Observed RT and Last Observed RT columns) from the Pyxis CSV export to set up targeted extraction of those lipids from your MS1-only sample files in external software. This yields peak areas for differential analysis across your individual samples, with the Pyxis IDs serving as the target list.
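As a rough sketch of this workaround, the snippet below turns export rows into a target list of m/z centers and padded RT windows. The M/Z Min, M/Z Max, First Observed RT, and Last Observed RT column names come from the text above; the "Name" column, the RT padding value, the output field names, and the example row are assumptions made for illustration. The format your external software expects will differ.

```python
import csv
import io

# Made-up example row; real exports contain more columns.
example_csv = """Name,M/Z Min,M/Z Max,First Observed RT,Last Observed RT
PC 36:4,782.5680,782.5700,12.30,12.50
"""


def build_target_list(csv_text: str, rt_padding: float = 0.1) -> list[dict]:
    """Build extraction targets (m/z center, padded RT window) from CSV text."""
    targets = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        mz_min = float(row["M/Z Min"])
        mz_max = float(row["M/Z Max"])
        targets.append({
            "name": row["Name"],  # hypothetical column name
            "mz": (mz_min + mz_max) / 2,
            # Pad the observed window (same units as the export), floor at 0.
            "rt_start": max(0.0, float(row["First Observed RT"]) - rt_padding),
            "rt_end": float(row["Last Observed RT"]) + rt_padding,
        })
    return targets


targets = build_target_list(example_csv)
print(targets)
```

Padding the RT window allows for modest drift between the pooled run and the individual runs; choose a padding that reflects the RT reproducibility of your own method.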

We plan to address this limitation in an upcoming release by providing peak areas natively in Pyxis, which will make this workaround unnecessary.

6. Adduct Interpretation

Each identification includes a predicted adduct type (e.g., [M+H]+, [M+Na]+, [M-H2O+H]+).
Adduct information is not always displayed on XICs in the UI. To verify an ID, check that the reported adduct is consistent with the displayed precursor m/z for that compound class.
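The consistency check amounts to adding the adduct's mass shift to the neutral monoisotopic mass and comparing the result with the displayed precursor m/z. A minimal sketch for the three singly charged adducts named above, assuming you supply the neutral mass yourself (for example from LIPID MAPS); the function names and the 10 ppm tolerance are illustrative assumptions:

```python
# Mass shifts (Da) relative to the neutral monoisotopic mass M, charge 1+.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,       # add a proton
    "[M+Na]+": 22.98922,      # add a sodium cation
    "[M-H2O+H]+": -17.003289, # lose water (18.010565), add a proton
}


def expected_mz(neutral_mass: float, adduct: str) -> float:
    """Expected precursor m/z for a singly charged adduct."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def adduct_is_consistent(neutral_mass: float, adduct: str,
                         observed_mz: float, tol_ppm: float = 10.0) -> bool:
    """True if the observed m/z matches the adduct within tol_ppm."""
    return abs(ppm_error(observed_mz, expected_mz(neutral_mass, adduct))) <= tol_ppm


# PC 34:1 (C42H82NO8P), neutral monoisotopic mass 759.5778 Da.
print(round(expected_mz(759.5778, "[M+H]+"), 4))
print(adduct_is_consistent(759.5778, "[M+H]+", 760.5851))
print(adduct_is_consistent(759.5778, "[M+Na]+", 760.5851))
```

If the reported adduct fails this check while another common adduct passes, treat the identification with extra caution.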

7. Internal Calibrant Ions (Lock Mass)

If your instrument uses scan-to-scan internal calibration (e.g., fluoranthene at m/z 202.08 on Thermo Exploris), these calibrant ions appear in the raw spectra. This can lead to:

Unexpected peaks in mirror plots that don't appear in the vendor viewer
Potential false matches near calibrant m/z values

Practical advice: Be aware of your instrument's internal calibrant(s). If you see unexpected IDs at or near known calibrant m/z values, they may be artifacts.
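A simple screen for such artifacts is to flag any identification whose precursor m/z falls within a small window of a known calibrant. The sketch below is illustrative only: the function names, the 0.01 Da tolerance, and the example IDs are assumptions, and the fluoranthene value (approximately 202.0777 for the radical cation) should be replaced with the calibrant m/z documented for your instrument.

```python
def near_calibrant(mz: float, calibrant_mzs: list[float], tol_da: float = 0.01) -> bool:
    """True if mz lies within tol_da of any known calibrant m/z."""
    return any(abs(mz - cal) <= tol_da for cal in calibrant_mzs)


def flag_ids(precursor_mz_by_id: dict[str, float],
             calibrant_mzs: list[float], tol_da: float = 0.01) -> list[str]:
    """Return the IDs whose precursor m/z is suspiciously close to a calibrant."""
    return [name for name, mz in precursor_mz_by_id.items()
            if near_calibrant(mz, calibrant_mzs, tol_da)]


calibrants = [202.0777]  # approximate; confirm for your instrument
ids = {"unknown_feature": 202.08, "PC 34:1": 760.5851}  # made-up example
flagged = flag_ids(ids, calibrants)
print(flagged)
```

Flagged IDs are candidates for manual review, not automatic rejection.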

8. Multi-Method and Multi-Column Data

If your study uses multiple LC methods (e.g., HILIC + RP, or C18 + C30):

Process each method as a separate sample set. Combining data from different chromatographic methods causes the same analyte to appear at very different retention times, making results difficult to interpret.
Cross-column discrepancies (e.g., a lipid appears significant on one column but not the other) can be a useful signal for flagging incorrect or isomeric IDs.

Similarly, acquiring data in both polarities (positive and negative ionization modes) provides an additional means of validating an ID. In this case, we expect a given ID to show the same RT in both modes. Make full use of the spectra from both polarities, as they often provide complementary rather than redundant information.
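The RT agreement check across polarities can be sketched as follows. The input layout (ID name mapped to RT per polarity), the function name, and the 0.2 tolerance are assumptions for illustration; pick a tolerance that reflects the RT reproducibility of your own method.

```python
def rt_mismatches(pos_rt: dict[str, float], neg_rt: dict[str, float],
                  tol: float = 0.2) -> dict[str, float]:
    """Return IDs seen in both polarities whose RTs differ by more than tol."""
    mismatches = {}
    for name in pos_rt.keys() & neg_rt.keys():
        delta = abs(pos_rt[name] - neg_rt[name])
        if delta > tol:
            mismatches[name] = delta
    return mismatches


# Made-up example RTs (same units as the export).
pos = {"PE 36:4": 10.00, "PC 34:1": 12.00, "SM 34:1": 11.50}
neg = {"PE 36:4": 10.05, "PC 34:1": 13.00}
suspect = rt_mismatches(pos, neg)
print(suspect)
```

IDs that appear in only one polarity are ignored here; IDs that appear in both but at clearly different RTs deserve a closer look, since at least one of the two assignments is likely incorrect or isomeric.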

Sam Burian