Add MZMine metabolomics vignette and re-export MZMinetoMSstatsFormat #211

Open
swaraj-neu wants to merge 1 commit into devel from MSstats/work/20260617_metabolomics_vignette

@swaraj-neu swaraj-neu commented Jun 18, 2026

Motivation and Context

Please include relevant motivation and context of the problem along with a short summary of the solution.

Changes

Please provide a detailed bullet point list of your changes.

Testing

Please describe any unit tests you added or modified to verify your changes.

Checklist Before Requesting a Review

  • I have read the MSstats contributing guidelines
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules
  • I have run the devtools::document() command after my changes and committed the added files

Motivation and Context

This pull request extends the MSstats package to support metabolomics workflows based on output from MZMine, a popular open-source tool for LC-MS feature detection and annotation. MSstats previously focused on proteomics; this PR adds support for untargeted metabolomics data through the MZMinetoMSstatsFormat converter function from the MSstatsConvert package. The converter is re-exported through the MSstats namespace, in line with the existing backwards-compatibility layer, so users can run a complete differential abundance analysis on metabolomics data without explicitly loading the MSstatsConvert package.

Detailed Changes

  • NAMESPACE file: Added export(MZMinetoMSstatsFormat) to expose the re-exported converter function (line 26)

  • R/converters.R: Added the roxygen2 directives `#' @export` and `#' @importFrom MSstatsConvert MZMinetoMSstatsFormat`, followed by the re-export statement `MSstatsConvert::MZMinetoMSstatsFormat` (lines 21-22). This follows the established pattern used for the other source-specific converters (DIANN, DIAUmpire, FragPipe, MaxQ, OpenMS, OpenSWATH, PD, Progenesis, Skyline, Spectronaut)

  • man/reexports.Rd: Updated the roxygen2-generated re-exports documentation page to include MZMinetoMSstatsFormat in the list of exported converters with a new \alias{MZMinetoMSstatsFormat} entry and updated the MSstatsConvert item description to include a cross-reference link to the function

  • vignettes/MSstatsMetabolomics.Rmd: Added a comprehensive new vignette documenting an end-to-end metabolomics workflow that demonstrates:

    • Loading example MZMine feature quantification data, sample annotations, spectral library matches, and SIRIUS structure identifications
    • Converting MZMine data to MSstats format using MZMinetoMSstatsFormat with combined evidence from MZMine compound names (MSI Level 2) and SIRIUS structure predictions (MSI Level 3)
    • Summarizing features to compound-level abundance using dataProcess with log2 transformation, equalized medians normalization, TMP summarization, and model-based imputation
    • Testing for differential abundance across conditions using groupComparison with pairwise contrasts
    • Visualizing results with profile plots and volcano plots using dataProcessPlots and groupComparisonPlots
    • Including an explicit caveat about lactate's unreliable statistical result due to missing measurements in the small fixture dataset
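
The log2 transformation and equalized-medians normalization named in the summarization step can be sketched in base R. This is a simplified illustration, not the dataProcess implementation: the `features` table, its column names, and the intensities are hypothetical, and the sketch assumes the normalization shifts each run so that its median log2 intensity matches the median of all run medians.

```r
# Toy feature table with hypothetical runs and intensities
# (not the vignette's example data).
features <- data.frame(
  Run = rep(c("run1", "run2", "run3"), each = 4),
  Intensity = c(1000, 2000, 4000, 8000,
                2000, 4000, 8000, 16000,
                500, 1000, 2000, 4000),
  stringsAsFactors = FALSE
)

# Step 1: log2-transform the raw intensities.
features$log2_intensity <- log2(features$Intensity)

# Step 2: compute the median log2 intensity within each run.
run_medians <- tapply(features$log2_intensity, features$Run, median, na.rm = TRUE)

# Step 3: use the median of the run medians as the common reference.
global_median <- median(run_medians)

# Step 4: shift every run so that its median equals the reference.
shift <- global_median - as.numeric(run_medians[as.character(features$Run)])
features$normalized <- features$log2_intensity + shift

# Per-run medians after normalization (expected to coincide).
print(tapply(features$normalized, features$Run, median))
```

The shift is additive on the log2 scale, so differences between features within a run are preserved while systematic run-to-run offsets are removed.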

Unit Tests

The vignette itself serves as an executable test demonstrating the complete workflow. The code chunks in MSstatsMetabolomics.Rmd load data, execute the conversion, summarization, and comparison functions, and generate visualizations, providing functional validation of the re-exported MZMinetoMSstatsFormat function and its integration with the existing MSstats analysis pipeline.
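
The pairwise contrasts used in the comparison step can be written as a matrix with one row per comparison and one column per condition. A minimal base R sketch, assuming three hypothetical condition labels (the vignette's actual conditions are not listed in this description) and a hypothetical helper `build_pairwise_contrasts`:

```r
# Hypothetical condition labels, used only for illustration.
conditions <- c("Control", "TreatmentA", "TreatmentB")

# Build one row per unordered pair of conditions:
# +1 for the first condition in the pair, -1 for the second, 0 elsewhere.
build_pairwise_contrasts <- function(conditions) {
  conditions <- sort(unique(conditions))
  pairs <- utils::combn(conditions, 2)
  contrast <- matrix(
    0,
    nrow = ncol(pairs),
    ncol = length(conditions),
    dimnames = list(paste(pairs[1, ], pairs[2, ], sep = " vs "), conditions)
  )
  for (i in seq_len(ncol(pairs))) {
    contrast[i, pairs[1, i]] <- 1
    contrast[i, pairs[2, i]] <- -1
  }
  contrast
}

contrast_matrix <- build_pairwise_contrasts(conditions)
print(contrast_matrix)
```

Each row sums to zero. A matrix of this shape is what the contrast.matrix argument of groupComparison takes, provided the column names match the condition labels in the summarized data.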

Coding Guidelines

All changes follow the established patterns and conventions in the MSstats codebase:

  • The re-export in R/converters.R follows the identical structure of existing converter re-exports (DIANNtoMSstatsFormat, DIAUmpiretoMSstatsFormat, etc.), maintaining consistency with the backwards-compatibility layer
  • roxygen2 documentation comments use the standard `#' @export` and `#' @importFrom` directives
  • The vignette uses BiocStyle markdown formatting consistent with other MSstats documentation
  • The vignette includes knitr::opts_chunk settings for figure sizing and output options consistent with MSstats conventions
  • Code examples use the data.table::fread() function for data loading, matching patterns used elsewhere in the codebase
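
For reference, the re-export pattern described above can be sketched as the following fragment of R/converters.R. It is a package source fragment rather than a standalone script, and it assumes MSstatsConvert is installed and already exports MZMinetoMSstatsFormat. The NAMESPACE lines in the trailing comment are the directives this description reports, shown as the expected result of running devtools::document().

```r
# R/converters.R (sketch of the re-export block described in this PR)

#' @export
#' @importFrom MSstatsConvert MZMinetoMSstatsFormat
MSstatsConvert::MZMinetoMSstatsFormat

# After devtools::document(), NAMESPACE is expected to contain:
#   export(MZMinetoMSstatsFormat)
#   importFrom(MSstatsConvert,MZMinetoMSstatsFormat)
```

With this in place, attaching MSstats alone should make MZMinetoMSstatsFormat available without a separate call to library(MSstatsConvert).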

@swaraj-neu swaraj-neu requested a review from tonywu1999 June 18, 2026 02:49
@swaraj-neu swaraj-neu self-assigned this Jun 18, 2026

coderabbitai Bot commented Jun 18, 2026

📝 Walkthrough

MZMinetoMSstatsFormat from MSstatsConvert is added as a re-exported symbol in MSstats via NAMESPACE, R/converters.R, and man/reexports.Rd. A new vignette MSstatsMetabolomics.Rmd documents an end-to-end metabolomics workflow using MZMine and SIRIUS outputs through conversion, summarization, differential testing, and visualization.

Changes

MZMinetoMSstatsFormat re-export and metabolomics vignette

| Layer / File(s) | Summary |
|---|---|
| Re-export wiring for MZMinetoMSstatsFormat (R/converters.R, NAMESPACE, man/reexports.Rd) | Adds @export/@importFrom binding in R/converters.R, the corresponding export() and importFrom() directives in NAMESPACE, and a new alias plus item link in man/reexports.Rd. |
| Metabolomics workflow vignette (vignettes/MSstatsMetabolomics.Rmd) | New vignette covering introduction, example data loading from MSstatsConvert, conversion via MZMinetoMSstatsFormat, summarization with dataProcess, differential testing with groupComparison, and profile/volcano plots with dataProcessPlots/groupComparisonPlots. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant MSstatsConvert
  participant MSstats

  User->>MSstatsConvert: system.file() — retrieve MZMine, annotation, library, SIRIUS CSVs
  User->>MSstatsConvert: MZMinetoMSstatsFormat(mzmine, annotation, library, sirius)
  MSstatsConvert-->>User: MSstats-format data frame
  User->>MSstats: dataProcess(converted, logTrans=2, normalization="equalizeMedians", MBimpute=TRUE)
  MSstats-->>User: FeatureLevelData + ProteinLevelData
  User->>MSstats: groupComparison(contrast.matrix, summarized)
  MSstats-->>User: ComparisonResult (log2FC, pvalue, adj.pvalue, issue)
  User->>MSstats: dataProcessPlots(type="ProfilePlot")
  User->>MSstats: groupComparisonPlots(type="VolcanoPlot", eval=FALSE)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A MZMine path now hops into view,
Re-exported with care, fresh and new.
The vignette unfolds, step by step it goes,
From features to proteins, the workflow flows.
With caffeine and lactate, the rabbit takes note —
MSstats for metabolites, worthy of quote! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The description template was not filled in; all sections remain as empty placeholders and the checklist items are unchecked, indicating the author did not provide required context, specific changes, or testing details. | Fill in all required sections: add motivation/context for the vignette, list specific changes (NAMESPACE export, re-export in converters.R, documentation update, and vignette addition), describe testing performed, and check completed items in the pre-review checklist. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the two main changes: adding a MZMine metabolomics vignette and re-exporting the MZMinetoMSstatsFormat function, matching the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@github-actions

Failed to generate code suggestions for PR

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
vignettes/MSstatsMetabolomics.Rmd (1)

39-39: 💤 Low value

Consider adding a reference for the MSI levels citation.

Line 39 references "Sumner et al., 2007" for the MSI (Metabolomics Standards Initiative) identification levels. While informal citations are acceptable in vignettes, adding a brief reference or URL would help readers locate the source document if they want to learn more about the classification system.

📚 Optional addition

You could add a references section at the end:

## __References__

Sumner, L.W., Amberg, A., Barrett, D., et al. (2007). Proposed minimum reporting standards for chemical analysis. _Metabolomics_, 3, 211-221. https://doi.org/10.1007/s11306-007-0082-2

Or simply add the DOI inline:

-  correspond to MSI Level 2 putative identifications (Sumner et al., 2007).
+  correspond to MSI Level 2 putative identifications (Sumner et al., 2007, https://doi.org/10.1007/s11306-007-0082-2).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 216978bb-d16b-4477-bf79-31e524d04a99

📥 Commits

Reviewing files that changed from the base of the PR and between 5b07ebb and 0952095.

📒 Files selected for processing (4)
  • NAMESPACE
  • R/converters.R
  • man/reexports.Rd
  • vignettes/MSstatsMetabolomics.Rmd
