Update for plexDIA by ypriverol · Pull Request #81 · bigbio/quantms-utils

ypriverol · 2026-05-04T06:00:42Z

Summary by CodeRabbit

Release Notes

Documentation & Project Metadata
- Enhanced project discoverability with expanded keywords
- Updated project URLs (Homepage, Documentation, GitHub, Bug Tracker, PyPI)
Bug Fixes
- DIANN converter now handles Label column as optional
- Improved label mapping with case-insensitive support for SILAC and MTRAQ formats

Fix test_commands.py

fix: mTRAQ label

coderabbitai · 2026-05-04T06:00:55Z

Walkthrough

The PR updates project metadata with additional keywords and revised URLs, modifies unified design format parsing to make Label and LabelType optional columns, and implements flexible label normalization via case-insensitive substring matching for SILAC and MTRAQ formats. The f_table construction now conditionally includes Label only when present and multi-valued.

Changes

Project Metadata Update

Layer / File(s)	Summary
Metadata Expansion `pyproject.toml`	Keywords expanded to include `"big data"`, `"sdrf"`, `"sample-metadata"`, and `"proteomics-pipeline"`; `[project.urls]` rewritten with `Homepage`, `Documentation`, `GitHub`, `"Bug Tracker"`, and `PyPI` replacing prior custom URL keys.

Unified Design Format Parsing

Layer / File(s)	Summary
Column Requirements `quantmsutils/diann/diann2msstats.py`	Unified design validation now requires only `Filename`, `Fraction`, `Sample`, `Condition`, and `BioReplicate`; `Label` and `LabelType` are no longer mandatory.
Label Normalization Logic `quantmsutils/diann/diann2msstats.py`	Multiplexing label normalization made conditional on `Label` column presence with multiple unique values; label mapping now uses case-insensitive substring matching against `Label` values for SILAC (`light`/`medium`/`heavy` → `L/M/H`) and MTRAQ (`0`/`4`/`8` → `0/4/8`).
Fraction Table Construction `quantmsutils/diann/diann2msstats.py`	`f_table` conditionally includes `Label` column only when `Label` exists and has multiple values; otherwise omitted from table output.
Test Update `tests/test_commands.py`	`test_unified_format_validates_sample_consistency` expanded to write `Label` and `LabelType` columns in the test unified-format file to match updated schema expectations.
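The column rules summarized above can be sketched roughly as follows. This is a minimal illustration, not the code in `quantmsutils/diann/diann2msstats.py`: the helper names `check_required_columns` and `build_f_table`, and the choice of `Filename`, `Fraction`, and `Sample` as the base `f_table` columns, are assumptions made for the example.

```python
import pandas as pd

# Columns the walkthrough lists as mandatory; Label and LabelType are optional.
REQUIRED_COLUMNS = ["Filename", "Fraction", "Sample", "Condition", "BioReplicate"]


def check_required_columns(df: pd.DataFrame) -> None:
    """Raise if a mandatory column is absent (hypothetical helper)."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Unified design is missing required columns: {missing}")


def build_f_table(df: pd.DataFrame) -> pd.DataFrame:
    """Keep Label only when it exists and has more than one unique value.

    The base column set here is an assumption for illustration.
    """
    columns = ["Filename", "Fraction", "Sample"]
    if "Label" in df.columns and df["Label"].nunique() > 1:
        columns.append("Label")
    return df[columns].drop_duplicates().reset_index(drop=True)


# Label-free design: no Label column at all, which is now accepted.
label_free = pd.DataFrame(
    {
        "Filename": ["a.raw", "b.raw"],
        "Fraction": [1, 1],
        "Sample": [1, 2],
        "Condition": ["ctrl", "treated"],
        "BioReplicate": [1, 2],
    }
)
check_required_columns(label_free)
f_table_label_free = build_f_table(label_free)

# Multiplexed design: Label is present with several values, so it is kept.
multiplexed = label_free.assign(Label=["0", "4"])
f_table_multiplexed = build_f_table(multiplexed)
```

With these inputs, `f_table_label_free` has no `Label` column while `f_table_multiplexed` does.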

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Mirnor changes for plexDIA and msstats #78: Modifies label and multiplexing handling in diann2msstats.py's unified design parsing and f_table construction.
bug fixing for quantms-utils and .d conversion to parquet information. #66: Adds the diann2msstats module with get_exp_design_dfs and _parse_unified_design logic that this PR refines.
Changes DIANN convert to MSStats #67: Adjusts unified design parsing with required column changes building on the foundation from that PR.

Suggested reviewers

daichengxin
jpfeuffer

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'Update for plexDIA' is vague and does not clearly convey the actual changes made in the pull request, which involve updating project metadata, modifying unified design parsing logic to make Label optional, and updating tests.	Revise the title to be more specific about the main changes, such as 'Make Label column optional in DIANN unified design parsing' or 'Update project metadata and support optional label columns in DIANN conversion'.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

codacy-production · 2026-05-04T06:01:30Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 0 complexity

Metric Results

Complexity 0

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@quantmsutils/diann/diann2msstats.py`:
- Around line 185-199: The replacement is case-sensitive because
df["Label"].replace(...) uses original casing while detection uses labels_lower;
update the logic for both silac_dict and mtraq_dict to perform lookups against
labels_lower (which is already lowercased) using lowercase keys and then write
back mapped values while preserving non-matches (e.g., use
labels_lower.map(lowercased_dict).fillna(df["Label"]) or equivalent) so that
variables labels_lower, silac_dict/mtraq_dict, and df["Label"] are used for
case-insensitive replacement and original values are kept when there is no
match.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: da498a12-8aac-46b4-ab96-cc891245f736

📥 Commits

Reviewing files that changed from the base of the PR and between deff52d and 9fb4d1e.

📒 Files selected for processing (3)

pyproject.toml
quantmsutils/diann/diann2msstats.py
tests/test_commands.py

coderabbitai · 2026-05-04T06:07:54Z

+        if labels_lower.str.contains("silac").any():
            silac_dict = {
                "SILAC light": "L",
                "SILAC medium": "M",
                "SILAC heavy": "H",
            }
            df["Label"] = df["Label"].replace(silac_dict)

+        if labels_lower.str.contains("mtraq").any():
+            mtraq_dict = {
+                "MTRAQ0": "0",
+                "MTRAQ4": "4",
+                "MTRAQ8": "8",
+            }
+            df["Label"] = df["Label"].replace(mtraq_dict)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for Label column values in test design files to confirm expected casing
rg -n --type tsv "mTRAQ\|MTRAQ\|SILAC\|silac" --glob "*.tsv"
# Also check any usage in Python source that constructs or references these label strings
rg -n "mTRAQ\|MTRAQ\|SILAC light\|silac" --type py

Repository: bigbio/quantms-utils

Length of output: 92

🏁 Script executed:

#!/bin/bash
# Search for actual label values in TSV files (design files)
find . -name "*.tsv" -type f | head -20

Repository: bigbio/quantms-utils

Length of output: 312

🏁 Script executed:

#!/bin/bash
# Search for Label column values in any data files
rg -i "mTRAQ|MTRAQ|SILAC" --glob "*.tsv"

Repository: bigbio/quantms-utils

Length of output: 46

🏁 Script executed:

#!/bin/bash
# Check Python source for label string construction/references
rg "mTRAQ|MTRAQ|SILAC|Label" -A 2 -B 2 --type py quantmsutils/diann/

Repository: bigbio/quantms-utils

Length of output: 4163

🏁 Script executed:

#!/bin/bash
# Check the actual code context around lines 185-199
cat -n quantmsutils/diann/diann2msstats.py | sed -n '175,210p'

Repository: bigbio/quantms-utils

Length of output: 1731

🏁 Script executed:

#!/bin/bash
# Check test design files for Label column content
echo "=== PXD026600_diann_design.tsv ==="
head -5 ./tests/test_data/diann2msstats/PXD026600_diann_design.tsv | cat -v
echo -e "\n=== PXD026600.sdrf_openms_design.tsv ==="
head -5 ./tests/test_data/diann2msstats/PXD026600.sdrf_openms_design.tsv | cat -v
echo -e "\n=== Check for any mTRAQ or MTRAQ in test files ==="
rg -i "label" ./tests/test_data/diann2msstats/*.tsv | head -20

Repository: bigbio/quantms-utils

Length of output: 4533

🏁 Script executed:

#!/bin/bash
# Look at full test design file structure
wc -l ./tests/test_data/diann2msstats/*.tsv

Repository: bigbio/quantms-utils

Length of output: 279

🏁 Script executed:

#!/bin/bash
# Check convert-diann module to see what label values it produces
find . -name "*convert*diann*" -o -name "*diann*convert*" | grep -i py
rg "convert.*diann\|diann.*convert" -l --type py

Repository: bigbio/quantms-utils

Length of output: 46

🏁 Script executed:

#!/bin/bash
# Search for "convert" and "diann" in module names and imports
find . -type f -name "*.py" | xargs grep -l "convert.*diann\|quantmsdiann" | head -20

Repository: bigbio/quantms-utils

Length of output: 168

🏁 Script executed:

#!/bin/bash
# Search broadly for SILAC and mTRAQ in entire repo
rg -i "silac.*light|mtraq0|mtraq4|mtraq8" -B 2 -A 2

Repository: bigbio/quantms-utils

Length of output: 933

🏁 Script executed:

#!/bin/bash
# Check what modules exist under quantmsutils/
ls -la quantmsutils/

Repository: bigbio/quantms-utils

Length of output: 633

🏁 Script executed:

#!/bin/bash
# Look at the import statements and references in diann2msstats to understand data flow
head -50 quantmsutils/diann/diann2msstats.py

Repository: bigbio/quantms-utils

Length of output: 1577

🏁 Script executed:

#!/bin/bash
# Search for where Label column gets populated in convert-diann or related modules
rg "Label.*=.*mTRAQ|Label.*=.*SILAC|\"mTRAQ|\"MTRAQ|\"SILAC" --type py -B 3 -A 3

Repository: bigbio/quantms-utils

Length of output: 1640

🏁 Script executed:

#!/bin/bash
# Check if there's any test or documentation about multiplexed designs
find . -path ./node_modules -prune -o -name "*.md" -type f -exec grep -l "mTRAQ\|SILAC\|multiplex" {} \;

Repository: bigbio/quantms-utils

Length of output: 46

🏁 Script executed:

#!/bin/bash
# Search for actual usage or test code that calls diann2msstats
grep -r "diann2msstats\|exp_design" tests/ --include="*.py" | head -20

Repository: bigbio/quantms-utils

Length of output: 1636

🏁 Script executed:

#!/bin/bash
# Check dianncfg.py to see how it generates or handles Label values for multiplexed experiments
cat -n quantmsutils/diann/dianncfg.py | grep -A 10 -B 10 "mTRAQ"

Repository: bigbio/quantms-utils

Length of output: 871

🏁 Script executed:

#!/bin/bash
# Search for any conversion or generation of Label column values with mTRAQ/SILAC
rg "Channel.*mTRAQ|mTRAQ.*Channel|Label.*:" quantmsutils/diann/ --type py -B 2 -A 2

Repository: bigbio/quantms-utils

Length of output: 1521

🏁 Script executed:

#!/bin/bash
# Check if there's documentation or comments about expected label format
rg -i "label.*format|mtraq.*label|silac.*label" quantmsutils/ --type py -B 1 -A 1

Repository: bigbio/quantms-utils

Length of output: 46

🏁 Script executed:

#!/bin/bash
# Get more context around the dianncfg.py TODO to understand current multiplex support status
cat -n quantmsutils/diann/dianncfg.py | sed -n '95,125p'

Repository: bigbio/quantms-utils

Length of output: 1338

🏁 Script executed:

#!/bin/bash
# Check if there are any other scripts or workflows that might generate Label columns with mTRAQ/SILAC
find . -name "*.py" -type f -exec grep -l "mTRAQ0\|MTRAQ0\|SILAC light" {} \;

Repository: bigbio/quantms-utils

Length of output: 102

🏁 Script executed:

#!/bin/bash
# Examine the entire get_exp_design_dfs function to understand data flow
rg -A 80 "def get_exp_design_dfs" quantmsutils/diann/diann2msstats.py

Repository: bigbio/quantms-utils

Length of output: 3222

Case-sensitive replacement will silently no-op for mTRAQ labels once multiplex support is enabled.

The multiplexing code (lines 185-199) currently appears unreachable: dianncfg.py (line 106) marks multiplex support as a TODO and explicitly rejects multiplexed experiments (mTRAQ, TMT, iTRAQ, Dimethyl modifications) with an error message. However, the bug is real and will manifest once this feature is implemented.

When enabled, labels_lower holds the lowercased values (for detection), but both df["Label"].replace(silac_dict) and df["Label"].replace(mtraq_dict) match against the original casing of df["Label"].

SILAC: keys "SILAC light"/"SILAC medium"/"SILAC heavy" match the SDRF convention exactly — works only when the upstream file uses that exact casing.

mTRAQ: keys "MTRAQ0"/"MTRAQ4"/"MTRAQ8" are all-caps, but the standard notation used throughout the codebase (e.g., dianncfg.py) is "mTRAQ" (lowercase m). Once multiplex support is added and design files contain "mTRAQ0", the replacement will silently no-op — detection triggers, replacement fails to match, df["Label"] retains "mTRAQ0", and the downstream merge against DIA-NN's Channel values ("0"/"4"/"8") produces all-NaN rows that are dropped, yielding an empty MSstats output.
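The failure mode described above can be reproduced in isolation. The sketch below is not taken from the repository: the `design` and `report` frames are made-up stand-ins, and the left merge followed by `dropna` is an assumption about how the downstream join behaves, based only on the description in this comment.

```python
import pandas as pd

# Made-up design table using the "mTRAQ" casing found elsewhere in the codebase.
design = pd.DataFrame({"Sample": [1, 2, 3], "Label": ["mTRAQ0", "mTRAQ4", "mTRAQ8"]})

# Detection runs on the lowercased labels, so it succeeds.
labels_lower = design["Label"].astype(str).str.lower()
detected = bool(labels_lower.str.contains("mtraq").any())

# Replacement runs on the original casing with all-caps keys, so nothing matches.
mtraq_dict = {"MTRAQ0": "0", "MTRAQ4": "4", "MTRAQ8": "8"}
replaced = design["Label"].replace(mtraq_dict)

# Hypothetical stand-in for DIA-NN output, where channels are "0", "4", "8".
report = pd.DataFrame({"Channel": ["0", "4", "8"], "Intensity": [10.0, 20.0, 30.0]})

# Assumed downstream behaviour: left merge on the channel, then drop unmatched rows.
merged = report.merge(
    design.assign(Label=replaced), left_on="Channel", right_on="Label", how="left"
).dropna(subset=["Sample"])
```

Here `detected` is true, `replaced` still holds the original `mTRAQ0`/`mTRAQ4`/`mTRAQ8` strings, and `merged` ends up with no rows.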

Fix: run the replacement on labels_lower (already lowercase) using lowercase dict keys to ensure case-insensitive matching.

🐛 Proposed fix

 labels_lower = df["Label"].astype(str).str.lower()
 if labels_lower.str.contains("silac").any():
     silac_dict = {
-        "SILAC light": "L",
-        "SILAC medium": "M",
-        "SILAC heavy": "H",
+        "silac light": "L",
+        "silac medium": "M",
+        "silac heavy": "H",
     }
-    df["Label"] = df["Label"].replace(silac_dict)
+    df["Label"] = labels_lower.map(silac_dict).fillna(df["Label"])
 if labels_lower.str.contains("mtraq").any():
     mtraq_dict = {
-        "MTRAQ0": "0",
-        "MTRAQ4": "4",
-        "MTRAQ8": "8",
+        "mtraq0": "0",
+        "mtraq4": "4",
+        "mtraq8": "8",
     }
-    df["Label"] = df["Label"].replace(mtraq_dict)
+    df["Label"] = labels_lower.map(mtraq_dict).fillna(df["Label"])

Using labels_lower.map(dict).fillna(df["Label"]) performs case-insensitive lookup while preserving the original value for labels that do not match any key.
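As a standalone illustration of that pattern, here is a minimal sketch. The helper name `normalize_labels` and the single merged lookup table are assumptions for the example; the proposed fix keeps separate SILAC and mTRAQ dictionaries inside the existing function.

```python
import pandas as pd

# Lowercase keys so the lookup can run against lowercased labels.
LABEL_LOOKUP = {
    "silac light": "L",
    "silac medium": "M",
    "silac heavy": "H",
    "mtraq0": "0",
    "mtraq4": "4",
    "mtraq8": "8",
}


def normalize_labels(labels: pd.Series) -> pd.Series:
    """Map known labels case-insensitively; keep unknown labels unchanged."""
    labels_lower = labels.astype(str).str.lower()
    # map() yields NaN for keys that are not in the dict; fillna() restores
    # the original value at those positions (aligned on the index).
    return labels_lower.map(LABEL_LOOKUP).fillna(labels)


labels = pd.Series(["mTRAQ0", "MTRAQ4", "mtraq8", "SILAC Heavy", "label free sample"])
normalized = normalize_labels(labels)
print(normalized.tolist())  # ['0', '4', '8', 'H', 'label free sample']
```

Note that this is an exact match after lowercasing, so a label with extra whitespace or a different spelling would pass through unchanged.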

yueqixuan and others added 6 commits April 8, 2026 18:49

fix test_commands.py

e656e55

feat: enrich PyPI metadata with keywords and project URLs

0f09dd0

fix: mTRAQ label

541c426

update

330260c

Merge pull request #79 from yueqixuan/main

2469363

Fix test_commands.py

Merge pull request #80 from yueqixuan/dev

9fb4d1e

fix: mTRAQ label

coderabbitai Bot reviewed May 4, 2026

ypriverol merged commit 478d5f1 into main May 4, 2026
5 checks passed

ypriverol merged 6 commits into main from dev
