Conversation
Add hotfix DIMS/v3.3.1 to main
ALuesink
left a comment
There was a problem hiding this comment.
General note: check the linter on the files changed.
| } | ||
| } | ||
| peakgroup_list <- cbind(peakgroup_list[, 1:6], ppmdev = ppmdev, peakgroup_list[, 7:ncol(peakgroup_list)]) | ||
| peakgroup_list <- cbind(peakgroup_list[, 1:4], ppmdev = ppmdev, peakgroup_list[, 5:ncol(peakgroup_list)]) |
There was a problem hiding this comment.
Could the indices be changes to column names, for readability and if column orders change in the future?
There was a problem hiding this comment.
ppmdev column is already present in the peak grouplist, so this section is refactored. Referring to columns by number has been removed.
There was a problem hiding this comment.
Refactored functions collapse, merge_duplicate_rows and calculate_zscores combined into file preprocessing/collect_filled_functions.R
| } | ||
|
|
||
| # replace missing intensities with random values around threshold | ||
| if (!is.null(peakgroup_list)) { |
There was a problem hiding this comment.
Maybe move if not null statement to main script. Also missing else, what happens if it is null?
There was a problem hiding this comment.
I would like to keep the main script as 'clean' as possible, so I prefer to keep the if statement inside the function.
If peakgroup_list is null, nothing happens; a null object is saved and passed to the next step.
| int_cols <- which(colnames(peakgroup_list) %in% names(repl_pattern)) | ||
| peakgroup_list <- cbind(peakgroup_list, "avg.int" = apply(peakgroup_list[, int_cols], 1, mean)) | ||
|
|
||
| return(peakgroup_list) |
There was a problem hiding this comment.
Return is within the if-statement, what happens if peakgroup_list is null?
There was a problem hiding this comment.
See comment above; a null object is returned.
| source("../../preprocessing/fill_missing_functions.R") | ||
|
|
||
| # test fill_missing_intensities | ||
| testthat::test_that("missing values are corretly filled with random values", { |
There was a problem hiding this comment.
Add function name that is tested
There was a problem hiding this comment.
Function names have not been added in the test_that line for any other unit test; this is a good idea for the general standardization for version 3.5.
| test_peakgroup_list <- data.frame(matrix(NA, nrow = 4, ncol = 23)) | ||
| colnames(test_peakgroup_list) <- c("mzmed.pgrp", "nrsamples", "ppmdev", "assi_HMDB", "all_hmdb_names", | ||
| "iso_HMDB", "HMDB_code", "all_hmdb_ids", "sec_hmdb_ids", "theormz_HMDB", | ||
| "C101.1", "C102.1", "P2.1", "P3.1", | ||
| "avg.int", "assi_noise", "theormz_noise", "avg.ctrls", "sd.ctrls", | ||
| "C101.1_Zscore", "C102.1_Zscore", "P2.1_Zscore", "P3.1_Zscore") | ||
| test_peakgroup_list[, c(1)] <- 300 + runif(4) | ||
| test_peakgroup_list[, c(2, 3)] <- runif(8) | ||
| test_peakgroup_list[, "HMDB_code"] <- c("HMDB1234567", "HMDB1234567_1", "HMDB1234567_2", "HMDB1234567_7") | ||
| test_peakgroup_list[, "all_hmdb_ids"] <- paste(test_peakgroup_list[, "HMDB_code"], | ||
| test_peakgroup_list[, "HMDB_code"], sep = ";") | ||
| test_peakgroup_list[, "all_hmdb_names"] <- paste(test_peakgroup_list[, "assi_HMDB"], | ||
| test_peakgroup_list[, "assi_HMDB"], sep = ";") | ||
| test_peakgroup_list[, grep("C", colnames(test_peakgroup_list))] <- 1000 * (1:16) |
There was a problem hiding this comment.
Same code as test_sum_intensities_adducts.R. Maybe make a txt file with the test_peakgroup_list table and load the table in the test_that() function
There was a problem hiding this comment.
Done. test_sum_adducts.R will be modified in version 3.5.
…etics/CustomModules into feature/refactor_DIMS_FilMissing
Refactored FillMissing. 6 functions which were used in the old PeakFinding method have been replaced with 1 function, which has been moved to the preprocessing folder. Unit test was added for this function.
Identification of noise peaks has been removed; functions downstream in the pipeline which use the information on noise peaks have been modified.
The new FillMissing procedure is much simpler than before, while giving better results. All filled-in intensities are now centered around the same threshold value and intensities cannot be negative. The new method is also faster.