Review for "FMRIPrep: a robust preprocessing pipeline for functional MRI"

Completed on 20 May 2018 by Samuel A. Nastase .



Significance

Esteban and colleagues introduce fMRIPrep, a robust, state-of-the-art preprocessing pipeline for BIDS-formatted fMRI data. FMRIPrep uses Nipype to construct a workflow adapted to the structure of your dataset (specified in the BIDS metadata), drawing largely on existing analysis tools at each processing stage (e.g., FSL, ANTs, FreeSurfer, AFNI). I think fMRIPrep is a solution to an underappreciated problem: idiosyncratic lab- and often study-specific in-house pipelines, which tend toward overfitting, require considerable manual intervention, and impede reproducibility. The manuscript is well-written and clearly describes both the software and the motivations for using it. I have a few comments—largely non-critical—related more to features of the software and documentation than to the manuscript itself, as well as some language suggestions.


Comments to author

Author response.

In Figure 5, the histogram on the right was a bit confusing to me at first. If I understand correctly, the histogram relates to the vertical color bar, so the heavy tail for feat indicates more yellow (high variability) voxels in the brain mask(?). I assume that voxels are what’s being counted and binned in the histogram? Also why is the main peak of the distribution for fMRIPrep somewhat higher variability than the main peak for feat? An additional explanatory sentence in the caption could help clear this up.

We acknowledge the need for further clarification in the caption regarding the histogram. We have updated the caption accordingly as follows:

... within the brain mask. The color bar maps the bins of the histogram to the corresponding colors used in the left panel. The heavier tail of feat's variability distribution is compensated in fMRIPrep by a peak in voxel counts located at higher variability than feat's main peak. This peak in fMRIPrep's distribution corresponds to variability in cortical gray-matter regions.

I imagine a majority of researchers will be using fMRIPrep on a server at their institution (not locally), and therefore using Singularity. The “installation” documentation for this seems a bit arcane (or outdated), as something simple like “singularity build fmriprep.sqsh docker://poldracklab/fmriprep” does the trick.

We thank the reviewer for the valuable feedback. We have addressed the shortcomings of the documentation in the following Pull Request https://github.com/poldracklab/fmriprep/pull/1063.

Also related to using fMRIPrep on a remote server: I think for some people it’s not immediately obvious how to efficiently run a browser (locally) to view files living on a server. For example, one way is to use macOS Finder’s Go > “Connect to Server…” functionality, but there are several ways—might be worthwhile to provide a pointer to help new users past this potential hurdle.

In total agreement with this suggestion, we have now updated our documentation regarding this topic: http://fmriprep.readthedocs.io/en/latest/usage.html#not-running-on-a-local-machine-data-transfer

What are reasons for NOT using fMRIPrep? (Aside from the issues you mention at line 328.) I genuinely tried to think hard on this, but all I could come up with is: 1. If you really want unlimited flexibility (which is obviously a double-edged sword); 2. If you want students to suffer through implementing each step for didactic purposes, or to learn shell-scripting or Python along the way; 3. If you’re trying to reproduce some in-house lab pipeline. I’m not sure any of these are particularly good reasons, but it’s a question I’ve been asked—useful to consider.

In total agreement with this suggestion, we have now updated our documentation regarding this topic: http://fmriprep.readthedocs.io/en/latest/index.html#limitations-and-reasons-not-to-use-fmriprep

It’d be nice to estimate the prospective memory footprint (and/or runtime) when initializing fMRIPrep. Not sure if this is feasible (or if it’s already happening and I just didn’t realize), but it seems possible given how stereotyped the pipeline is and how much is already specified in the BIDS metadata. Would be useful when submitting jobs that require you to provision these things ahead of time (e.g., via Slurm).

Currently, fMRIPrep provides the --mem_mb flag to set an upper bound on memory consumption at any given time. This feature was made possible by several contributions to Nipype, triggered by the development of fMRIPrep, with the objective of improving resource profiling and management.

The early steps of fMRIPrep estimate the file size of the heaviest BOLD time series across all sessions and runs, with the exact purpose suggested by the reviewer. Additionally, the most memory-consuming nodes are tagged with the amount of physical memory they require, which enables a prospective estimate of the memory footprint. However, for a complete estimation, it would also be necessary to profile the virtual memory usage of these memory-heavy nodes within the workflow: https://github.com/poldracklab/fmriprep/issues/857

Some users have also focused on this issue from different perspectives, as demonstrated in NeuroStars.org: https://neurostars.org/t/how-much-ram-cpus-is-reasonable-to-run-pipelines-like-fmriprep/1086/5
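To make the file-size-based idea concrete, here is a minimal sketch of how a prospective memory estimate for a BOLD run might be derived from its voxel grid; the `overhead` multiplier and the function name are hypothetical illustrations, not fMRIPrep's actual heuristic:

```python
def estimate_bold_mem_mb(shape, dtype_bytes=4, overhead=3.0):
    """Rough prospective memory estimate for one BOLD run.

    shape: (x, y, z, t) voxel grid of the time series.
    dtype_bytes: bytes per voxel value (4 for float32).
    overhead: hypothetical multiplier standing in for the
        intermediate copies a workflow node may hold in memory.
    Returns an estimate in megabytes.
    """
    n_values = 1
    for dim in shape:
        n_values *= dim
    return n_values * dtype_bytes * overhead / (1024 ** 2)

# Example: a 96 x 96 x 60 grid with 300 volumes
mem_mb = estimate_bold_mem_mb((96, 96, 60, 300))
```

An estimate of this kind, computed at workflow-initialization time, is what would let a user provision a Slurm job (e.g., via --mem) before launching the pipeline.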

Any plans to incorporate options for study-specific anatomical template construction or EPI template normalization (via Calhoun et al., 2017)? Never tried these personally, but may be of interest to some.

Indeed, we plan to add this feature in the near future: https://github.com/poldracklab/fmriprep/issues/620

More BIDS-related, but seems like the automated methods section (plus references) could also basically write the “Image acquisition” section based on BIDS metadata.

This is an excellent suggestion. This feature has already been implemented in the pybids library and will be incorporated into fMRIPrep in the near future: https://github.com/poldracklab/fmriprep/issues/1176
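As an illustration of the idea, a sketch that drafts an acquisition sentence from a BIDS JSON sidecar; the field names follow the BIDS specification, but the function and the generated wording are hypothetical, not pybids' actual implementation:

```python
import json

def acquisition_sentence(sidecar):
    """Compose a draft 'Image acquisition' sentence from a BIDS
    JSON sidecar dict (field names per the BIDS specification)."""
    parts = []
    tr = sidecar.get("RepetitionTime")   # seconds, per BIDS
    if tr is not None:
        parts.append(f"TR = {tr} s")
    te = sidecar.get("EchoTime")         # seconds, per BIDS
    if te is not None:
        parts.append(f"TE = {te * 1000:g} ms")
    seq = sidecar.get("PulseSequenceType", "EPI")
    return (f"Functional images were acquired with a {seq} "
            f"sequence ({', '.join(parts)}).")

# Example sidecar, as would be loaded with json.load(open(path))
meta = {"RepetitionTime": 2.0, "EchoTime": 0.03,
        "PulseSequenceType": "gradient-echo EPI"}
sentence = acquisition_sentence(meta)
```

A full version would iterate over the dataset's sidecars (e.g., via pybids) and aggregate parameters across runs, but the principle is the same: the metadata required by BIDS already contains most of what an "Image acquisition" paragraph reports.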

Maybe I missed this, but is the code for benchmarking fMRIPrep against FSL’s feat (plus flame) available online somewhere? (i.e., relating to Figures 5 and 6)

The analysis workflows of the experiment are linked on page 9 (see footnote) of the Online Methods document (supplemental materials), as are the final visualizations. For convenience, the following link (https://github.com/poldracklab/fmriprep-notebooks/tree/master) gives access to the Python notebooks used in this analysis.

Line 74: “fMRI for analysis” > “fMRI data for analysis” reads better to me

Thank you for the suggestion, it has been included.

Line 79: “composed by” > “composed of”

Thank you for the suggestion, it has been included.

Table 1: “State-of-art” > “State-of-the-art”

Thank you for the suggestion, it has been included.

Line 123: I *think* you want the word “template” here instead of “atlas”

Thank you for the suggestion, it has been included.

Line 143: “Such cortical mask” > “Such a cortical mask” or “This cortical mask”

Thank you for the suggestion, it has been included.

Line 180: Not sure it’s fair to say afni_proc.py only runs on volumetric data. It can also operate on surfaces (example 8 in afni_proc.py documentation), but you have to provide existing FreeSurfer surfaces—it doesn’t invoke FreeSurfer to construct them for you.

The reviewer is right, this claim is unfair. We have updated the text, which now reads as follows:

Conversely, C-PAC and feat are volume-based only. Although afni_proc.py is volume-based by default, pre-reconstructed surfaces can be manually set for sampling the BOLD signal prior to analysis.

Line 194: “implements “fieldmap-less” SDC to the BOLD images” sounds strange to me. Maybe “applies … to the ...” instead

Thank you for the suggestion, it has been included.

Line 209: “proposed as a mean” > “proposed as a means”

Thank you for the suggestion, it has been included.

Line 285: “without smoothing step” > “without the smoothing step”

Thank you for the suggestion, it has been included.

Figure 3: “robuster” > “more robust” in bottom example text

Thank you for the suggestion, it has been included.

Figure 6 caption: “fMRIPrep allows the researcher for a finer control” > “fMRIPrep affords the researcher finer control”

Thank you for the suggestion, it has been included.

Thanks to Matteo Visconti di Oleggio Castello, Feilong Ma, and Yarik Halchenko for help using fMRIPrep, as well as Mark Pinsk and members of the Hasson and Norman Labs at Princeton for helpful discussion. NB: I’m an AFNI user and less familiar with the FSL and ANTs functionality.

Calhoun, V. D., Wager, T. D., Krishnan, A., Rosch, K. S., Seymour, K. E., Nebel, M. B., ... & Kiehl, K. (2017). The impact of T1 versus EPI spatial normalization templates for fMRI data analyses. Human Brain Mapping, 38(11), 5331-5342.