I would like to announce that I have written a blog post commenting on this paper: https://www.cwts.nl/blog?artic.... The blog post discusses the difficulty of distinguishing between the use of impact factors at the level of journals and at the level of individual papers.
In addition to the comments made in the blog post, I also would like to raise the following issue.
In my view, the skewness of citation distributions can be interpreted in different ways, with different implications for the use of impact factors. Let me give two interpretations:
(1) This interpretation starts from the idea that citations provide a reasonable reflection of the quality of papers. Therefore, the fact that within a single journal there are large differences in the number of citations received by papers indicates that there are large differences in the quality of the papers. Consequently, the impact factor of a journal doesn’t properly reflect the quality of individual papers in the journal.
(2) This interpretation combines two ideas. The first idea is that citations are weak indicators of the quality of papers. Papers of similar quality on average have a similar number of citations, but there is a large standard deviation. Due to all kinds of ‘distorting factors’, papers of similar quality may differ a lot in the number of citations they receive. The second idea is that journals manage reasonably well to carry out quality control. The papers published in a journal are therefore of more or less similar quality, so the standard deviation of the quality of the papers in a journal is relatively small. It follows from these two ideas that the impact factor, which is the average number of citations of the papers in a journal, provides a reasonable reflection of the quality of individual papers in the journal (especially if the journal is sufficiently large, so that the above-mentioned ‘distorting factors’ in the citations received by individual papers cancel out). The fact that some papers in a journal receive many more citations than others is then not the result of quality differences; rather, it results from citations being weak indicators of quality, that is, from the above-mentioned ‘distorting factors’. In this interpretation, impact factors are a stronger, not a weaker, indicator of the quality of individual papers than citation counts are.
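To make interpretation (2) concrete, here is a minimal simulation of my own (purely illustrative, not from the paper, and with arbitrary assumed parameters): suppose every paper in a journal has exactly the same underlying quality, and citation counts are that quality multiplied by lognormal noise representing the ‘distorting factors’. The resulting per-paper citation distribution is still highly skewed, even though there are no quality differences at all, while the journal mean (the impact-factor analogue) sits close to the expected citation level.

```python
import random
import statistics

random.seed(42)

# Assumption: all papers share one quality level; citations = quality * noise.
true_quality = 10.0       # arbitrary common quality level (hypothetical units)
papers_per_journal = 500  # a "sufficiently large" journal

# Lognormal noise stands in for the 'distorting factors' in citation counts.
citations = [true_quality * random.lognormvariate(0, 1)
             for _ in range(papers_per_journal)]

mean_c = statistics.mean(citations)      # the impact-factor analogue
median_c = statistics.median(citations)

print(f"journal mean (impact-factor analogue): {mean_c:.1f}")
print(f"median citations per paper: {median_c:.1f}")
print(f"most-cited paper: {max(citations):.1f}")
```

Even with identical quality everywhere, the mean exceeds the median and a few papers collect far more citations than the typical paper, i.e. the distribution is right-skewed. This is exactly why skewness alone cannot discriminate between the two interpretations.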
The interpretation that the authors seem to follow in their paper, and that for instance also seems to be followed in the DORA declaration, is the first one. However, the empirical results presented by the authors, showing that citation distributions are highly skewed, are compatible with both interpretations provided above. In the second interpretation, there is no reason to reject the use of IFs to assess individual papers in a journal. Therefore, if the authors want to reject the use of IFs for this purpose, I believe they need to provide an additional argument to make clear why the first interpretation is more reasonable than the second one. I do think that the first interpretation is indeed more reasonable than the second one, but a careful argument is needed to make clear why this is the case and on which assumptions this is based.