Math in the time of coronavirus

While many businesses have seen significant downturns during the pandemic, several scholarly publishers have reported increases in submissions in the first part of 2020. There are various studies and articles about the phenomenon. Meanwhile, at Mathematical Reviews, we have not noticed any significant increase in the number of publications in mathematics during 2020. Note that we count publications, whereas the announcements and studies just mentioned count submissions. In what follows, I look at some data about publications in mathematics in 2020, with comparisons to 2019. The data come from four sources: Dimensions, Web of Science, the Mathematical Reviews Database, and the arXiv.

A Scholarly Kitchen post by Christos Petrou links to announcements about exceptional growth in submissions from three publishers: SpringerNature reported an 11% increase in submissions in the first six months, Elsevier reported 25% growth in submissions to their subscription journals from January to September, and Wiley reported a 13% increase for their fiscal year 2020, which ended April 30, 2020. The large study by Squazzoni, Bravo, Grimaldo, Garcia-Costa, Farjam, and Mehmani looked at hundreds of thousands of submissions to Elsevier journals between February and May in the three years 2018, 2019, and 2020. They found that the number of submissions to all Elsevier journals in February-May 2020 increased 58% over the number of submissions in the comparable four-month period in 2019. For health and medicine journals, the increase was 92%. Their paper does not address submission rates in mathematics.

There are also reports that men’s productivity has increased more than women’s productivity during the pandemic. The study by Squazzoni et al. looked at this in detail; indeed, it is their focus. They found a statistically significant negative effect on women’s submission rates in three of the four large research areas they studied. Curiously, in some categories in their study, submissions by women went up, but those increases were far outpaced by the increases in submissions by men.

One hypothesis mentioned in the studies is that many scientists, unable to work in their laboratories, are finding time to write papers. This seems plausible, and would also explain why the phenomenon is not showing up in the mathematics literature.

Growth in publications in all subjects

The Scholarly Kitchen post by Christos Petrou mentioned above analyzed publication rates using data available via the free version of Dimensions, which is a tool for discovering and analyzing research developed by and available from Digital Science.  He restricted his analysis to looking only at publications from the ERA 2018 journal list, which is a curated list from the Australian Research Council.  Petrou considered further restrictions to publications from ERA 2018 journals that are also in the Directory of Open Access Journals (DOAJ) and to ERA 2018 journals not in the DOAJ.  He also restricted to ERA 2018 journals in the Nature Index.  (It seems that the Nature Index journals are a proper subset of the ERA 2018 list.)  The data I collected from Dimensions for Nature Index journals were very different from what Petrou reported.  I don’t know why.  In any case, I am omitting any analysis for Nature Index journals.

At the time of Petrou’s blog post (November 19, 2020), the year 2020 hadn’t finished yet, so he made forecasts about how the year would finish.  His data from Dimensions appear to have been from October 29, 2020, based on his descriptions of his low and high estimates.  Petrou concluded that there has been an unusual growth in publications in the sciences in 2020 and that there has not been a similar growth in the arts and humanities.  He based this on comparisons of the Compound Annual Growth Rate (CAGR) for the years 2013 to 2019 with the estimated growth rate for 2020.  The formula for the CAGR for this period is $(1/6)\ln(N_{2019}/N_{2013})$ where $N_{2019}$ is the count of publications in 2019 and $N_{2013}$ is the count of publications in 2013.  In just about every category, the estimates are well above the CAGR.
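To make the formula concrete, here is a minimal Python sketch. The function names are made up for illustration, and the input counts are the arXiv mathematics submissions (with cross-lists) for 2013 and 2019 from the table later in this post. The first function is the logarithmic formula given above. The second is the more common geometric definition of a compound annual growth rate, $(N_{2019}/N_{2013})^{1/6} - 1$, included only for comparison.

```python
import math

def log_growth_rate(n_start, n_end, years):
    """Average annual growth rate as defined in the text: (1/years) * ln(n_end / n_start)."""
    return math.log(n_end / n_start) / years

def geometric_cagr(n_start, n_end, years):
    """The more common CAGR definition: (n_end / n_start) ** (1/years) - 1."""
    return (n_end / n_start) ** (1 / years) - 1

# arXiv mathematics submissions, with cross-lists (from the table later in this post)
n_2013, n_2019 = 25_742, 38_055

log_rate = log_growth_rate(n_2013, n_2019, 6)
geo_rate = geometric_cagr(n_2013, n_2019, 6)
print(f"logarithmic: {log_rate:.1%}")
print(f"geometric:   {geo_rate:.1%}")
```

For these counts the logarithmic version gives 6.5%, matching the CAGR reported in the arXiv mathematics table, while the geometric version gives about 6.7%. The two definitions agree closely whenever annual growth is only a few percent.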

Now that the year is done, it is possible to use complete data, rather than estimates. Petrou’s estimates were generally high, which might reflect increased activity at the start of the year, followed by a slowing in the last couple of months.  Repeating his analysis with data from the full year, one can still come to similar conclusions in most cases:  even though the actual growth rates for 2020 were generally lower than the estimates, they were greater than the CAGR.

                               % change in 2020
All Subjects         CAGR   Low est.   High est.   Actual
ERA 2018             3.1%   17%        26%         7.5%
ERA 2018 and DOAJ    9.4%   16%        26%         17.5%
ERA 2018 not DOAJ    2.0%   17%        26%         5.3%

Looking at the growth rates for the last three years indicates that the growth rates for 2020 are certainly up, but they are not quite as anomalous as the comparison with the CAGR would indicate.  The growth rates for all of the last three years are above the CAGR.  The high growth for 2020 looks like it is part of a trend, rather than a deviation.

All Subjects         CAGR   % change in 2018   % change in 2019   % change in 2020
ERA 2018             3.1%   4.1%               4.6%               7.5%
ERA 2018 and DOAJ    9.4%   12.2%              15.6%              17.5%
ERA 2018 not DOAJ    2.0%   2.7%               2.5%               5.3%

Graphing the counts and the growth rates for several years shows that two of the rates have been increasing for the last three years. The third category, ERA 2018 not DOAJ, was up dramatically for 2020, but that rate has seen a lot of fluctuation since 2010.

Graph of counts by year and % change by year for all subjects from the ERA 2018 list of journals

Graph of counts and growth rates for All subjects in ERA 2018 and DOAJ list

Graph of counts and growth rates for All subjects in ERA 2018 not DOAJ list

Comment

I find it curious that Christos Petrou’s predictions for 2020 were so far above the actuals. His method seems reasonable. It is possible that there was a surge in the early part of the year that died off. Looking at the monthly data for submissions to the arXiv and comparing the same months in 2018, 2019, and 2020, mathematics (excluding cross-listings) had a large year-to-year increase (36.6%) for June 2020. Physics had a large increase spread out over June and July 2020 (18.1% and 14.7%). Other months had much smaller year-to-year increases, or even decreases. Now, arXiv data represent submissions rather than publications, but they provide a hint.

Gabor Schubert pointed out in the comments to Petrou’s post that you can query Crossref for publications by month using their API. Crossref is an official DOI Registration Agency of the International DOI Foundation and has metadata about almost every online publication. The publication date is provided to Crossref by the publisher. The analysis by Petrou was intentionally focused on a strict subset of the scholarly literature. The Crossref data, coming from every publisher registered with Crossref, are from a much larger set.
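For readers who want to try this, here is a minimal sketch of such a query in Python. It is not necessarily the query used for the tables in this post. It assumes the Crossref REST API's `works` endpoint with the `type`, `from-pub-date`, and `until-pub-date` filters, and it sets `rows=0` so that only the total count comes back. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"

def crossref_count_url(start_date, end_date, work_type="journal-article"):
    """Build a Crossref REST API URL that asks only for a count (rows=0)."""
    filters = ",".join([
        f"type:{work_type}",
        f"from-pub-date:{start_date}",
        f"until-pub-date:{end_date}",
    ])
    query = urllib.parse.urlencode({"filter": filters, "rows": 0})
    return f"{CROSSREF_WORKS}?{query}"

def crossref_count(start_date, end_date):
    """Fetch the number of matching records (needs network access)."""
    with urllib.request.urlopen(crossref_count_url(start_date, end_date)) as resp:
        return json.load(resp)["message"]["total-results"]

if __name__ == "__main__":
    # Only builds and prints the URL; call crossref_count(...) to actually query.
    print(crossref_count_url("2020-01-01", "2020-10-31"))
```

Counts obtained this way can drift over time, since publishers continue to deposit and correct metadata, so a query run today need not reproduce the figures below exactly.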

Looking at the Crossref data for journal articles for January 1 to October 31 gives:

Jan to Oct Count % Change
2016 2,389,254
2017 2,529,461 5.9%
2018 2,719,713 7.5%
2019 2,932,805 7.8%
2020 3,189,889 8.8%

Here are the Crossref counts for the full years:

Jan to Dec Count % Change
2016 3,425,201
2017 3,607,850 5.3%
2018 3,870,301 7.3%
2019 4,181,142 8.0%
2020 4,449,468 6.4%

Indeed, the growth for the whole year was much lower than what the first ten months would have predicted.

Looking more closely at the Crossref data, one sees that monthly publication rates increase from January to December each year.  Indeed, the publication rate for the second half of any of the five years 2016 to 2020 is 37% to 39% higher than the publication rate for the corresponding first half of the year.  Publication rates in the last two months of the year tend to be about 1.8 times as high as average monthly publication rates in the same year.  The year 2020 was an exception in that the November and December publication rates were only 1.7 times as high as the average monthly publication rates for 2020.  In other words, it appears that Christos Petrou’s predictions for 2020 may have been thrown off by a smaller increase in November and December than usual.
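The November-December figures can be recovered from the two Crossref tables above by subtraction. Here is a short Python sketch of that arithmetic. The variable and function names are made up for illustration, and the counts are copied from the tables.

```python
# Crossref journal-article counts, copied from the two tables above
jan_to_oct = {2016: 2_389_254, 2017: 2_529_461, 2018: 2_719_713,
              2019: 2_932_805, 2020: 3_189_889}
full_year = {2016: 3_425_201, 2017: 3_607_850, 2018: 3_870_301,
             2019: 4_181_142, 2020: 4_449_468}

def nov_dec_ratio(year):
    """Monthly rate in November-December divided by the average monthly rate for the year."""
    nov_dec_monthly = (full_year[year] - jan_to_oct[year]) / 2
    average_monthly = full_year[year] / 12
    return nov_dec_monthly / average_monthly

ratios = {year: nov_dec_ratio(year) for year in full_year}
for year, ratio in ratios.items():
    print(year, f"{ratio:.2f}")
```

The ratios for 2016 through 2019 all round to 1.8, while the ratio for 2020 rounds to 1.7.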

Growth in publications in mathematics

Dimensions data

Let’s begin by looking at a subset of the data set used by Petrou.  Using the Dimensions data for publications just in mathematics (Field of Research = 01 Mathematical Sciences) gives noticeably different trends from the data for all subjects.

Mathematics          CAGR   % change in 2018   % change in 2019   % change in 2020
ERA 2018             2.3%   6.2%               7.1%               8.0%
ERA 2018 and DOAJ    5.6%   25.8%              24.7%              36.8%
ERA 2018 not DOAJ    1.9%   3.9%               4.7%               3.3%

Again, it is instructive to look at graphs of the counts and growth rates since 2010.  In each category, the growth rate for 2020 beats the CAGR, but looking at the trends in the growth rates, these do not look exceptional.

Graphs of % growth and counts for math using Dimensions data for all ERA 2018 journals

Graphs of % growth and counts for math using Dimensions data for all ERA 2018 journals that are also DOAJ journals

Graphs of % growth and counts for math using Dimensions data for all ERA 2018 journals that are not DOAJ journals

Once again, the growth rates in the various categories for 2020 are above the CAGR, but so are the growth rates for 2018 and 2019.

Web of Science data

I searched Web of Science to look for trends.  I restricted the search to their Science Citation Index Expanded (SCI-EXPANDED) collection, with document type = Article, and subject area = mathematics.  The results are in the following table:

Year Count Growth rate
2010 48,596 -1.3%
2011 51,300 5.6%
2012 54,570 6.4%
2013 57,428 5.2%
2014 58,720 2.2%
2015 58,089 -1.1%
2016 57,876 -0.4%
2017 60,306 4.2%
2018 62,063 2.9%
2019 69,585 12.1%
2020 73,715 5.9%

The CAGR for 2013 to 2019 is 3.2%. The growth for 2020 is not particularly remarkable within the time frame. The exceptional year was 2019, with a 12.1% increase over 2018, but that growth rate was not sustained in 2020, dropping by half. The graph of the counts and growth rates is below:

Graphs of counts and growth rates of mathematics within Web of Science
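The growth rates in the Web of Science table are simple year-over-year percentage changes, which can be recomputed from the counts. Here is a minimal Python sketch using the last four rows of the mathematics table. The function name is made up for illustration.

```python
# Web of Science counts for mathematics, copied from the table above
counts = {2017: 60_306, 2018: 62_063, 2019: 69_585, 2020: 73_715}

def pct_change(counts):
    """Year-over-year percentage change for each year whose predecessor is present."""
    return {
        year: 100 * (counts[year] / counts[year - 1] - 1)
        for year in sorted(counts)
        if year - 1 in counts
    }

growth = pct_change(counts)
for year, rate in growth.items():
    print(year, f"{rate:.1f}%")
```

This gives 2.9% for 2018, 12.1% for 2019, and 5.9% for 2020, so the 2020 rate is roughly half the 2019 rate.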

For comparison, here are the data for physics in Web of Science.

Year Count Growth rate
2010 118,671 0.4%
2011 126,177 6.3%
2012 126,845 0.5%
2013 132,505 4.5%
2014 133,093 0.4%
2015 133,948 0.6%
2016 134,985 0.8%
2017 136,760 1.3%
2018 142,437 4.2%
2019 146,906 3.1%
2020 149,329 1.6%

Counts and growth rates for physics articles from Web of Science

Mathematical Reviews data

At Mathematical Reviews, we receive journal articles, conference proceedings, and books in various ways.  Some are delivered to us electronically by the publishers.  Some we retrieve electronically.  Some are still delivered in paper.  I want to focus on the counts of journal articles being delivered to Mathematical Reviews by the publishers.  Our operations were interrupted by the pandemic.  Therefore, including the counts of papers we retrieved would introduce confounding variables, which I prefer to avoid.  I only have consistent data for these counts starting in 2017.  Below is a graph of the counts by quarter from 2017 to 2020.  It is evident from the graph that there were no extraordinary increases in 2020.

Delivered electronic articles to Math Reviews 2017 to 2020

The growth rates by year are

Year % change from previous year
2018 8.3%
2019 0.3%
2020 2.4%

arXiv data: Mathematics

Finally, let’s look at some data from the arXiv. The arXiv is quite generous about making their data available. In some cases, you can download a CSV file with data, or you can read it off tables. There were some types of data, though, where I had to do repeated searches to obtain what I was after. Even so, it is a fairly easy process. One benefit of these data is that we can count submissions.

Looking at all subjects in the arXiv over time, the growth in submissions is clearly consistent and exponential.  There are confounding factors, though.  For instance, over the years, the subjects covered by the arXiv have expanded.  Even so, looking at the data since 2010, it does not seem that 2020 was extraordinary.  Indeed, last year’s growth rate was roughly the same as the growth rate in 2018.

Year Count Rate
2010 70,131 9.5%
2011 76,578 9.2%
2012 84,603 10.5%
2013 92,641 9.5%
2014 97,517 5.3%
2015 105,280 8.0%
2016 113,380 7.7%
2017 123,523 8.9%
2018 140,616 13.8%
2019 155,866 10.8%
2020 178,329 14.4%

Graph of submissions to the arXiv in all subjects from 2010 to 2020, including growth rates

Looking at counts of submissions in mathematics was more tedious.  The arXiv provides visual information here, but they do not provide a CSV file of the data.  Therefore, I used the arXiv’s advanced search feature, searching a year at a time.  It is possible to count papers in two ways: preprints that are only listed under mathematics and preprints that are in mathematics, but also cross-listed in another subject.  The data for both searches, as well as the annual growth rates, are in the following table.

arXiv math
        With cross-lists         No cross-lists
Year    Count     Growth rate    Count     Growth rate
2009    14,928                   13,919
2010    17,626    18.1%          15,958    14.6%
2011    20,202    14.6%          18,174    13.9%
2012    22,946    13.6%          20,483    12.7%
2013    25,742    12.2%          23,123    12.9%
2014    27,693    7.6%           25,059    8.4%
2015    30,002    8.3%           27,057    8.0%
2016    31,488    5.0%           28,512    5.4%
2017    33,407    6.1%           30,196    5.9%
2018    35,416    6.0%           32,019    6.0%
2019    38,055    7.5%           33,581    4.9%
2020    41,027    7.8%           35,371    5.3%
CAGR              6.5%                     6.2%

Looking at the data, the counts and the growth rates for 2020 do not stand out. Indeed, the CAGR (2013 to 2019) with cross listings is 6.5%, a bit below the observed rate of 7.8% for 2020. The CAGR without cross listings is 6.2%, which is actually higher than the observed rate of 5.3% for 2020.

Here is a graphical representation of the same data:

Graphs of counts and growth rates for math submissions in the arXiv from 2010 to 2020

arXiv data: Physics

For comparison, I looked at similar data for physics in the arXiv. The 2020 growth rates for physics submissions to the arXiv, both with and without cross listings, were higher than in recent years, and greater than the CAGRs for the two categories. The growth rate for physics in the arXiv has a lot of variation, though, making it hard to conclude whether 2020 was an exceptional year or just a very good year.

arXiv physics
        With cross-lists         No cross-lists
Year    Count     Growth rate    Count     Growth rate
2009    45,567                   44,253
2010    47,831    5.0%           45,989    3.9%
2011    51,115    6.9%           48,959    6.5%
2012    53,378    4.4%           50,825    3.8%
2013    55,844    4.6%           53,349    5.0%
2014    56,944    2.0%           54,412    2.0%
2015    59,742    4.9%           57,076    4.9%
2016    61,598    3.1%           58,659    2.8%
2017    62,557    1.6%           59,615    1.6%
2018    65,937    5.4%           62,807    5.4%
2019    69,009    4.7%           65,230    3.9%
2020    74,181    7.5%           69,434    6.4%
CAGR              3.5%                     3.4%

Graphs of submissions to the arXiv in physics, both counts and annual growth rates

I also looked at monthly submission rates for mathematics and for physics for 2018 to 2020.  There are definitely peaks and valleys in the monthly data, with both mathematics and physics having peaks around June – July 2020.

Chart of monthly submissions to the arXiv in math and in physics for 2018 to 2020

bioRxiv data

I also looked at data from the bioRxiv.  Obtaining a list of monthly counts of submissions was fairly easy, using their API.  It is hard to interpret the data for this context, though.  For one thing, the bioRxiv has only been operating since November 2013.  Acceptance of the bioRxiv for preprints in biology and medicine is not yet as strong as the acceptance of the arXiv by mathematicians and physicists.   Mostly what the table shows is the growth of acceptance of the bioRxiv.   For completeness, here are the counts and the growth rates from the bioRxiv.

Year New papers Growth rate
2013 109
2014 886 713%
2015 1,774 100%
2016 4,721 166%
2017 11,347 140%
2018 20,783 83%
2019 29,181 40%
2020 38,717 33%

Conclusions

There does not seem to have been an extraordinary surge in mathematics papers in 2020 compared with previous years.  As mentioned already, it is possible that some disciplines had increases because researchers without access to their labs had more time available to write.  A survey of SpringerNature authors indicated that, as a result of lockdowns, 65% of them intended to re-use their own data.  About half of the respondents thought it is extremely or somewhat likely that the lockdown will result in the re-use of open data from other labs.  The majority of mathematicians do not need to generate data in a laboratory to work.  This may help to explain why mathematicians are not finding so much extra time to be writing papers during the pandemic.


Mathematical aside

Much of this post discusses changes to growth rates, i.e., second derivatives.
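
To spell this out, here is a sketch that treats the yearly count as a smooth function $N(t)$ of time, which is of course an idealization of annual data. The growth rate $r(t)$ is the logarithmic derivative of $N$, and a change in the growth rate is the derivative of that:

$$ r(t) = \frac{d}{dt}\ln N(t) = \frac{N'(t)}{N(t)}, \qquad r'(t) = \frac{d^2}{dt^2}\ln N(t). $$

So, strictly speaking, the second derivative in question is that of $\ln N$ rather than of $N$. In the same idealization, the CAGR formula used earlier is the average of $r$ over the period 2013 to 2019:

$$ \frac{1}{6}\int_{2013}^{2019} r(t)\,dt = \frac{1}{6}\bigl(\ln N(2019) - \ln N(2013)\bigr) = \frac{1}{6}\ln\frac{N_{2019}}{N_{2013}}. $$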

About Edward Dunne

I am the Executive Editor of Mathematical Reviews. Previously, I was an editor for the AMS Book Program for 17 years. Before working for the AMS, I had an academic career working at Rice University, Oxford University, and Oklahoma State University. In 1990-91, I worked for Springer-Verlag in Heidelberg. My Ph.D. is from Harvard. I received a world-class liberal arts education as an undergraduate at Santa Clara University.