Searching R Packages


 * This article (a) summarizes what the authors know of available search capabilities for the R programming language and (b) invites readers to contribute ideas for improvement. It is placed on Wikiversity and currently listed as a “research project” to encourage a wide discussion of the issues it raises, moderated by the Wikimedia rules that invite contributors to “be bold but not reckless,” contributing revisions written from a neutral point of view, citing credible sources, and raising other questions and concerns on the associated “Discuss” page. Your contribution(s) to this article may help transform it from a dream into a very useful reality.


 * initial draft by Spencer Graves with help from John Nash and Julia Silge

As of 2022-04-14, there were 19,029 active packages on the Comprehensive R Archive Network (CRAN). On July 7, 2017, John Nash noted “There are now over 9000 packages on CRAN, with many more in Bioconductor, on Github, and other repositories. How can or should R users navigate this large and unruly collection of packages to find the tools they need and use them effectively?”

Almost eight years earlier, I had published the “sos” package, which lets users search for packages, not just the individual help pages that the earlier RSiteSearch{utils} function returned.

However, “sos” is still a command-line solution and has largely been replaced by newer tools like the CRANsearcher addin for RStudio, crantastic, and RDocumentation. Only 2.2 percent of respondents in Julia Silge's recent survey said they used “R packages built for search such as the sos package.” Some “sos” features could be improved, but R users might benefit more if that effort went into improving the more popular search capabilities.

Summary table of search capabilities devoted to R
The following table summarizes our understanding of search capabilities devoted to R. The "base::readLines, vkR::getURLs" column summarizes the results of searching for those two terms in the existing search alternatives. The benchmarking done here suggests a strong preference for RDocumentation for most web-based searches, followed by Rseek. The sos package can create an Excel workbook with summary results by package. However, Jonathan Baron plans to stop maintaining his "RSiteSearch" database next year, because other options are better. This will also make the RSiteSearch{utils} function and the sos package obsolete unless someone modifies them to use one of the other existing databases, e.g., RDocumentation.
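As a rough illustration of what a summary "by package" involves, the sketch below aggregates a small, invented table of search hits using base R only. The data frame `hits`, its column names, the helper `summarize_by_package`, and the sort order are all assumptions made for this example; this is not the sos API, which provides its own summary and export functions.

```r
# Hypothetical search hits: one row per matching help page.
# Package names, function names, and scores are invented for illustration.
hits <- data.frame(
  Package  = c("pkgA", "pkgA", "pkgB", "pkgC", "pkgC", "pkgC"),
  Function = c("f1", "f2", "g1", "h1", "h2", "h3"),
  Score    = c(40, 25, 90, 10, 15, 5),
  stringsAsFactors = FALSE
)

# Summarize hits by package: number of matches, best score, total score.
summarize_by_package <- function(hits) {
  scores <- split(hits$Score, hits$Package)
  out <- data.frame(
    Package    = names(scores),
    Count      = unname(vapply(scores, length, integer(1))),
    MaxScore   = unname(vapply(scores, max, numeric(1))),
    TotalScore = unname(vapply(scores, sum, numeric(1))),
    stringsAsFactors = FALSE
  )
  # Assumed ordering: most matching help pages first, ties broken by best score.
  out <- out[order(-out$Count, -out$MaxScore), ]
  rownames(out) <- NULL
  out
}

by_package <- summarize_by_package(hits)
print(by_package)
```

A table like `by_package` could then be written to a spreadsheet or compared across search engines, since it reduces many help-page hits to one row per package.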

Questions to be considered in a proposal for improvement
Key questions from this comparison:
 * Might it be worth the effort to build a common database and search engine used by all with different defaults and options tailorable by users? The R Foundation might fund something like this if the concept were sufficiently well defined and compelling.
 * One of the simplest parts of such a system might be to share the user reviews between crantastic and CRANsearcher.
 * What might people want done with download statistics?
 * Task Views might include user ratings and download statistics.
 * Data on actual usage could be obtained from users who explicitly agree to having R monitor their usage of different packages. This facility could document which packages were tried, how long each was used, what errors were reported, and what package was used next. Data like these could be used to identify users switching between different related packages. For example, I recently tried gnumeric and quickly switched to readODS when I found that gnumeric required other software that I did not seem to have. Maintainers of gnumeric and readODS might be able to use information like this to improve both packages. Analyses of such data might be portrayed in a network diagram. Such a system would also allow users to turn this feature off and on at any time.
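
The package-switching analysis in the last item above can be sketched with a small, invented usage log. Everything here is hypothetical: the monitoring facility is only a proposal, and the `usage_log` layout, the user IDs, and the helper `count_switches` are assumptions made for illustration. The result is an edge list (from, to, count), which is the kind of table a network diagram could be drawn from.

```r
# Hypothetical opt-in usage log: one row per package a user tried, in order.
# The gnumeric -> readODS switch mirrors the example in the text;
# the users and the remaining rows are invented.
usage_log <- data.frame(
  user    = c("u1", "u1", "u2", "u2", "u3", "u3", "u3"),
  step    = c(1, 2, 1, 2, 1, 2, 3),
  package = c("gnumeric", "readODS", "gnumeric", "readODS",
              "readODS", "gnumeric", "readODS"),
  stringsAsFactors = FALSE
)

# Count how often a user moved from one package to a different one.
count_switches <- function(log) {
  log <- log[order(log$user, log$step), ]
  n <- nrow(log)
  # Pair each row with the next row, keeping pairs from the same user only.
  same_user <- log$user[-1] == log$user[-n]
  from <- log$package[-n][same_user]
  to   <- log$package[-1][same_user]
  keep <- from != to
  edges <- as.data.frame(table(from = from[keep], to = to[keep]),
                         responseName = "n", stringsAsFactors = FALSE)
  edges <- edges[edges$n > 0, ]
  edges <- edges[order(-edges$n), ]
  rownames(edges) <- NULL
  edges
}

switches <- count_switches(usage_log)
print(switches)
```

In this made-up log, switches from gnumeric to readODS outnumber switches in the other direction, which is the sort of asymmetry maintainers of both packages might want to investigate.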

These notes are being published on Wikiversity to invite anyone to add their own thoughts, either directly in this article or in the associated “Discuss” page.

Now it's your turn, dear reader: What would you like to see in a search capability for R?

Acknowledgements
This discussion was inspired by the plenary session on "Navigating the R package universe" at the international useR! 2017 conference in Brussels, Belgium, July 4-7, 2017.