Despite decades of research, the relationship between the quality of science and the value of inventions has remained unclear. We present the result of a large-scale matching exercise between 4.8 million patent families and 43 million publication records. We find a strong positive relationship between the quality of the scientific contributions referenced in patents and the value of the respective inventions. We rank patents by the quality of the science to which they are linked. Strikingly, high-ranking patents are twice as valuable as low-ranking patents, which, in turn, are about as valuable as patents without a direct science link. We show this core result for various science quality and patent value measures. The effect of science quality on patent value remains relevant even when science is linked indirectly through other patents. Our findings imply that what is considered excellent within the science sector also leads to outstanding outcomes in the technological and commercial realms.
The relationship between science and technology has been subject to intense discussions for centuries. Science was largely funded via patronage during the Renaissance, and the separation of public funding for fundamental research from private industrial funding for applied research and commercial innovation efforts only emerged in the 19th century (1, 2). Since the aftermath of World War II, policymakers have relied on the notion that science helps to generate knowledge and information that ultimately contributes to the emergence of new technical and organizational capabilities, improvements in quality of life, and economic growth (3). Vannevar Bush’s vision of a publicly funded science system that feeds into privately organized innovation channels became the blueprint for most of the Western national systems of science funding, research and development, and innovation. This notion has recently come under scrutiny again, as voters have increasingly been demanding evidence on the benefits of science spending. For policymakers and scientists alike, it is paramount to improve the understanding of the impact of science on technical progress and innovation.
The most pertinent form of output delivered by the science sector is publications, which are known to vary widely in quality. While some scientific publications will reach and inspire large numbers of researchers, others are never read or referenced. Measures of scientific quality, such as citation counts or impact factors, are used to make this heterogeneity visible and have become increasingly important in the governance of the science sector. Science governance and science funding seek to promote excellent over more mediocre science output by allocating resources to those researchers and institutions from which outstanding results can be expected.
However, it has been argued that this logic does not take tangible results from technology transfer and commercialization into account. Science is inward-looking, according to these voices. This raises the question as to what extent science output that is considered “excellent” within the science sector can lead to outstanding outcomes in the technological and commercial realms. This paper seeks to contribute new insights into the understanding of this nexus.
We provide evidence that the quality of scientific publications—as commonly assessed in science via citations—is a strong predictor of their relevance for and impact on technology development as documented in patents. We document two main results. First, publications with high scientific quality are vastly more likely to be cited in patent documents and at a higher rate. This confirms the baseline results of previous research going back to Hicks et al. (4) on a substantially larger and more diverse dataset. Second, the value of patents that directly build on science increases monotonically with science quality. These results hold across scientific disciplines, technological areas, and time. Ahmadpoor and Jones (5) recently established that patents more closely related to science are more valuable. We confirm that closeness to science matters; however, this relationship is largely driven by the actual science quality. Considering both dimensions together provides the most comprehensive view of the science quality–patent value relationship.
Our analysis starts from the universe of scientific publications in Web of Science (WoS) from the year 1980 onward, corresponding to approximately 43 million scientific publications. In terms of patents, we consider a sample of more than 4.8 million patent families, comprising all patent families from the database DOCDB with at least one grant publication at the European Patent Office (EPO) or the U.S. Patent and Trademark Office (USPTO) and a first filing date between 1985 and 2012, inclusive. In what follows, our unit of analysis is the patent family, which we also refer to interchangeably as “patents.” The patents protect inventions in developed countries with more than 1 billion inhabitants in total.
Patents reference various types of documents that relate to the protected invention by either determining novelty (prior art) or explaining the content of the underlying invention. These documents, listed on the patent’s front page or in so-called search reports, consist foremost of other patents but also frequently include nonpatent literature (NPL) (6). A subset of the latter consists of references to scientific articles, which we dub scientific NPL (SNPL).
To link patents to publications, we leverage a highly precise and comprehensive match of NPL references in patents with scientific publications in WoS. The NPL references in patents that were successfully linked to scientific publications comprise our set of SNPL references. Around 0.9 million patents were linked to at least one scientific publication via a total of about 7.0 million SNPL references. Of all scientific publications, about 2.2 million figure in this list of SNPL references.
In our core set of analyses, we rely on established measures of scientific quality and patent value. The quality of scientific publications is measured by the number of citations from other scientific publications over a period of 3 years since publication. We define a patent’s SNPL science quality as the quality of the patent’s SNPL references. A patent can reference zero, one, or several scientific articles in the same way that a scientific article can be referenced by zero, one, or many patents. Figure 1 illustrates this setup. When more than one SNPL reference is present, we consider by default only the publication of the highest quality. Patent value is measured by the number of forward patent citations over a period of 5 years from the patent’s first filing date. We use citations by U.S. patents as our first measure of patent value. Our results are robust to alternative choices. We replace citations as science quality measure with the journal impact factor. We replace our aggregation method of the quality of multiple SNPL references with several other options. We replace U.S. patent citations as value measures with a host of alternatives. The Supplementary Materials provide further detailed information on data sources, discuss the use of citations as indicators of relatedness between technology and science, and elaborate on alternative measures of patent value and scientific quality that we use for robustness analyses.
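The measures described above can be illustrated with a minimal sketch. It assumes toy inputs held in plain dictionaries; the function names, the data layout, and the choice to count the end year of the window inclusively are illustrative assumptions and not the data pipeline used for the analysis.

```python
def citation_count_within(base_year, citing_years, window):
    """Count citations received from base_year up to base_year + window.

    Treating the end year as inclusive is an assumption of this sketch.
    """
    return sum(1 for y in citing_years if base_year <= y <= base_year + window)


def snpl_quality(patent_refs, pub_quality, how="max"):
    """Aggregate the quality of each patent's SNPL references.

    patent_refs: dict patent id -> list of referenced publication ids
    pub_quality: dict publication id -> 3-year citation count
    Patents without SNPL references are mapped to None.
    """
    out = {}
    for patent, refs in patent_refs.items():
        scores = [pub_quality[r] for r in refs]
        if not scores:
            out[patent] = None
        elif how == "max":      # default: highest-quality reference
            out[patent] = max(scores)
        elif how == "mean":     # alternative: average over all references
            out[patent] = sum(scores) / len(scores)
        else:
            raise ValueError("how must be 'max' or 'mean'")
    return out


# Hypothetical toy data: publication year and years of citing publications.
pub_year = {"A": 2000, "B": 2001}
pub_citing_years = {"A": [2000, 2001, 2003, 2004], "B": [2002]}
pub_quality = {
    p: citation_count_within(pub_year[p], pub_citing_years[p], window=3)
    for p in pub_year
}

patent_refs = {"P1": ["A", "B"], "P2": ["B"], "P3": []}
quality_max = snpl_quality(patent_refs, pub_quality, how="max")
quality_mean = snpl_quality(patent_refs, pub_quality, how="mean")

# Patent value: forward citations within 5 years of the first filing year.
value_p1 = citation_count_within(2005, [2006, 2009, 2011], window=5)
```

Fixing the citation window at the time of publication, rather than counting all citations to date, keeps the quality measure comparable across publication cohorts of different ages.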
We first explore the selection of scientific publications into the patent realm, i.e., the relationship between science quality and the likelihood that a scientific publication is referenced in a patent. We look at the probability and intensity of referencing, i.e., if any and how many patented inventions refer to a given scientific contribution. We present results for publications below the median (all receiving zero science citations), for publications between the median and the 70th percentile, and the 80, 90, 95, 99 (top 1%), 99.9 (top 1 permille), and 99.99 (top 1 permyriad) percentiles of scientific quality. Figure 2 presents these results; the line plots the share of scientific publications appearing as SNPL references in at least one patent, and the size of the circles indicates the average number of times they appear as SNPL references.
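A minimal sketch of this tabulation is given below. It assumes two parallel lists (3-year citation counts and the number of referencing patent families per publication); the function names and the rule for assigning tied values to bins are assumptions made for illustration only.

```python
import math


def quantile_cutoff(sorted_values, q):
    """Smallest value v such that at least a share q of observations is <= v."""
    idx = max(0, math.ceil(q * len(sorted_values)) - 1)
    return sorted_values[idx]


def referencing_by_bin(quality, patent_counts, quantiles):
    """Share referenced and mean number of referencing patents per quality bin.

    A publication is placed in the bin of the highest quantile whose cutoff it
    strictly exceeds; publications at or below the lowest cutoff get label 0.0.
    The mean is unconditional, i.e., it includes never-referenced publications.
    """
    ordered = sorted(quality)
    cutoffs = [(q, quantile_cutoff(ordered, q)) for q in sorted(quantiles)]
    totals = {}  # label -> [publications, referenced publications, references]
    for value, count in zip(quality, patent_counts):
        label = 0.0
        for q, cutoff in cutoffs:
            if value > cutoff:
                label = q
        stats = totals.setdefault(label, [0, 0, 0])
        stats[0] += 1
        stats[1] += 1 if count > 0 else 0
        stats[2] += count
    return {label: (s[1] / s[0], s[2] / s[0]) for label, s in totals.items()}


# Hypothetical toy data for ten publications.
quality = [0, 0, 0, 0, 0, 1, 2, 3, 5, 40]
patent_counts = [0, 0, 0, 0, 0, 0, 1, 0, 2, 9]
table = referencing_by_bin(quality, patent_counts, quantiles=(0.5, 0.9))
```

Because many publications share the same low citation count, the treatment of ties determines where the lower bins begin; the strict inequality used here keeps all publications with the median citation count in the lowest bin.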
Fig. 2. Science quality is the 3-year citation count from other scientific publications. The patent count is not conditional on appearing as an SNPL reference. Blue shaded areas show 95% confidence intervals around the mean. N = 42,962,463.
We find a remarkably strong positive selection of scientific publications of high scientific quality into SNPL references. Below the median, scientific publications are almost never SNPL references. This share rises to 40% at the top 1% of publications by scientific quality. A staggering majority of publications at the top 1 permille (>60%) and beyond the top 1 permyriad (80%) are referenced in patents. The average number of times they appear as SNPL references in distinct patent families is 8.1 and 23.36, respectively. We emphasize that these results are not due to feedback from important patents to citations of the underlying science. By restricting our measure for scientific citations to the first 3 years after publication, we effectively exclude this bias.
In our main analysis, we investigate the extent to which SNPL science quality is a predictor of patent value. The main figures account for level differences across technology fields and over time: We estimate econometric models that absorb variation across these dimensions with pair-level fixed-effect (FE) controls and graphically present the resulting residual values. In effect, we transform deviations from the technology field and year-specific mean to deviations from the overall mean. In this way, the main results we present graphically account for structural changes over time across technological areas and constitute a baseline correlation with an immediate interpretation.
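With a single set of technology field × filing year FEs, the transformation described above amounts to subtracting the cell mean and adding back the overall mean. The following sketch shows this under that assumption; the function name and the toy data are hypothetical, and the sketch does not reproduce the econometric models estimated for the paper.

```python
from collections import defaultdict


def residualize(values, groups):
    """Replace deviations from the group mean by deviations from the overall mean.

    values: list of numbers (e.g., patent forward citation counts)
    groups: list of hashable group keys (e.g., (technology field, filing year))
    """
    overall = sum(values) / len(values)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] + overall for v, g in zip(values, groups)]


# Hypothetical toy data: two technology fields in the same filing year.
values = [2, 4, 10, 20]
groups = [("A", 2000), ("A", 2000), ("B", 2000), ("B", 2000)]
adjusted = residualize(values, groups)
```

After the transformation, every field-year cell has the same mean as the full sample, so the remaining differences across patents are not driven by level differences between cells.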
The relationship between SNPL science quality and patent value is depicted in Fig. 3A. We plot the average patent value across the distribution of SNPL science quality. As a first measure of patent value, we use the number of patent citations from U.S. patents. Later on, we consider alternatives. As a benchmark level, the figure shows the average value of patents without any SNPL reference (dashed line). We contrast two possible aggregation methods of SNPL science quality. When a patent references multiple scientific articles, we use, in a first variant, the highest-quality reference (orange). We juxtapose a second variant in which we consider the average quality of all references (blue). Top science matters much more: scientific material beyond the highest-quality reference dilutes the science quality–technology value relationship. In the Supplementary Materials, we show that this extends to other aggregation methods that focus on the top of the quality distribution. Consequently, we continue by only considering the highest-quality SNPL reference.
Fig. 3. SNPL science quality is the maximum 3-year citation count across scientific publications appearing as SNPL references in a patent. Patent value is measured as the 5-year count of patent forward citations by U.S. patents. Patent value and science quality are residualized using technology field × first filing year FEs. Shaded areas show 95% confidence intervals around the respective means. (A) When there are multiple patent-paper references, we, by default, use the highest-quality reference (orange). In comparison, we use the average quality (blue). (B) SNPL self-references of the highest-quality SNPL references are considered. (C) Time distance is measured as the lag between the first filing year of the patent and the publication year of the scientific publication in SNPL references with the highest science quality. N = 4,767,844 patents (948,006 with SNPL references).
Previous studies have shown that SNPL references or references to other technical literature are associated with higher-value patents (5, 7, 8). We are able to confirm this finding in our data: The value of patents with SNPL references is higher than or equal to that of patents without SNPL references for any level of SNPL science quality, except the very bottom.
Notably, SNPL science quality fully explains the difference in average value between patents with and without SNPL references. Patent value increases rapidly, and almost monotonically, for a higher level of SNPL science quality. Patents with SNPL references at the bottom of the SNPL science quality distribution are, on average, as valuable as patents without SNPL references. Compared to this group, patents at the top of the SNPL science quality distribution receive more than twice as many forward patent citations. This core result suggests that scientific activities of high quality may lead to the development of highly valuable technologies.
Sometimes, high-quality research and technology development are undertaken by the same individuals or organizations, which may drive the result. Inventors and scientists can perform scientific activities that may lead directly to both scientific and technological outcomes (9). Therefore, we complement this finding by exploring how our results vary when SNPL self-references, whether at the author or institutional level, are considered separately. Figure 3B describes the corresponding results. The line in orange indicates the value of patents with SNPL self-references, i.e., those that overlap at the individual or institutional level. The line in blue describes the value of patents excluding SNPL self-references. The latter presents close to identical results to those obtained in Fig. 3A. Note that for part of the SNPL science quality distribution, with the exception of the very top, patent value is higher when patents with SNPL self-references are excluded. The share of SNPL self-references is roughly similar and, if anything, tends to decrease with higher levels of SNPL science quality. Overall, this is supportive of the idea that high-quality science is linked to high-value technology especially when science and technology are produced by different individuals or organizations.
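The notion of an SNPL self-reference can be sketched as a simple overlap test. The record fields and identifiers below are hypothetical, and the sketch assumes that person and organization names have already been disambiguated into comparable identifiers, which is a substantial simplification.

```python
def is_self_reference(patent, publication):
    """Flag a patent-paper link as a self-reference.

    A link counts as a self-reference if at least one inventor is also an
    author, or at least one applicant is also an author affiliation.
    All field names are illustrative assumptions.
    """
    shared_people = set(patent["inventors"]) & set(publication["authors"])
    shared_orgs = set(patent["applicants"]) & set(publication["affiliations"])
    return bool(shared_people or shared_orgs)


# Hypothetical toy records with pre-normalized identifiers.
patent = {"inventors": ["person:17", "person:42"], "applicants": ["org:acme"]}
paper_same_author = {"authors": ["person:42"], "affiliations": ["org:univ"]}
paper_same_org = {"authors": ["person:99"], "affiliations": ["org:acme"]}
paper_unrelated = {"authors": ["person:99"], "affiliations": ["org:univ"]}

flags = [
    is_self_reference(patent, p)
    for p in (paper_same_author, paper_same_org, paper_unrelated)
]
```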
Our analysis, so far, has focused on patents at the frontier with science, i.e., linked directly to a scientific publication via an SNPL reference. To generalize our findings, we also consider patents connected to scientific publications indirectly via references to other patents. Patents for which the shortest path in the citation network is longer are said to be more distant from the science-technology frontier. Recent studies have used this concept of distance between science and technology and demonstrate that the value of patents monotonically decreases with greater distances from the science frontier (5). In Fig. 4, we consider this dimension and describe the value of patents at different levels of distance from the science-technology frontier. We distinguish patents linked (directly or indirectly) to SNPL references at the top 10% and bottom 10% of quality. We also report the average value of all patents at different distances. Patents linked to more than one patent with SNPL references at the same distance are assigned to the patent with the highest-quality SNPL reference.
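The distance concept and the assignment rule for ties can be sketched as a layer-by-layer breadth-first search over the patent citation network. The function name and data layout are hypothetical, and labeling directly linked patents as distance 1 is an assumption about the counting convention.

```python
from collections import defaultdict


def frontier_distance(cites, snpl_quality):
    """Shortest citation distance to the science frontier and linked quality.

    cites: dict patent -> list of patents it cites
    snpl_quality: dict patent -> quality of its best SNPL reference
                  (only patents with a direct SNPL reference appear here)
    Returns (distance, quality); patents with no path to the frontier are
    absent from both dictionaries.
    """
    # Reverse the citation edges so the search can move from the frontier
    # outward to the patents that cite it.
    cited_by = defaultdict(list)
    for patent, refs in cites.items():
        for ref in refs:
            cited_by[ref].append(patent)

    distance = {p: 1 for p in snpl_quality}  # assumed convention: direct = 1
    quality = dict(snpl_quality)
    layer = list(snpl_quality)
    d = 1
    while layer:
        d += 1
        reached = {}
        for source in layer:
            for patent in cited_by[source]:
                if patent in distance:
                    continue  # already reached on a shorter path
                # Among links at the same distance, keep the highest quality.
                best = reached.get(patent, quality[source])
                reached[patent] = max(best, quality[source])
        for patent, q in reached.items():
            distance[patent] = d
            quality[patent] = q
        layer = list(reached)
    return distance, quality


# Hypothetical toy network: P3 cites two frontier patents, P4 cites P3.
cites = {"P1": [], "P2": [], "P3": ["P1", "P2"], "P4": ["P3"], "P5": []}
distance, linked_quality = frontier_distance(cites, {"P1": 2, "P2": 50})
```

Processing one distance layer at a time ensures that a patent's linked quality is taken only from the patents on its shortest paths, not from higher-quality science that is reachable only through longer paths.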