Published at MetaROR

May 5, 2026


Cite this article as:

Syed, M., Armstrong, C. H., Chan, E. J., & Person, A. (2024). Evaluating the claim that preregistration and registered reports restrict exploratory research. Lifecycle Journal, 1. https://doi.org/10.71240/lcyc.227818

Evaluating the Claim that Preregistration and Registered Reports Restrict Exploratory Research

Moin Syed1, Caroline H. Armstrong1, Emily J. Chan1, Abby Person1

1 Department of Psychology, University of Minnesota, Minnesota, USA

Originally published on December 12, 2025 at: 

Abstract

A persistent concern about the implementation of preregistration and Registered Reports in psychology is that doing so would reduce the frequency and value of exploratory research, and therefore restrict creativity, serendipity, and discovery. As we are nearly 15 years on from the initial proposal to adopt registration in psychology, it seems time to formally examine whether these concerns have any merit. The purpose of the present study is to find out. In this proposed project, we will examine a matched set of Registered Reports, preregistered articles, and traditional articles (total N = 300) for the frequency of reported exploratory research, defined as any analysis that is indicated to not be a planned test of a specific hypothesis. This project will provide robust data that can serve as an empirical basis for future discussions about the impact of registration on exploratory research.

Psychology’s replication crisis has led to a host of new policies and practices meant to improve the state of our science (Munafò, 2017). A practice that is central to the reform movement is preregistration, which involves creating an accessible and unalterable record of a study plan prior to collecting data and/or conducting analyses (Nosek et al., 2018). Rooted in the clinical trial registry previously developed for medical research (Boutron et al., 2016), preregistration is a practice that supports transparency about researchers’ plans and facilitates assessment of how well those plans correspond to what researchers actually report (Claesen et al., 2021; Willroth & Atherton, 2024). An early and persistent concern about widescale implementation of preregistration in psychology is that doing so would reduce the frequency and value of exploratory work, and therefore restrict creativity, serendipity, and discovery (Goldin-Meadow, 2016; Scott, 2013). As we are nearly 15 years on from the initial proposal to adopt preregistration in psychology, it seems time to formally examine whether these concerns have any merit. The purpose of the present study is to find out.

Fallout from the Confirmatory/Exploratory Divide

The function that preregistration is meant to serve is not always clear or consistent (Lakens, 2019). Resolving debates about the function of preregistration is outside of the scope of the current project, as are debates about the utility of preregistration (see Lakens, 2019; Syed, 2024; Szollosi et al., 2020 for discussion of these issues). An early prominent argument for preregistration is that it allows for a clearer separation between confirmatory analyses, which test hypotheses generated prior to the study, and exploratory analyses, for which there were no prior hypotheses (Wagenmakers et al., 2012). This framing was in direct response to concerns about questionable research practices (or p-hacking, researcher degrees of freedom, garden of forking paths), in which researchers report only their statistically significant results and fail to disclose the other analyses that were conducted, as well as HARKing (hypothesizing after results are known; Kerr, 1998), in which researchers reframe exploratory findings as though they were confirmatory.

The distinction between confirmatory and exploratory is not always so clear in practice (Szollosi & Donkin, 2021; Wagenmakers et al., 2012). Part of the problem is that there is not a clear or consistent definition of what constitutes exploratory research. Indeed, this is an active area of discussion (Feest & Devezer, 2025), a point we are setting aside due to our focus on whether researchers claim they are doing confirmatory or exploratory work in the context of preregistration. For our purposes, we define exploratory as “any analysis that is indicated to not be a planned test of a specific hypothesis,” which is how the term is generally used in the debate about preregistration. The lack of clarity of terms is related to the fact that confirmatory/exploratory forms a continuum of potential data analytic approaches (Wagenmakers et al., 2012). Accordingly, many have moved away from emphasizing the distinction in the context of preregistration in favor of a distinction between data-independent and data-dependent decision-making (e.g., Srivastava, 2018). Nevertheless, the confirmatory/exploratory distinction was prominent as preregistration was gaining momentum. Despite proponents of preregistration arguing to the contrary (Chambers, 2013; Wagenmakers et al., 2012), concerns were quickly expressed that preregistration would both restrict and devalue exploratory research. The message about the utility of preregistration was seemingly taken by many to mean that only confirmatory research via preregistration was research worth doing.

A similar concern has been expressed about Registered Reports (see Besançon et al., 2021). Registered Reports are a more restrictive version of preregistration that embeds the practice into the process of peer review. The review process for Registered Reports is separated into two distinct sections: Stage 1, consisting of the Introduction, Method, and Planned Analyses, which are prepared prior to executing the study; and Stage 2, consisting of the previous sections plus the Results and Discussion, submitted for review following completion of the study. Review platforms1 make acceptance decisions based on the Stage 1 manuscript and review process, addressing the question: will this study produce worthwhile knowledge, regardless of the results? If the answer is yes, then the review platform agrees to accept the full Stage 2 once complete, regardless of the results. Registered Reports are intended to address the problem of publication bias, or the fact that the published literature does not adequately represent the universe of studies completed, but rather studies have been selected via a biased process (most typically in favor of statistically significant findings). Addressing the problem of publication bias reduces the motivation to engage in questionable research practices, HARKing, and other unscientific behaviors (Chambers, 2013).

Registered Reports make a very clear distinction between data-independent and data-dependent decision-making, which loosely align with confirmatory and exploratory analyses, because anything not specified as hypothesized in the Stage 1 manuscript is considered to be exploratory. Importantly, there is absolutely no restriction on including exploratory analyses within Registered Reports. Researchers can include planned exploratory analyses in their Stage 1 proposal, and are free to add whatever they wish at Stage 2, within reason, so long as they are clearly labeled as exploratory and are not given undue weight when interpreting the full findings of the study (Chambers & Tzavella, 2022).

Nevertheless, Registered Reports are confirmatory-focused, and indeed use language and procedures that are much more aligned with quantitative hypothesis-testing approaches compared to other approaches. That said, Registered Reports can and have been used with a wide variety of research designs, including qualitative studies (Karhulahti et al., 2023), secondary data (Davis-Kean et al., 2024), meta-analyses (Kathawalla & Syed, 2021), and many more. Concerns about the heavy focus on confirmatory approaches led to the development of a companion format, Exploratory Reports, wherein researchers largely eschew inferential statistics in favor of extensive descriptive statistics and data visualizations (McIntosh, 2017). Although they were introduced many years ago, Exploratory Reports have been taken up by few review platforms and are seldom used.

Taken together, both preregistration and Registered Reports seek to improve the transparency and clarity with which researchers describe the epistemic status of their work. The intention is not to favor one form of inquiry over another, although we recognize that in practice this could very well be what happens. Rather, the intention is to support clearer delineation of confirmatory/planned and exploratory/unplanned analyses via a dedicated subsection of the Results section or clear and specific language that denotes an analysis as exploratory. Despite many proponents of preregistration being clear about this intention (Wagenmakers et al., 2012), concerns persist (McDermott, 2022). As an ostensibly empirical science, however, at some point concerns need to be adjudicated via data.

The only relevant data of which we are aware comes from a recent Ph.D. thesis from O’Mahony (2023). Embedded within a much larger examination of similarities and differences between Registered Reports (n = 170) and traditional articles (n = 340), the findings indicated that exploratory research was in fact more commonly identified in the Registered Reports (75%) compared with the traditional articles (57%). This finding runs counter to the many critics claiming that preregistration would reduce the frequency of exploratory research.

The Present Study

The purpose of the present study is to conduct an initial investigation into whether preregistration and Registered Reports restrict exploratory research, or any analysis that is indicated to not be a planned test of a specific hypothesis. We do so by comparing the prevalence of exploratory research across articles with preregistered studies, Registered Reports, and traditional articles (henceforth, we use the term “registration” to refer to both preregistered studies and Registered Reports). The purpose of the study is not to get a population estimate of exploratory research in psychology, but rather to examine the specific claims that registration would lead to a decrease in exploratory research. The competing views on the practice of registration and its relation to exploratory research would lead to two divergent data patterns:

  1. If registration restricts exploratory research, then traditional articles should include greater reports of exploratory research than registered studies (Figure 1a).

  2. If registration clarifies the status of exploratory research, then, consistent with O’Mahony (2023), registered studies should include greater reports of exploratory research than traditional articles (Figure 1b).

The primary goal of the current project is to determine which of the aforementioned two data patterns is most consistent with the data. Our hypothesis is that the data will be more consistent with pattern 2, that is, we anticipate finding a greater frequency of exploratory research in registered articles compared to traditional articles.

Importantly, the current study will not be able to actually assess the frequency of exploratory research, per se, but rather the frequency with which exploratory research is reported and transparently described. It is for this reason that we hypothesize that registered articles will report more exploratory research than traditional articles. This line of argument also suggests that there could be differences between preregistered articles and Registered Reports. Several studies have reported that researchers often do not adhere to their preregistration plans nor do they transparently disclose deviations from their plans (Claesen et al., 2021; van den Akker et al., 2023; Willroth & Atherton, 2024). Thus, an article could be preregistered but essentially indistinguishable from a traditional article in its reporting. Registered Reports, however, are much more regulated given how the review process operates, and should thus have clear and transparent reporting. Accordingly, we hypothesize that there will be a greater frequency of exploratory research reported in Registered Reports compared with preregistered and traditional articles.

We additionally take on some exploratory analyses, for which we do not seek to test any specific hypothesis, to examine whether reporting practices have changed over time. Although exploratory, our interest in this question is sparked by several possibilities that we can imagine. Two speculations about change over time include:

  1. Because of the heightened interest in transparent reporting over the last 15 years, we may see an increase in traditional articles labeling analyses as exploratory (Figure 1c). Including an analysis of traditional articles published prior to the implementation of registration in psychology would address this point.

  2. The early days of registration in psychology may have been associated with less than ideal implementation by authors, reviewers, and editors, due to the fact that it was a new and unfamiliar practice. Accordingly, some may have had more rigid views about what was permissible to include in registered studies. Thus, there could be less exploratory research in registered studies when the practices were initially adopted but have subsequently increased over time (Figure 1d).


    Figure 1. Four hypothetical data patterns.

Method

Article Selection Process

The corpus of articles for analysis will consist of three types of articles, defined as follows:

Articles reporting a preregistered study (“preregistered articles”): An empirical article that includes at least one study, or set of analyses, that a) is stated to be preregistered and b) includes an accessible link to the registration. For multi-study articles, only one of the reported studies needs to be preregistered for the article to be included in this category.

Articles reporting a Registered Report (“Registered Reports”): An empirical article that includes at least one study that a) is stated, or clearly implied, to be a Registered Report and b) has an accepted Stage 1 manuscript that can be located (either via a link included in the published Stage 2 manuscript or via manual search). For multi-study articles, only one of the reported studies needs to be a Registered Report for the article to be included in this category.

Articles with no reported registrations (“traditional articles”): An empirical article that includes no statements or link regarding any form of registration. This group of articles will consist of two sub-groups: a group of articles that is matched to the target registered articles and a second group of articles published in 2010, prior to the adoption of registration in psychology. The year 2010 was selected because it immediately predates major awareness of the replication crisis in psychology, and registration had not yet been introduced to psychology.

For all types of articles, any type of empirical study will be included, whether quantitative, qualitative, mixed methods, or meta-analyses.

Most of the meta-scientific research on preregistration has relied on one of two datasets: articles that were 1) part of the Center for Open Science Preregistration Challenge or 2) included in a public database listing articles that received a preregistration badge. Both are limited: the Preregistration Challenge was a unique experience with specific guidelines and a review process by the Center for Open Science. The badge database includes articles published only up to 2020. Neither is specific to psychology.

The reason these limited resources have been used is that identifying a proper set of articles is no easy task. We adopt an approach that we believe is well-suited to our research questions.

We will start with journals that are documented to publish Registered Reports. These journals are a good starting point because publishing Registered Reports is a strong signal that the journal values registration for empirical studies. Accordingly, if the criticism that registration restricts exploratory work is true, it should be most evident in the articles published in these journals.

Liu et al. (2025) recently published a metascientific study of Registered Reports in psychology that serves as a useful starting point. Through their search process, they identified 47 journals that publish Registered Reports and were classified as psychology journals by the 2023 Clarivate Journal Citation Report (see list at https://osf.io/jwaf9). This is not an exhaustive list, as some journals that publish the format were not included2. This is not a problem for the current study, as we are not seeking to generate population estimates of how often exploratory work is published in the literature, but rather to compare the relative prevalence of exploratory work across different article types.

Within these journals, they identified 237 Registered Reports published up to April 1, 2023, with the first published Registered Report appearing in Cortex on February 13, 20153. Liu et al. (2025) excluded replication studies from their set, as well as articles for which they could not identify a match and one article that had been retracted, bringing the total for analysis to 119. We also chose to exclude replication studies from the current project, as they have a different function from original research and thus could show disparate patterns.

The articles identified by Liu et al. (2025) will be our initial set (see list at https://osf.io/xmdv7), which will be augmented by a manual search of the table of contents of the 47 journals for additional Registered Reports published between April 1, 2023 and April 1, 2026. Any additional Registered Reports identified will be added to the Liu et al. set. All articles included in this set will be examined by the authorship team to ensure they meet the criteria for a Registered Report given previously.

From this final set, which should consist of at least 200 articles, we will randomly select 75 articles for further analysis. For these 75 Registered Reports, we will then locate companion articles: 75 preregistered articles and 150 traditional articles (75 matched companions and 75 published in 2010). The total sample of articles will be 300 across the four types, which is informed by our power analysis (described in the subsequent section).

Companion articles will be selected based on the following criteria, modified from the criteria used by O’Mahony (2023).

Journal – the companion articles must be selected from the same journal as the Registered Report.

Timeframe – The companion articles should be selected from the same issue as the Registered Report. If no companion article is available in the same issue, we will examine the prior issue. If none is available in the prior issue, we will examine the subsequent issue. We will continue this process until a companion is identified. Timeframe of the selected article will be coded as 1 – same issue, 2 – within two issues on either side, 3 – within four issues on either side, or 4 – five or more issues apart.

Topic – The companion articles should generally address the same substantive topic as the Registered Report, broadly conceived. Topic will be coded as 1 – very similar (e.g., both focused on factors related to depression), 2 – somewhat similar (e.g., both focused on psychopathology), or 3 – not similar (e.g., both are related to clinical psychology). Ideally, all included articles would have a rating of 1 or 2. However, Topic is of secondary importance to Timeframe, so a rating of 3 for Topic will be acceptable if it allows Timeframe to remain 2 or lower.

Design – The companion articles should generally use the same design as the Registered Report, for example experimental, intervention, observational, qualitative, or meta-analysis. Design will be coded as 1 – very similar, 2 – somewhat similar, or 3 – not similar. Ideally, all included articles would have a rating of 1 or 2. However, Design is of secondary importance to Timeframe, so a rating of 3 for Design will be acceptable if it allows Timeframe to remain 2 or lower.

The additional group of 75 traditional articles published in 2010 will be selected from the same journals as the Registered Reports. Starting with the first issue published in 2010, we will identify an empirical article that matches the previously selected companion traditional articles based on Topic and Design, using the same criteria described above. The selected articles must have a score of 2 or better (i.e., 1 or 2) for both Topic and Design to be included.
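As an illustration of how the ordinal matching codes above might be operationalized during screening, here is a hypothetical Python helper for the Timeframe criterion (the function name is ours, and the treatment of a distance of exactly five issues is our assumption, since the stated categories do not cover it):

```python
def timeframe_code(issue_distance: int) -> int:
    """Map the absolute distance (in journal issues) between a companion
    article and its Registered Report to the ordinal Timeframe code.

    Assumption: a distance of exactly five issues is assigned code 4,
    since the stated categories leave that boundary case unaddressed.
    """
    if issue_distance == 0:
        return 1  # same issue
    if issue_distance <= 2:
        return 2  # within two issues on either side
    if issue_distance <= 4:
        return 3  # within four issues on either side
    return 4      # five or more issues apart

print([timeframe_code(d) for d in (0, 1, 3, 6)])  # -> [1, 2, 3, 4]
```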

Power Analysis

Full details of the power analysis are available at https://osf.io/n9dr6; here, we provide a summary. Arguments in the literature that registration will restrict exploratory research have often used extreme language, claiming that it will “put researchers in chains” (Scott, 2013) and “stifle discovery” (Goldin-Meadow, 2016), and that it serves as a “stranglehold” on research (McDermott, 2022). The severity of this language clearly implies a large effect: a dramatically lower rate of exploratory work in registered compared to non-registered studies. The only available evidence, however, indicates a moderate difference in the other direction, with a greater prevalence of exploratory research in Registered Reports compared with standard reports (O’Mahony, 2023). The effect size in O’Mahony (2023) is not reported, but we can use the reported test statistic, χ2(1, N = 510) = 16.19, p < .001, to calculate it as φ = .18. This corresponds to a difference in prevalence of approximately 20 percentage points.
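The conversion from the reported test statistic to the effect size follows directly from φ = √(χ²/N); a minimal Python check (illustrative only, not the authors' analysis code):

```python
import math

def phi_from_chisq(chisq: float, n: int) -> float:
    """Convert a chi-square statistic from a 2x2 table to the phi coefficient."""
    return math.sqrt(chisq / n)

# O'Mahony (2023): chi-square(1, N = 510) = 16.19
phi = phi_from_chisq(16.19, 510)
print(round(phi, 2))  # -> 0.18
```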

It is difficult to say what effect size is the smallest that would be meaningful. Our goal in this project, however, is to address the extreme claims about the damaging impact of registration. Thus, absent any other evidence, we would primarily be interested in large effects. O’Mahony (2023) provides evidence of a moderate effect, albeit in the other direction. That this effect was found in a registered thesis using a large sample of articles suggests that there is low risk of bias. Thus, we used φ = .18 as our estimate for the present study. The pwr.chisq.test function from the pwr package in R (Champely et al., 2020) indicated a necessary total sample size of 304 for a 2×2 contingency table (registered vs. not registered), with alpha = .05 and power = .80. That assumes that we are collapsing the two registered article types together and the two traditional article types together and comparing them against each other. If, instead, we compared all four article types, but assumed that the pattern of exploratory research would be the same across the two broad categories, then that would yield an effect size of Cramer’s V = 0.21 and a suggested sample size of 216. Relaxing these assumptions in different ways leads to a range of effect sizes from Cramer’s V = 0.18–0.28 and a range of sample sizes from N = 120–306. Taken together, the power analysis indicates a total sample of 300 articles will be sufficient for the aims of the study.

Article Coding

Coding will be done at the article-level rather than individual studies within articles. That is, assessment of our target categories described below will be made based on their appearance in any aspect of the article and not related to a specific study. An argument against this approach may be that some articles consist of a mix of registered and non-registered studies, and therefore we should only code the registered studies for the presence of exploratory analyses. We do not take this approach because, if at the article level researchers used a mix of registered and non-registered studies in order to include both confirmatory and exploratory analyses, then this approach would be entirely consistent with the argument of registration advocates that registration does not inhibit exploratory research. Thus, coding at the article level provides the clearest indication of the use of exploratory research in the context of registration.

Importantly, in the current project we remain agnostic about the quality of the registrations themselves. There have been several studies indicating that registrations lack specificity and that researchers often engage in undisclosed deviations from their registrations (Claesen et al., 2021; van den Akker et al., 2023). These are important issues but not our concern here, as we are focused on the implications of the practice of registration regardless of how that practice is implemented.

Each article will be coded for the following:

Statements of Exploratory Research

One of the limits of the O’Mahony (2023) study is that it did not include a clear definition of what constitutes “exploratory.” In the current project, the presence of exploratory research will be coded in accordance with our definition, “any analysis that is indicated to not be a planned test of a specific hypothesis.” We will code for any explicit or clearly implied mention of exploratory tests, exploratory analyses, unplanned analyses, or speculative analyses. “Clearly implied” is a subjective criterion but is included here for cases where raters can determine that the analyses were exploratory absent an explicit statement. For example, if researchers state a research question, but indicate that they made no hypotheses about that question, then that would be considered exploratory even if they did not explicitly label it as such. Additionally, any tests that are labeled as robustness tests, sensitivity tests, or tests of alternative explanations, and are explicitly or clearly implied to be exploratory or unplanned, will be coded as exploratory research.

Importantly, not all uses of the word “explore” or “exploratory” necessarily indicate exploratory research. For example, following a hypothesized statistically significant interaction effect, authors may state that they “explored” or “probed” the interaction via simple slopes analysis. That kind of follow-up test is directly related to the hypothesis about the interaction and is conducted to gain greater clarity on the specific nature of the interaction. It also reflects the broader problem in psychology with the lack of specificity when proposing hypotheses about interaction effects (Baranger et al., 2023). An additional case that does not necessarily meet the criteria for exploratory research is the use of “exploratory factor analysis” as an analytic technique. This type of analysis is sometimes used in an exploratory way that is consistent with our definition, and sometimes used to test confirmatory hypotheses (despite its name). In both of these cases, the type of analysis could reflect exploratory research by our definition, but it will depend on the specific use in that study. Thus, the context will be taken into account when coding rather than simply examining the articles for keywords.

Three sections of each article will be inspected and coded for the presence of exploratory research: 1) the end of the Introduction section, where the researchers discuss the aims of the study; 2) the Data Analysis Plan/Results4 section(s); and 3) the Discussion section. For each section, raters will code for the presence or absence of mentions of exploratory research. Although O’Mahony (2023) also included an “uncertain” category, any article that would potentially be coded as uncertain would not meet our coding criteria and thus should be coded as absent.

Section Headers Indicating Exploratory Research

Raters will code presence/absence for the inclusion of section headers that explicitly indicate exploratory analyses. The full articles will be examined, but these are most likely to appear in the Results section when present.

Coding Difficulty

Raters will code for how difficult it was to determine their rating, which can also serve as a useful indicator of their confidence in their ratings. We will use the system from O’Mahony (2023): 1 = Very easy, 2 = Somewhat easy, 3 = Somewhat difficult, and 4 = Very difficult. Raters will provide a global rating for difficulty, rather than one for each section.

Rater Training and Reliability

The raters for this project will consist of the four authors, a professor of psychology and three doctoral students in social psychology. We conducted a pilot coding process of six articles based on the preliminary coding categories. This pilot process indicated that the coding system was sufficiently clear and applicable, and led to some minor refinements in the definition and scope of the categories.

Formal training and establishing reliability will be done following the review process of this proposal, given that the coding system may change. The training phase will use articles that are not part of the focal set but share similar characteristics (i.e., articles of the three types that were published in psychology journals across the same time range). We will draw from the following journals that were not included in the focal set: Assessment, Clinical Psychological Science, Journal of Personality, PLoS ONE, and Social Psychological Bulletin.

The training phase will consist of four steps:

Step 1 will be for the rating team to become familiar with the coding system and how to apply it to the articles. We will independently code two articles during a project meeting and then have a discussion about our coding system.

Step 2 will involve the raters independently coding a set of five articles. This coding will be subject to reliability analysis using average pairwise percent agreement and Fleiss’s Kappa to quantify the degree of agreement. The raters will then have a meeting in which they discuss the disagreements.

Step 3 will be a repeat of Step 2 but with 10 new articles. Once again, average pairwise percent agreement and Fleiss’s Kappa will be used to quantify the degree of agreement. If percent agreement exceeds .80 and Kappa exceeds .60, then the training phase will be complete. If not, then we will continue the training phase with a new set of 10 articles until these thresholds are reached.
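Both reliability indices can be computed without specialized software. Below is a self-contained Python sketch using hypothetical binary present/absent codes (the toy data and function names are ours, for illustration only):

```python
from itertools import combinations

def pairwise_percent_agreement(ratings):
    """Average, across all rater pairs, of the proportion of articles on
    which the two raters assigned the same code. `ratings` is a list of
    articles, each a list with one code per rater."""
    n_raters = len(ratings[0])
    pairs = list(combinations(range(n_raters), 2))
    per_pair = [
        sum(article[i] == article[j] for article in ratings) / len(ratings)
        for i, j in pairs
    ]
    return sum(per_pair) / len(pairs)

def fleiss_kappa(ratings, categories):
    """Fleiss's kappa for multiple raters assigning nominal codes."""
    n = len(ratings[0])   # raters per article
    N = len(ratings)      # number of articles
    # counts[i][j]: how many raters placed article i in category j
    counts = [[article.count(c) for c in categories] for article in ratings]
    p_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    p_bar = sum(p_i) / N                       # observed agreement
    p_j = [sum(row[j] for row in counts) / (N * n)
           for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)              # chance agreement
    return (p_bar - p_e) / (1 - p_e)

# Toy data: 4 raters code 3 articles for exploratory research (1 = present)
ratings = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
print(round(pairwise_percent_agreement(ratings), 3))  # -> 0.778
print(round(fleiss_kappa(ratings, [0, 1]), 3))        # -> 0.556
```

In practice a package implementation (e.g., statsmodels' fleiss_kappa in Python, or the irr package in R) would serve the same purpose; the sketch simply makes the two thresholds (.80 agreement, .60 kappa) concrete.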

The focal set of 300 articles will be coded by the four raters, arranged in two teams of two raters each. Each team will code 100 unique articles, with the remaining 100 articles being coded by all four raters. The coding will proceed across 8 weeks, with weekly batches alternating between 25 of the common articles and 50 of the unique articles. Percent agreement will be assessed for each batch, and the raters for that batch will meet to resolve discrepancies via discussion. All data and code will be made openly available at https://osf.io/4vbtq/.

Planned Analysis

Hypothesis Tests

To test our first hypothesis, that there will be a greater frequency of exploratory research in registered articles compared to traditional articles, we will take an average of the exploratory research ratings (across Introduction, Results, and Discussion sections) and compare those averages between registered and traditional articles. This will be done using a 2×2 chi-square test of independence (alpha = .05).

A secondary test of the hypothesis will compare whether or not there is an explicit header for exploratory research, again comparing registered and traditional articles using a 2×2 chi-square test of independence (alpha = .05).

Exploratory follow-up tests will examine variations in the specific sections (e.g., Introduction vs. Results sections), but no inferential tests will be used and no p-values will be reported for these analyses.

To test the second hypothesis, that there will be a greater frequency of exploratory research reported in Registered Reports compared with preregistered and traditional articles, we will conduct a 3×3 chi-square test of independence (alpha = .05) comparing the average exploratory research rating across Registered Reports, preregistered studies, and traditional articles. Follow-up analyses will examine the cell-wise adjusted standardized residuals, with inference about deviations based on exceeding the threshold of ±1.96.

A secondary test of the hypothesis will compare the presence or absence of an explicit header for exploratory research across Registered Reports, preregistered studies, and traditional articles, again using a 3×3 chi-square test of independence (alpha = .05). Follow-up analyses will examine the cell-wise adjusted standardized residuals, with inference about deviations based on exceeding the threshold of +/- 1.96.

Exploratory follow-up tests will examine variations in the specific sections (e.g., Introduction vs. Results sections), but no inferential tests will be used and no p-values will be reported for these analyses.
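The test logic for the planned analyses, including the adjusted standardized residuals used for cell-wise follow-up, can be sketched as follows. This is an illustrative sketch only; the example contingency table is hypothetical and not drawn from the study:

```python
from math import sqrt

# critical value of chi-square at alpha = .05 for df = 1 (2x2 table)
CHI2_CRIT_DF1 = 3.841

def chi_square_and_residuals(table):
    """Chi-square statistic of independence and adjusted standardized
    residuals for a contingency table of observed counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    # expected counts under independence
    expected = [[r * c / n for c in col_tot] for r in row_tot]
    chi2 = sum((o - e) ** 2 / e
               for o_row, e_row in zip(table, expected)
               for o, e in zip(o_row, e_row))
    # adjusted standardized residuals; |residual| > 1.96 flags a cell
    # deviating from independence at roughly the .05 level
    resid = [[(table[i][j] - expected[i][j])
              / sqrt(expected[i][j]
                     * (1 - row_tot[i] / n)
                     * (1 - col_tot[j] / n))
              for j in range(len(col_tot))]
             for i in range(len(row_tot))]
    return chi2, resid
```

For a hypothetical 2×2 table [[30, 70], [50, 50]] (article type × presence of exploratory reporting), the statistic is about 8.33, exceeding the df = 1 critical value of 3.841, and all four cell residuals are about +/- 2.89, past the +/- 1.96 threshold.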

Exploratory Analyses

We will conduct extensive exploratory analyses of the data, focusing on generating plots and reporting effect sizes. No inferential statistics will be used and no p-values will be reported. One set of planned exploratory analyses is to examine change over time in reports of exploratory research, separated by article type. We will additionally examine variations in coding difficulty by article type, and whether this has changed over time.

References

Baranger, D. A. A., Finsaas, M. C., Goldstein, B. L., Vize, C. E., Lynam, D. R., & Olino, T. M. (2023). Tutorial: Power Analyses for Interaction Effects in Cross-Sectional Regressions. Advances in Methods and Practices in Psychological Science, 6(3), 25152459231187531. https://doi.org/10.1177/25152459231187531

Besançon, L., Bezerianos, A., Dragicevic, P., Isenberg, P., & Jansen, Y. (2021). Publishing Visualization Studies as Registered Reports: Expected Benefits and Researchers’ Attitudes. https://doi.org/10.31219/osf.io/3z7kx

Bottesini, J. G., Chambers, C. D., Howard, A., Lucas, R. E., Moore, D. A., Sbarra, D., Tackett, J. L., & Syed, M. (2025). Peer Community in Psychology: A platform for peer review of preprints across psychology. PsyArXiv. https://doi.org/10.31234/osf.io/m456e_v1

Boutron, I., Dechartres, A., Baron, G., Li, J., & Ravaud, P. (2016). Sharing of Data From Industry-Funded Registered Clinical Trials. JAMA, 315(24), 2729–2730. https://doi.org/10.1001/jama.2016.6310

Chambers, C. D. (2013). Registered Reports: A new publishing initiative at Cortex. Cortex, 49(3), 609–610. https://doi.org/10.1016/j.cortex.2012.12.016

Chambers, C. D., & Tzavella, L. (2022). The past, present and future of Registered Reports. Nature Human Behaviour, 6, 29–42. https://doi.org/10.1038/s41562-021-01193-7

Champely, S., Ekstrom, C., Dalgaard, P., Gill, J., Weibelzahl, S., Anandkumar, A., Ford, C., Volcic, R., & Rosario, H. D. (2020). pwr: Basic Functions for Power Analysis (Version 1.3-0) [Computer software]. https://cran.r-project.org/web/packages/pwr/

Claesen, A., Gomes, S., Tuerlinckx, F., & Vanpaemel, W. (2021). Comparing dream to reality: An assessment of adherence of the first generation of preregistered studies. Royal Society Open Science, 8(10), 211037. https://doi.org/10.1098/rsos.211037

Davis-Kean, P. E., Ellis, A., & Syed, M. (2024). Registered Reports with secondary developmental data: Introduction to the special issue. Infant and Child Development, 33(2), e2506. https://doi.org/10.1002/icd.2506

Feest, U., & Devezer, B. (2025, January). Toward a More Accurate Notion of Exploratory Research (And Why it Matters) [Preprint]. https://philsci-archive.pitt.edu/24482/

Goldin-Meadow, S. (2016). Why Preregistration Makes Me Nervous. APS Observer, 29. https://www.psychologicalscience.org/observer/why-preregistration-makes-me-nervous

Karhulahti, V.-M., Branney, P., Siutila, M., & Syed, M. (2023). A primer for choosing, designing and evaluating registered reports for qualitative methods. Open Research Europe, 3, 22. https://doi.org/10.12688/openreseurope.15532.2

Kathawalla, U.-K., & Syed, M. (2021). Discrimination, Life Stress, and Mental Health Among Muslims: A Preregistered Systematic Review and Meta-Analysis. Collabra: Psychology, 7(1), 28248. https://doi.org/10.1525/collabra.28248

Kerr, N. L. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2(3), 196–217. https://doi.org/10.1207/s15327957pspr0203_4

Lakens, D. (2019). The Value of Preregistration for Psychological Science: A Conceptual Analysis [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/jbh4w

Liu, Z., Wang, X. T. (XiaoTian), Wang, Z., Yan, W., & Hu, M. (2025). Registered reports in psychology across scholarly citations and public dissemination: A comparative metaevaluation of more than a decade of practice. American Psychologist. https://doi.org/10.1037/amp0001503

McDermott, R. (2022). Breaking free: How preregistration hurts scholars and science. Politics and the Life Sciences, 41(1), 55–59. https://doi.org/10.1017/pls.2022.4

McIntosh, R. D. (2017). Exploratory reports: A new article type for Cortex. Cortex, 96, A1–A4. https://doi.org/10.1016/j.cortex.2017.07.014

MetaROR. (n.d.). About MetaROR. MetaROR. Retrieved November 28, 2025, from https://metaror.org/about-metaror/

Munafò, M. R. (2017). Improving the Efficiency of Grant and Journal Peer Review: Registered Reports Funding. Nicotine & Tobacco Research, 19(7), 773–773. https://doi.org/10.1093/ntr/ntx081

Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606. https://doi.org/10.1073/pnas.1708274114

Nosek, B. A., & Lakens, D. (2014). Registered reports: A method to increase the credibility of published results. Social Psychology, 45(3), 137–141. https://doi.org/10.1027/1864-9335/a000192

O’Mahony, A. (2023). Comparative Analysis of Registered Reports and Standard Research Literature final copy [Dissertation, Cardiff University]. https://orca.cardiff.ac.uk/id/eprint/167686

Sassenhagen, J., & Bornkessel-Schlesewsky, I. (2015). The P600 as a correlate of ventral attention network reorientation. Cortex, 66, A3–A20. https://doi.org/10.1016/j.cortex.2014.12.019

Scott, S. (2013, July 25). Pre-registration would put science in chains. Times Higher Education (THE). https://www.timeshighereducation.com/comment/opinion/pre-registration-would-put-science-in-chains/2005954.article

Srivastava, S. (2018). Sound Inference in Complicated Research: A Multi-Strategy Approach [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/bwr48

Syed, M. (2024). Three Persistent Myths about Open Science. Journal of Trial and Error. https://doi.org/10.36850/mr11

Szollosi, A., & Donkin, C. (2021). Arrested Theory Development: The Misguided Distinction Between Exploratory and Confirmatory Research. Perspectives on Psychological Science, 1745691620966796. https://doi.org/10.1177/1745691620966796

Szollosi, A., Kellen, D., Navarro, D. J., Shiffrin, R., van Rooij, I., Van Zandt, T., & Donkin, C. (2020). Is Preregistration Worthwhile? Trends in Cognitive Sciences, 24(2), 94–95. https://doi.org/10.1016/j.tics.2019.11.009

van den Akker, O. R., van Assen, M. A. L. M., Enting, M., de Jonge, M., Ong, H. H., Rüffer, F., Schoenmakers, M., Stoevenbelt, A. H., Wicherts, J. M., & Bakker, M. (2023). Selective Hypothesis Reporting in Psychology: Comparing Preregistrations and Corresponding Publications. Advances in Methods and Practices in Psychological Science, 6(3), 25152459231187988. https://doi.org/10.1177/25152459231187988

Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An Agenda for Purely Confirmatory Research. Perspectives on Psychological Science, 7(6), 632–638. https://doi.org/10.1177/1745691612463078

Willroth, E. C., & Atherton, O. E. (2024). Best laid plans: A guide to reporting preregistration deviations. Advances in Methods and Practices in Psychological Science, 7(1), 25152459231213802. https://doi.org/10.1177/25152459231213802

Notes

[1] We use the term “review platforms” instead of the more common “journals” in recognition of the growing prominence of journal-independent review platforms, such as PCI Psychology (Bottesini et al., 2025) and MetaROR (MetaROR, n.d.). Thus, “review platforms” is a more inclusive term for the current landscape of scientific dissemination.
[2] e.g., Comprehensive Results in Social Psychology, likely excluded at least in part because it only publishes Registered Reports, and thus no companion traditional article could be included for comparison.
[3] For accuracy, it is important to note that Sassenhagen and Bornkessel-Schlesewsky (2015) is the first Registered Report published in Cortex, a journal that pioneered the format, but that a special issue of the journal Social Psychology published in 2014 (Nosek & Lakens, 2014), one year earlier, consisted of 15 Registered Reports that were all replications (and thus were excluded from Liu et al., 2025).
[4] Not all articles will include a Data Analysis Plan section, and some may include it as a subsection of the Results section, and so for that reason we are treating them as the same unit.

Declarations

Competing Interests

The authors declare that no conflicts of interest exist.

Funding

No specific funding was received in support of this research.

Author Contributions

MS: Conceptualization, Data curation, Formal analysis, Methodology, Project administration, Supervision, Writing - original draft, Writing - review & editing; CA: Data curation, Formal analysis, Writing - review & editing; EC: Data curation, Formal analysis, Writing - review & editing; AP: Data curation, Formal analysis, Writing - review & editing.

Editors

Kathryn Zeiler
Editor-in-Chief

Olmo van den Akker
Handling Editor

Editorial assessment

by Olmo van den Akker

DOI: 10.70744/MetaROR.312.1.ea

I much appreciated the detailed explanations for the decisions made in the design, which made it much easier to arrive at an informed assessment. That level of detail also seems to have helped the reviewers provide the elaborate feedback they did.

Both reviewers agree that the manuscript takes up an important question, but both also think the current proposal does not yet support the claims it wants to make. In particular, the reviewers think that the study can speak to how exploratory analyses are reported or labeled in published papers, but not to whether preregistration and Registered Reports actually reduce exploratory research itself, or diminish creativity, novelty, and serendipity. They also raise related methodological concerns: the definition of exploratory research is still too loose for reliable coding, the coding plan and statistical analyses are not yet sufficiently specified, and some design choices (coding at the article rather than study level, ignoring differences in preregistration quality, and not accounting for journal-specific reporting requirements) make the eventual interpretation of the results difficult.

The reviewers are largely aligned on these central points. However, Reviewer 1 puts more weight on the broader framing of the debate and on the possibility that the study is really picking up journal policy rather than research practice, while Reviewer 2 focuses more on the need for a tighter protocol and a clearer analysis plan. The manuscript is well written, but some of the language in the abstract and Methods section is imprecise, and that lack of precision contributes to the larger conceptual problem.

Recommendations from the editor

The main revision needed is a clearer match between the paper’s framing and what the proposed design can actually show. At the moment, the study seems able to examine the explicit reporting of exploratory analyses but not the actual frequency or value of exploratory research. The abstract and introduction should reflect that. Additionally, it should not be implied that the study will test whether preregistration restricts creativity, discovery, or serendipity unless those outcomes are going to be measured directly.

The manuscript would benefit from an explicit coding protocol that spells out how raters will identify exploratory analyses and what counts as sufficient evidence that an analysis was unplanned. It is difficult to assess whether this important assessment is valid based on the text in the manuscript alone. Both reviewers already indicated some concerns about this validity. Moreover, the coding plan seems to focus mainly on the Introduction, Results/Data Analysis, and Discussion, but in psychology papers, analytic decisions are often not presented in a separate data analysis section but folded into the Methods section. As a practical adjustment, I think it would make sense to also code those parts of the Methods section that describe the analysis plan, even when they are not explicitly labeled as such.

I would also encourage the authors to reconsider the decision to code at the article level rather than the study level, especially in multi-study papers that may mix preregistered and non-preregistered components. That choice currently weakens the inference the paper wants to make. As reviewer #2 suggests, it may even be informative to do a within-paper analysis, comparing preregistered and non-preregistered studies that way.

A final issue, not mentioned by the reviewers, is the sampling frame. Because the study starts from journals known to publish Registered Reports, the final sample may end up drawing disproportionately from journals that are already more committed to transparency reforms. That does not make the design inappropriate, but it does limit how broadly the findings can be interpreted. One possible way to reduce this problem would be to relax the requirement that companion articles come from the same journal as the Registered Report, although I recognize that doing so also has its downsides.

Competing interests: None.

Peer review 1

Alexandra Sarafoglou

DOI: 10.70744/MetaROR.312.1.rv1

Major comments

  • I am not fully convinced by the manuscript’s framing of its contribution to the debate over whether preregistration hinders exploratory research. The authors position the study as informing this broader discussion, but the proposed design seems able to support conclusions only about reporting practices, not about research practice itself. This is important because critiques of preregistration typically concern not just whether exploratory work is conducted (or reported), but whether preregistration constrains creativity and novelty and ultimately hinders the progress of science. Assessing disclosure about exploratory research without also examining these related dimensions therefore seems incomplete.
  • Relatedly, the reverse inference is also problematic: even if non-preregistered articles disclose exploratory analyses, this does not mean that the remaining analyses are genuinely confirmatory. This issue is, at present, not addressed in the manuscript.
  • More broadly, the proposed design appears to conflate research practice with reporting standards. Registered Reports may show a higher frequency of exploratory analyses simply because journals often require such analyses to be reported in a dedicated section, whereas comparable guidance may not exist for regular articles. For the focal sample and the most common journals, it would therefore be important to discuss and, if possible, analyze the relevant reporting guidelines, while taking into account that these policies may have changed over time. For example, the submission guidelines for Psychological Science state for Registered Reports: “It is reasonable that authors may wish to include additional analyses that were not included in the registered submission. For instance, a new analytic approach might become available between IPA and Stage 2 review, or a particularly interesting and unexpected finding may emerge. Such analyses are admissible but must be clearly justified in the text, appropriately caveated, and reported in a separate section of the Results titled ‘Exploratory analyses’.” As far as I can tell, this recommendation does not appear in the same way for regular manuscripts. This raises the question of whether the study measures researcher behavior (bottom-up research practice) or simply adherence to journal policy (top-down reporting requirements). The authors should account for differences across journals and reporting guidelines (also in their statistical model), as any differences across article types will be difficult to interpret otherwise.
  • For the same reason, the manuscript would be more convincing if it also studied the hypothesized consequences of discouraging exploratory research. The Introduction notes the criticism that preregistration may hinder creativity and reduce novelty, and these are precisely the kinds of outcomes assessed by Soderberg et al. (2021) and O’Mahony. The study would therefore be stronger, and better aligned with the concern raised in the Introduction, if it assessed not only whether exploratory analyses are disclosed, but also how novel, creative, rigorous, and transparent the resulting research is.
  • Overall, the paper currently reads as a replication of one of O’Mahony’s hypotheses, with the extension to include regular preregistrations. This should be stated more clearly in the Abstract and Introduction. Although such a replication may well be worthwhile, the present proposal would, in my view, require substantial methodological improvement to justify publication. For instance, O’Mahony distinguished exploratory analyses reported in the paper in relation to the studies actually conducted, whereas the justification offered here for coding at the paper level is not yet convincing. In addition, O’Mahony coded several dimensions that seem at least as important as reporting standards alone, including methodological rigour, novelty, creativity, and transparency. At present, the manuscript appears to treat the presence of reported exploratory analyses as a proxy for these broader constructs without measuring them directly, even though they could in principle be incorporated into the coding scheme. If, instead, the authors are primarily interested in reporting practices across publication formats rather than the substantive quality of the research, then it may be worth broadening the coding scheme to include other reporting-related features as well, such as clarity of hypothesis specification, transparency regarding data, materials, and code, etc. In the latter case, this would require a reframing of the Introduction to reflect this shift in focus.
  • Finally, I do not find the rationale for disregarding preregistration quality convincing. As the authors note, preregistrations can vary substantially in quality, and this variation may plausibly moderate how rigorously exploratory analyses are reported. Treating all preregistrations as equivalent therefore seems difficult to justify, particularly in light of existing evidence that preregistrations differ considerably from one another.

Minor comments:

  • It is currently somewhat unclear how the dependent variable, that is the presence of exploratory research, will be operationalized. O’Mahony used a proportion score based on the exploratory analyses documented in each paper and the proportion of analyses where this distinction was clearly made. My understanding of the present proposal is that each article will simply be coded for whether exploratory research is reported, which would make the outcome variable binary. The authors should clarify whether this interpretation is correct.
  • As written, the claim that “the function that preregistration is meant to serve is not always clear or consistent” seems too broad if it is mainly based on a preprint by Lakens (2019). Much of the literature describes the function of preregistration quite consistently as distinguishing confirmatory from exploratory research. For example, Nosek et al. (2018) state in the abstract that preregistration is “an effective solution” because it “distinguishes analyses and outcomes that result from predictions from those that result from postdictions.” Similarly, Munafò et al. (2017) write that preregistration “makes clear the distinction between data-independent confirmatory research … and data-contingent exploratory research.” Parsons et al. (2022) likewise define preregistration as a practice that “aims to clearly distinguish confirmatory from exploratory research.” The authors may therefore wish either to qualify this claim and/or clearly attribute it to the single source (i.e., Lakens, 2019) who claims that there is disagreement about the function of preregistration.
  • The authors may wish to discuss additional relevant literature alongside the thesis by O’Mahony. Although Soderberg et al. (2021; https://doi.org/10.1038/s41562-021-01142-4) did not examine exploratory work directly, they did evaluate the novelty, creativity, and quality of Registered Reports relative to standard publishing models, and these findings seem directly relevant here. Wagenmakers et al. (2018; https://doi.org/10.1177/1745691618771357) may also warrant discussion, as it provides a useful historical overview of the relationship between creativity and verification.
  • The OSF materials appear to be incomplete and do not currently include the analysis script.
  • The abstract contains an incomplete sentence (“The purpose of the present study is to find out.”) and should be revised to state the study aim clearly.
  • When preregistration is first mentioned in the Introduction, the authors may wish to note that it was proposed as early as 2012 and cite Wagenmakers et al. 2012; https://doi.org/10.1177/1745691612463078). Similarly, when introducing Registered Reports, Chambers (2013; https://doi.org/10.1016/j.cortex.2012.12.016) would be a more appropriate reference than Besançon et al. (2021).

Competing interests: I declare no competing interests with the authors or the publication of the proposed research.

Peer review 2

Marcel van Assen

DOI: 10.70744/MetaROR.312.1.rv2

It was my pleasure to read and review this registered report.

After reading the report, I unfortunately come to the recommendation not to publish this work (in its current state). The criterion I use is the one that is described in the paper: “the Stage 1 manuscript and review process, addressing the question: will this study produce worthwhile knowledge, regardless of the results?” In my opinion the study, as currently planned and described, will not produce worthwhile knowledge. I provide my reasoning below under ‘major comments’.

However, I hope the authors are willing to adapt their proposed work, as I believe it may produce some worthwhile knowledge when (strongly) adapted.

I also provide some minor comments.

Major comments

  1. Claim and framing. Your abstract and your paper state that your research is about the frequency and value of exploratory research:

(a) Most importantly, your research is NOT about the frequency of exploratory research. It currently is about the reader being able to identify whether research is exploratory or not. This is something very different. Research that is not preregistered does not identify or label research as exploratory or confirmatory, as this distinction is not relevant or cannot be made – I would say almost all of this research is exploratory as ideas and hypotheses are either not formulated before the research or adapted during the research. While O’Mahony reports a rate of 57% disclosure of exploratory research, she also emphasizes that the coding process was subjective and difficult.

Hence, I think the aim needs to be reframed into examining the frequency of explicit references to exploratory research rather than the frequency of exploratory research per se.

The framing is then also different, for instance, underlining that it is crucial for the reader to be able to distinguish between exploratory and confirmatory research (as p-values have no clear meaning in exploratory research) in order to interpret a finding.

(b) The abstract is a bit misleading. Citing from it: “reduce the frequency and value of exploratory research, and therefore restrict creativity, serendipity, and discovery” … “formally examine whether these concerns have any merit”. The use of “these” is misleading, as you do not examine “value of exploratory research, and therefore restrict creativity, serendipity, and discovery”, you only attempt to examine the frequency.

(2) Definition and protocol of exploratory research. On page 6 you define exploratory as “any analysis that is indicated to not be a planned test of a specific hypothesis.” Given this definition, I expected a protocol for ‘indicated to not be a planned test of a specific hypothesis’. Please provide the protocol. For projects like this, and for their reproducibility, a clear and objective protocol is essential. Without such a protocol, (i) results will not be reproducible, and (ii) results are more difficult to interpret.

To make the protocol more objective and in line with my idea of the research (i.e., the explicit reference to exploratory research), I would use keywords like ‘plann*’ and ‘explor*’ and words that resemble it. This also will make coding easier and less subjective. Particularly because most published research does not list hypotheses or distinguish between planned and exploratory research.

(3) Hypotheses. Please explicitly list your planned hypotheses, e.g., “H1: …”, “H2: …”, etc. This improves the quality of your report and makes clearer what you are doing. We teach our students to do this, and we researchers should do it more in our own work too.

Please also take care of your formulation on p7. It is not about “greater reports” but about the prevalence of exploratory research.

Do you or do you not have a hypothesis about registered reports? This is not clear from your intro, as hypotheses are not explicitly listed. And do you have a hypothesis about comparing registered reports to ‘normal’ articles? It is not clear from the text if your hypothesis is of the form A < B < C, which implies three hypotheses on pairwise comparisons, or something else.

Your exploratory hypotheses on p8 are also not clear to me. Do they concern the explicit use of exploratory in ALL published articles (i.e., unconditional) or in registered research (i.e., conditional)? Consider making it explicit.

(4) Ideal registered report. For me, the ideal registered report is one where one simply adds the results section to an existing intro and methods section published in Stage 1. Although imo the introduction is quite good, it is not yet a full introduction of the paper (i.e., it needs to be adapted). This depends on the journal and its criteria for a registered report.

(5) What is a preregistration? On page 9 you define a preregistered paper as a paper with an “accessible link”. Please make this definition more meaningful. I can imagine accessible links to files that do not qualify as preregistrations.

(6) Included studies. You wish to include any type of empirical studies, including qualitative, mixed methods, and meta-analyses. I understand, but (i) some designs (e.g., experimental) are more often preregistered than others, and (ii) results of these different designs may be different, and (iii) coding may be more cumbersome when dealing with more different designs. Because of these reasons I would understand if you focus on just ONE design, for instance, experimental designs. I think that this will also increase the statistical power of your design, as you control for possible confounding factors this way.

If you want to stick to your original choice, then this is of course fine, but it would be great if you can motivate your choice more.

(7) Article coding. You write on page 12 that ‘coding will be done at the article level rather than individual studies within articles’. To me, this sounds like an awful idea. I strongly recommend coding only the preregistered studies as preregistered, and not also the non-preregistered studies in the article. Actually, I think an excellent test of your hypothesis would be to compare preregistered studies and non-preregistered studies in the same article. Within-subject (in your case ‘within-article’) studies provide more control and more power for testing your hypotheses. You can then also compare if, within the same paper, authors distinguish more between exploratory and planned in preregistered studies than in non-preregistered studies.

(8) Rater training and reliability. Great that you include training. But why only six articles? Particularly if you include multiple different designs, six is not much imo. Why not 10 in the first stage? First one needs to develop a clear protocol, and then this protocol is tested in the first stage. Then, the protocol can be adapted, and a new batch can be coded, etc.

On page 15 you write “if agreement exceeds … then the training phase will be complete”. I recommend not including this rule, as it is possible that agreement never reaches this threshold. Most important is that you do your best developing the protocol and training until agreement cannot be improved much, regardless of how high that agreement is.

You also write “Perfect agreement will be assessed…”. What does that mean?

(9) Statistical analyses. I did not understand the statistical analyses (planned hypotheses). First, they seemed to be concerned with comparing frequencies, for which a chi2-test is an option. But then you write about ‘comparing averages’. What averages? Averages cannot be compared using a chi2-test. I had difficulty understanding this entire section.

Please make sure that you make clear how you test each of the explicitly tested hypotheses in the intro. Ideally, you also provide the statistical code for these tests.

(10) Power analyses. On page 12 you state that “the effect size in O’Mahony (2023) is not reported”. However, earlier in your report you provide the prevalences, with which you can calculate the effect size. Also, the difference between 75 and 57 percent is not 20 percent, so this part confused me.

Please make sure that the power analyses match the statistical tests used for testing the hypotheses (see (9) above).

Perhaps it makes more sense to write the power analysis section following the planned analyses section, as you must describe the statistical analyses first. But perhaps the journal specified this order?

Minor comments

“an accessible and” (p5): consider inserting “supposedly” between ‘an’ and ‘accessible’.

You use “1” twice on page 7 when listing statements. Note that it is sufficient to either explicitly state the H0 or the H1.

I would understand it if my review were perceived as unfavourable and undesired news. Sorry for that. However, I sincerely hope the authors can use my review to improve their research proposal, as more research is needed on this topic.

Keep up the good work!

Best wishes,

Marcel van Assen (I always sign my reviews)

Competing interests: None.

Author response

DOI: 10.70744/MetaROR.XXX.1.ar

As a preface to our comments, we want to highlight the context of this research project. We submitted this research plan as part of Lifecycle Journal’s pilot program that is focused on assessing the usability and feasibility of the service. This program has strict deadlines. The research plan had to be submitted by Dec 1, 2025, with a 60-day evaluation window, and the final outcomes report must be submitted by Sep 1, 2026. This is a very tight timeline, and so we proposed a small project that would be achievable within those constraints. The tight timeline was exacerbated by an extended evaluation process: instead of 60 days from submission, we had all evaluations in hand approximately 135 days after submission.

We offer this explanation up front as a partial rationale for why we declined to take many reviewer suggestions related to expanding the scope of the project. We simply do not have time or capacity to do so. Although this is never a sufficient scientific explanation, we maintain that our planned project, following the revisions based on the helpful comments, will nevertheless provide informative data, even if on a narrow question. Moreover, the dataset of articles we are assembling will have high reuse potential for metascientists, and some of the suggested expansions could certainly be pursued by us or others down the line.

Finally, following submission of this revised research plan we will formally register the project and begin work, and thus the submitted version is the final version of the research plan. Accordingly, while we are happy to receive additional comments on the revised plan, we will unfortunately not be able to take any of them into consideration as part of the preregistered plan (we could, of course, incorporate some of them as unregistered aspects).

Editorial Assessment

I much appreciated the detailed explanations for the decisions made in the design, which made it much easier to arrive at an informed assessment. That level of detail also seems to have helped the reviewers provide the elaborate feedback they did. Both reviewers agree that the manuscript takes up an important question, but both also think the current proposal does not yet support the claims it wants to make. In particular, the reviewers think that the study can speak to how exploratory analyses are reported or labeled in published papers, but not to whether preregistration and Registered Reports actually reduce exploratory research itself, or diminish creativity, novelty, and serendipity. They also raise related methodological concerns: the definition of exploratory research is still too loose for reliable coding, the coding plan and statistical analyses are not yet sufficiently specified, and some design choices (coding at the article rather than study level, ignoring differences in preregistration quality, and not accounting for journal-specific reporting requirements) make the eventual interpretation of the results difficult. The reviewers are largely aligned on these central points. However, Reviewer 1 puts more weight on the broader framing of the debate and on the possibility that the study is really picking up journal policy rather than research practice, while Reviewer 2 focuses more on the need for a tighter protocol and a clearer analysis plan. The manuscript is well written, but some of the language in the abstract and Methods section is imprecise, and that lack of precision contributes to the larger conceptual problem.

Author response: Thank you, we are glad to hear you appreciated the detail. We are grateful for the reviewers’ detailed and helpful feedback, and provide full responses below following the reviewer comments.

Recommendations from the editor

The main revision needed is a clearer match between the paper’s framing and what the proposed design can actually show. At the moment, the study seems able to examine the explicit reporting of exploratory analyses but not the actual frequency or value of exploratory research. The abstract and introduction should reflect that. Additionally, it should not be implied that the study will test whether preregistration restricts creativity, discovery, or serendipity unless those outcomes are going to be measured directly.

Author response: We have now clarified the framing and the scope of the project throughout the text, making it clear that we are analyzing reports of exploratory research vs. the actual practice, per se. We have also added the following to the paper, explaining why we are taking this approach and how there is not really any good alternative:

“A major challenge to drawing any conclusions about the frequency of exploratory research between registered and non-registered articles is that the ability to do so is entirely dependent on researchers’ reporting. Given the widespread prevalence of HARKing, in which exploratory findings are reframed and presented as confirmatory, it is simply not possible to derive an accurate estimate of how often exploratory research is occurring when there are no reporting restrictions in place (Wagenmakers et al., 2018). Thus, when making such comparisons, examining the reporting practices in articles is the best available indicator. Although reports are an imperfect proxy for actual practice, they remain highly relevant to the debate about the status of exploration in registered articles, and specifically address the claims that registration would restrict exploratory research. That is, if registration does in fact restrict exploration, there should be low levels of exploration reported in research articles.”

The manuscript would benefit from an explicit coding protocol that spells out how raters will identify exploratory analyses and what counts as sufficient evidence that an analysis was unplanned. It is difficult to assess whether this important assessment is valid based on the text in the manuscript alone. Both reviewers already indicated some concerns about this validity. Moreover, the coding plan seems to focus mainly on the Introduction, Results/Data Analysis, and Discussion, but in psychology papers, analytic decisions are often not presented in a separate data analysis section but folded into the Methods section. As a practical adjustment, I think it would make sense to also code those parts of the Methods section that describe the analysis plan, even when they are not explicitly labeled as such.

Author response: The full coding protocol is now included in the OSF page. And yes, we agree the descriptions of the analytic plan sometimes appear in the Method section, and so we will review those sections as well.

I would also encourage the authors to reconsider the decision to code at the article level rather than the study level, especially in multi-study papers that may mix preregistered and non-preregistered components. That choice currently weakens the inference the paper wants to make. As Reviewer #2 suggests, it may even be informative to do a within-paper analysis, comparing preregistered and non-preregistered studies that way.

Author response: Indeed, this decision evoked strong reactions from the reviewers. We continue to believe that our rationale for coding at the article level is sound, but it is obvious that others do not see it that way. Accordingly, we have revised our inclusion criteria to only consider articles where all studies are preregistered. We will then conduct exploratory within-article examinations of the articles that consist of registered and non-registered studies. Our preliminary article screening has indicated that mixed Registered Reports constitute only a small number of articles, so there are unlikely to be enough of them to support any hypothesis tests. Excluding them reduces heterogeneity in the focal set, while the exploratory analysis could provide some interesting insights.

A final issue, not mentioned by the reviewers, is the sampling frame. Because the study starts from journals known to publish Registered Reports, the final sample may end up drawing disproportionately from journals that are already more committed to transparency reforms. That does not make the design inappropriate, but it does limit how broadly the findings can be interpreted. One possible way to reduce this problem would be to relax the requirement that companion articles come from the same journal as the Registered Report, although I recognize that doing so also has its downsides.

Author response: We agree that this is a potential limitation. Selecting journals/articles and matches is difficult, and always involves some trade-offs. On balance, we believe it is critical to select matches from the same journals and timeframes, given the variations in journal policies (as identified by the reviewers). As a way to check the implications of our decision, we have added a small sample of matched traditional articles from journals that do not publish Registered Reports. Doing so will provide some indication of the representativeness of the journals selected.

Review #1

Major comments

I am not fully convinced by the manuscript’s framing of its contribution to the debate over whether preregistration hinders exploratory research. The authors position the study as informing this broader discussion, but the proposed design seems able to support conclusions only about reporting practices, not about research practice itself. This is important because critiques of preregistration typically concern not just whether exploratory work is conducted (or reported), but whether preregistration constrains creativity and novelty and ultimately hinders the progress of science. Assessing disclosure about exploratory research without also examining these related dimensions therefore seems incomplete.

Author response: As noted in our response to the Editor, we have clarified the scope of the project and now state explicitly in the paper that we are narrowly focused on reports of exploratory research.

By the way, the reverse inference is also problematic: even if non-preregistered articles disclose exploratory analyses, this does not mean that the remaining analyses are genuinely confirmatory. This issue is, at present, not addressed in the manuscript.

Author response: Yes, we agree, and made no claims about prevalence of confirmatory analyses. This continues to be the case in the revision.

More broadly, the proposed design appears to conflate research practice with reporting standards. Registered Reports may show a higher frequency of exploratory analyses simply because journals often require such analyses to be reported in a dedicated section, whereas comparable guidance may not exist for regular articles. For the focal sample and the most common journals, it would therefore be important to discuss and, if possible, analyze the relevant reporting guidelines, while taking into account that these policies may have changed over time. For example, the submission guidelines for Psychological Science state for Registered Reports: “It is reasonable that authors may wish to include additional analyses that were not included in the registered submission. For instance, a new analytic approach might become available between IPA and Stage 2 review, or a particularly interesting and unexpected finding may emerge. Such analyses are admissible but must be clearly justified in the text, appropriately caveated, and reported in a separate section of the Results titled ‘Exploratory analyses’.” As far as I can tell, this recommendation does not appear in the same way for regular manuscripts. This raises the question of whether the study measures researcher behavior (bottom-up research practice) or simply adherence to journal policy (top-down reporting requirements). The authors should account for differences across journals and reporting guidelines (also in their statistical model), as any differences across article types will be difficult to interpret otherwise.

Author response: Thank you for raising this important point. We agree about the expectation of reporting for Registered Reports, which is one reason we are including preregistered articles. It is, unfortunately, not feasible to examine how journal policies have changed over time and to map that to specific articles, as journal policies are updated often and rarely archived. As a way to partially address this concern, we will code for current journal policies, as now described in the manuscript, and conduct exploratory analyses on their potential relation. Moreover, in response to the comments from PaperWizard, all analyses will be clustered by journal, so at least we can account for journal-level variation.

For the same reason, the manuscript would be more convincing if it also studied the hypothesized consequences of discouraging exploratory research. The Introduction notes the criticism that preregistration may hinder creativity and reduce novelty, and these are precisely the kinds of outcomes assessed by Soderberg et al. (2021) and O’Mahony. The study would therefore be stronger, and better aligned with the concern raised in the Introduction, if it assessed not only whether exploratory analyses are disclosed, but also how novel, creative, rigorous, and transparent the resulting research is.

Author response: We agree that doing this would be informative. As noted at the outset of this letter, this current project is small scale, focused, and on a very tight external deadline. We believe that generating data related to the narrow claim about registration is still useful. Moreover, there are no good measures available to assess creativity and novelty in these kinds of articles. Soderberg et al. relied on single-item assessments with no descriptive text, and O’Mahony did not assess creativity and novelty. Thus, we will indicate this as a limitation and a useful direction for follow-up work.

Overall, the paper currently reads as a replication of one of O’Mahony’s hypotheses, with the extension to include regular preregistrations. This should be stated more clearly in the Abstract and Introduction. Although such a replication may well be worthwhile, the present proposal would, in my view, require substantial methodological improvement to justify publication. For instance, O’Mahony distinguished exploratory analyses reported in the paper in relation to the studies actually conducted, whereas the justification offered here for coding at the paper level is not yet convincing. In addition, O’Mahony coded several dimensions that seem at least as important as reporting standards alone, including methodological rigour, novelty, creativity, and transparency. At present, the manuscript appears to treat the presence of reported exploratory analyses as a proxy for these broader constructs without measuring them directly, even though they could in principle be incorporated into the coding scheme. If, instead, the authors are primarily interested in reporting practices across publication formats rather than the substantive quality of the research, then it may be worth broadening the coding scheme to include other reporting-related features as well, such as clarity of hypothesis specification, transparency regarding data, materials, and code, etc. In the latter case, this would require a reframing of the Introduction to reflect this shift in focus.

Author response: We agree that these are all good ideas, but seek to keep the scope limited to the narrow question of interest. We have now more clearly indicated that the study is a replication and extension of O’Mahony. Contrary to the reviewer’s claims, O’Mahony did not assess creativity and novelty.

Finally, I do not find the rationale for disregarding preregistration quality convincing. As the authors note, preregistrations can vary substantially in quality, and this variation may plausibly moderate how rigorously exploratory analyses are reported. Treating all preregistrations as equivalent therefore seems difficult to justify, particularly in light of existing evidence that preregistrations differ considerably from one another.

Author response: We agree that examining the contents of the preregistrations would be informative, but we have decided against doing so for two primary reasons. First, it is simply not feasible to do so within the constraints of the project. Second, it is not entirely clear how to think about registration quality in relation to our research questions. As this reviewer and others have noted, we are examining reporting in the articles, and that is useful regardless of registration quality. Moreover, undisclosed deviations from registrations do not really represent exploratory research in its intended sense, but rather are much more likely to represent explorations that were reframed as confirmatory. Although this is of course a type of exploration, it represents something qualitatively different from direct reports of exploratory work.

Minor comments:

It is currently somewhat unclear how the dependent variable, that is the presence of exploratory research, will be operationalized. O’Mahony used a proportion score based on the exploratory analyses documented in each paper and the proportion of analyses where this distinction was clearly made. My understanding of the present proposal is that each article will simply be coded for whether exploratory research is reported, which would make the outcome variable binary. The authors should clarify whether this interpretation is correct.

Author response: We have clarified the coding process in the manuscript and in the appended protocol. It is true that O’Mahony used a proportion score, but the analysis reported just before that used a binary presence rating such as the one we had proposed. As our focus is only on exploratory research, we only include the presence ratings.

As written, the claim that “the function that preregistration is meant to serve is not always clear or consistent” seems too broad if it is mainly based on a preprint by Lakens (2019). Much of the literature describes the function of preregistration quite consistently as distinguishing confirmatory from exploratory research. For example, Nosek et al. (2018) state in the abstract that preregistration is “an effective solution” because it “distinguishes analyses and outcomes that result from predictions from those that result from postdictions.” Similarly, Munafò et al. (2017) write that preregistration “makes clear the distinction between data-independent confirmatory research … and data-contingent exploratory research.” Parsons et al. (2022) likewise define preregistration as a practice that “aims to clearly distinguish confirmatory from exploratory research.” The authors may therefore wish either to qualify this claim and/or clearly attribute it to the single source (i.e., Lakens, 2019) who claims that there is disagreement about the function of preregistration.

Author response: We agree, as noted in the paper, that the exploratory/confirmatory distinction is a primary rationale for preregistration. But it is well-documented that it is not the only rationale. This is well covered in Lakens (2019) as well as Syed (2024), the latter of which briefly summarizes these different rationales:

“One of the challenges of understanding preregistration—and the criticisms of it—are that there are different rationales for why researchers should do it. These include clearly distinguishing between what decisions were made prior to seeing the data (“confirmatory” analyses) from what decisions were made after seeing the data (“exploratory” analyses), and preventing the latter as being framed as the former in a research report; reducing the prevalence of undisclosed data-dependent decision-making; evaluating the severity of a test; and serving as formal documentation of the study design and analysis plan.”

Additionally, we do not see why relying on a single source (which we do not do anyway) would be a problem, nor is citing a preprint (as it happens, the preprint status was an error in our Zotero library; the paper is indeed published, and this has been updated). Accordingly, we have not made any changes in response to this comment.

The authors may wish to discuss additional relevant literature alongside the thesis by O’Mahony. Although Soderberg et al. (2021; https://doi.org/10.1038/s41562-021-01142-4) did not examine exploratory work directly, they did evaluate the novelty, creativity, and quality of Registered Reports relative to standard publishing models, and these findings seem directly relevant here. Wagenmakers et al. (2018; https://doi.org/10.1177/1745691618771357) may also warrant discussion, as it provides a useful historical overview of the relationship between creativity and verification.

Author response: Thank you, we have added reference to these articles.

The OSF materials appear to be incomplete and do not currently include the analysis script.

Author response: Thank you, the scripts are now available as indicated in the manuscript.

The abstract contains an incomplete sentence (“The purpose of the present study is to find out.”) and should be revised to state the study aim clearly.

Author response: We have revised the sentence for clarity.

When preregistration is first mentioned in the Introduction, the authors may wish to note that it was proposed as early as 2012 and cite Wagenmakers et al. 2012; https://doi.org/10.1177/1745691612463078). Similarly, when introducing Registered Reports, Chambers (2013; https://doi.org/10.1016/j.cortex.2012.12.016) would be a more appropriate reference than Besançon et al. (2021).

Author response: Both of those articles were already cited very early on in the Introduction. We realized that the citation placement of Besançon et al. (2021) was a bit misleading, so have made edits to clarify.

Review #2

It was my pleasure to read and review this registered report.

After reading the report, I unfortunately come to the recommendation not to publish this work (in its current state). The criterion I use is the one that is described in the paper: “[t]he Stage 1 manuscript and review process, addressing the question: will this study produce worthwhile knowledge, regardless of the results?” In my opinion the study, as currently planned and described, will not produce worthwhile knowledge. I provide my reasoning below under ‘major comments’.

However, I hope the authors are willing to adapt their proposed work, as I believe it may produce some worthwhile knowledge when (strongly) adapted.

I also provide some minor comments.

Author response: We appreciate the helpful feedback!

Major comments

(1) Claim and framing. Your abstract and your paper state that your research is about the frequency and value of exploratory research:

(a) Most importantly, your research is NOT about the frequency of exploratory research. It currently is about the reader being able to identify whether research is exploratory or not. This is something very different. Research that is not preregistered does not identify or label research as exploratory or confirmatory, as this distinction is not relevant or cannot be made; I would say almost all of this research is exploratory, as ideas and hypotheses are either not formulated before the research or adapted during the research. My idea is supported by the results of the dissertation of O’Mahony, which did or could not identify exploratory research in non-preregistered research.

Hence, I think the aim needs to be reframed as examining the frequency of explicit references to exploratory research rather than the frequency of exploratory research per se. The framing is then also different, for instance, underlining that it is crucial for the reader to be able to distinguish between exploratory and confirmatory research (as p-values have no clear meaning in exploratory research) in order to interpret a finding.

Author response: As noted above, we have addressed this major concern. We will note, however, that the reviewer is incorrect about O’Mahony’s findings. It is not true that the study “did or could not identify exploratory research in non-preregistered research.” Rather, they found a rate of 57% reported exploratory research in non-registered research.

(b) The abstract is a bit misleading. Citing from it: “reduce the frequency and value of exploratory research, and therefore restrict creativity, serendipity, and discovery” … “formally examine whether these concerns have any merit”. The use of “these” is misleading, as you do not examine the “value of exploratory research, and therefore restrict creativity, serendipity, and discovery”; you only attempt to examine the frequency.

Author response: We have revised the abstract to clarify.

(2) Definition and protocol of exploratory research. On page 6 you define exploratory research as “any analysis that is indicated to not be a planned test of a specific hypothesis.” Given this definition, I expected a protocol for ‘indicated to not be a planned test of a specific hypothesis’. Please provide the protocol. For projects like this, and their reproducibility, a clear and objective protocol is essential. Without such a protocol (i) results will not be reproducible, and (ii) results are more difficult to interpret.

Author response: We now add a link to the full coding protocol. Importantly, there is no fully reproducible approach that can be used for this question. As we describe in the manuscript and in the manual, there is necessarily some subjectivity involved given that context must be taken into account. However, more clearly laid out coding procedures do indeed help with interpretation, and we will also make all of the articles and our coding openly available.

To make the protocol more objective and in line with my idea of the research (i.e., the explicit reference to exploratory research), I would use keywords like ‘plann*’ and ‘explor*’ and words that resemble it. This also will make coding easier and less subjective. Particularly because most published research does not list hypotheses or distinguish between planned and exploratory research.

Author response: Thank you, we have added several search terms in the coding protocol. As noted, this process will inherently involve some subjectivity.
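For illustration, the kind of keyword screen discussed above could be prototyped as follows. This is a minimal sketch, not the registered protocol: the exact stem list and the sample sentences are hypothetical, and a real coding pass would still require human judgment of the flagged sentences.

```python
import re

# Hypothetical stems in the spirit of the reviewer's 'plann*' / 'explor*'
# suggestion; the actual protocol may use a different, validated list.
STEM_PATTERN = re.compile(
    r"\b(plann\w*|explor\w*|post[ -]?hoc|unplanned)\b", re.IGNORECASE
)

def flag_sentences(text):
    """Return the sentences containing a keyword stem, as coding candidates."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if STEM_PATTERN.search(s)]

# Made-up sample text for demonstration only
sample = (
    "We tested the preregistered hypothesis. "
    "We then conducted exploratory analyses on age differences. "
    "These post-hoc comparisons were not planned."
)
hits = flag_sentences(sample)  # flags the second and third sentences
```

A screen like this would reduce, but not remove, the subjectivity the authors mention: it narrows raters' attention to candidate sentences, which still need coding in context.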

(3) Hypotheses. Please explicitly list your planned hypotheses, also with “H1: …”, “H2: …”, etc. This improves the quality of your report and makes clearer what you are doing. We teach our students to do this, but we researchers should do it more in our research too.

Author response: Thank you, this point is well taken, and the hypotheses are now more clearly labeled at the end of the Introduction and when discussing the analyses.

Please also take care of your formulation on p7. It is not about “greater reports” but about the prevalence of exploratory research.

Author response: We agree that this formulation is more accurate and have edited the paper throughout.

Do you or do you not have a hypothesis about registered reports? This is not clear from your intro, as hypotheses are not explicitly listed. And do you have a hypothesis about comparing registered reports to ‘normal’ articles? It is not clear from the text if your hypothesis is of the form A < B < C, which implies three hypotheses on pairwise comparisons, or something else.

Author response: Thank you, we have reformulated the hypotheses to make our specific predictions clear.

Your exploratory hypotheses on p8 are also not clear to me. Do they concern the explicit use of exploratory in ALL published articles (i.e., unconditional) or in registered research (i.e., conditional)? Consider making it explicit.

Author response: The planned exploratory tests have now been clarified.

(4) Ideal registered report. For me, the ideal registered report is one where one simply adds the results section to an existing intro and methods section that is published in Stage 1. Although imo the introduction is quite good, it is not yet a full introduction of the paper (i.e., it needs to be adapted). This depends on the journal and its criteria for a registered report.

Author response: We aim to keep the Introduction as brief as possible. We have revised it based on the comments, but otherwise have not added any additional information.

(5) What is a preregistration? On page 9 you define a preregistered paper as a paper with an “accessible link”. Please make this definition more precise. I can imagine accessible links to files that do not qualify as preregistrations.

Author response: We have revised the criteria to read as follows: “An empirical article in which a) all studies are stated to be preregistered and b) for which the registration document can be located (either via an included link in the manuscript or via manual search). Because of the wide variety in implementation quality, we will not require that the registration document has been formally registered, as long as it can be located (e.g., a registration document is on the OSF project page, but not registered).”

(6) Included studies. You wish to include any type of empirical studies, including qualitative, mixed methods, and meta-analyses. I understand, but (i) some designs (e.g., experimental) are more often preregistered than others, and (ii) results of these different designs may be different, and (iii) coding may be more cumbersome when dealing with more different designs. Because of these reasons I would understand if you focus on just ONE design, for instance, experimental designs. I think that this will also increase the statistical power of your design, as you control for possible confounding factors this way. If you want to stick to your original choice, then this is of course fine, but it would be great if you can motivate your choice more.

Author response: There is no reason to expect exploratory analyses would be more frequent in experimental vs. observational work, so we do not think it is important to restrict inclusion based on these types of designs. However, exploratory analyses could be different for qualitative, mixed methods, and meta-analytic studies, so we will now exclude these from the analysis.

(7) Article coding. You write on page 12 that ‘coding will be done at the article level rather than individual studies within articles’. To me, this sounds like an awful idea. I strongly recommend coding only the preregistered studies as preregistered, and not also the non-preregistered studies in the article. Actually, I think an excellent test of your hypothesis would be to compare preregistered studies and non-preregistered studies in the same article. Within-subject (in your case ‘within-article’) studies provide more control and more power for testing your hypotheses. You can then also compare whether, within the same paper, authors distinguish more between exploratory and planned analyses in preregistered studies than in non-preregistered studies.

Author response: As already indicated, we have altered the design to restrict the primary analysis to non-mixed articles and then conduct exploratory within-article analyses on the mixed articles.

(8) Rater training and reliability. Great that you include training. But why only six articles? Particularly if you include multiple different designs, six is not much imo. Why not 10 in the first stage? First one needs to develop a clear protocol, and then this protocol is tested in the first stage. Then the protocol can be adapted, a new batch can be coded, etc. On page 15 you write “if agreement exceeds … then the training phase will be complete”. I recommend not including this rule, as it is possible that agreement never reaches this threshold. Most important is that you do your best with developing the protocol and training, until agreement cannot be improved much, independent of how high this agreement is. You also write “Perfect agreement will be assessed…”. What does that mean?

Author response: In our many years of coding open-ended text and article data we have found that the detailed procedure works well, so we have stayed with the plan as described. We believe the reviewer may have misread the text, as the sentence in question reads “percent” not “perfect.”
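For readers following the agreement discussion, percent agreement and a chance-corrected index (Cohen's kappa) for two raters with binary codes can be computed as below. This is an illustrative sketch with made-up ratings, not project data.

```python
def percent_agreement(r1, r2):
    """Proportion of items on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Chance-corrected agreement for two raters using binary (0/1) codes."""
    n = len(r1)
    po = percent_agreement(r1, r2)           # observed agreement
    p1 = sum(r1) / n                         # rater 1's rate of coding '1'
    p2 = sum(r2) / n                         # rater 2's rate of coding '1'
    pe = p1 * p2 + (1 - p1) * (1 - p2)       # agreement expected by chance
    return (po - pe) / (1 - pe)

# Hypothetical codes for five articles (1 = exploration reported)
rater1 = [1, 1, 0, 0, 1]
rater2 = [1, 0, 0, 0, 1]
# percent_agreement -> 0.80; cohens_kappa -> ~0.62
```

Kappa is the more informative of the two when the base rate of the code is high, since raw percent agreement can look strong purely by chance.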

(9) Statistical analyses. I did not understand the statistical analyses (planned hypotheses). First, they seemed to be concerned with comparing frequencies, and then a chi2-test is an option. But then you write about ‘comparing averages’. What averages? And averages cannot be compared using a chi2-test. I had difficulties understanding this complete section. Please make sure that you make clear how you test each of the explicitly tested hypotheses in the intro. Ideally, you also provide the statistical code for these tests.

Author response: This section has been completely overhauled, clearly indicating how each hypothesis is being tested, with links to the analytic code.

(10) Power analyses. On page 12 you state that “the effect size in O’Mahony (2023) is not reported”. However, earlier in your report you provide the prevalences, with which you can calculate the effect size. The difference between 75 and 57 percent is not 20 percent, so this part confused me.

Author response: That sentence meant that the effect size phi was not reported; we have now edited the sentence for clarity. Regarding the percent difference, the sentence reads, “This corresponded to an approximate difference in prevalence of 20%.” The actual difference was 18%, and in this sentence we were providing an approximation. In any case, this sentence has been removed through revision of the power analysis.
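As an aside for readers, the reviewer's point that an effect size can be recovered from the two prevalences is straightforward arithmetic. The sketch below computes the 2×2 chi-square and phi, assuming hypothetical equal group sizes of 100 per article type (the group sizes are an assumption for illustration, not values from the studies discussed).

```python
import math

def chi2_and_phi(a, b, c, d):
    """Chi-square statistic and phi coefficient for a 2x2 table
    [[a, b], [c, d]] (rows = article types, cols = exploration reported yes/no)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    phi = math.sqrt(chi2 / n)
    return chi2, phi

# 75% vs. 57% reported exploration, assuming 100 articles per group
chi2, phi = chi2_and_phi(75, 25, 57, 43)  # phi comes out at about 0.19
```

With these assumed group sizes, the 18-percentage-point difference corresponds to a phi of roughly 0.19, i.e., a small-to-moderate association by conventional benchmarks.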

Please make sure that the power analyses match the statistical tests used for testing the hypotheses (see (9) above).

Author response: This section has been completely overhauled, and the power analysis for each test is indicated after the test is described.
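To make concrete the kind of test/power match the reviewer asks for, here is a minimal normal-approximation power calculation for comparing two independent proportions, using the 75% vs. 57% prevalences from the exchange above with conventional alpha = .05 (two-sided) and power = .80. This is an illustrative sketch, not the registered power analysis.

```python
import math

def n_per_group(p1, p2, z_alpha=1.959964, z_beta=0.841621):
    """Approximate n per group for a two-sided two-proportion z-test.
    Default z values correspond to alpha = .05 and power = .80."""
    p_bar = (p1 + p2) / 2
    term1 = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term2 = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil(((term1 + term2) / (p1 - p2)) ** 2)

n = n_per_group(0.75, 0.57)  # about 108 articles per group
```

Under these assumptions, detecting an 18-point prevalence difference requires roughly 108 articles per group, which is in the neighborhood of the 100 per condition implied by the proposed total N of 300 across three article types.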

Perhaps it makes more sense to write the power analysis section following the planned analyses section, as you must describe the statistical analyses first. But perhaps the journal specified this order?

Author response: As noted, the power analysis for each test is now indicated following the description of the test.

Minor comments

“an accessible and” (p5): consider inserting “supposedly” between ‘an’ and ‘accessible’.

Author response: Changed

You use “1” twice on page 7 when listing statements. Note that it is sufficient to either explicitly state the H0 or the H1.

Author response: The section describing the hypotheses has been completely revised.

I would understand it if my review were perceived as unfavourable and undesired news. Sorry for that. However, I sincerely hope the authors can use my review to improve their research proposal, as more research is needed on this topic.

Keep up the good work! Best wishes,

Marcel van Assen (I always sign my reviews)

Author response: We appreciate the feedback, and the transparency, and believe the comments have greatly improved the project.
