Published at MetaROR

June 20, 2025

Cite this article as:

McKiernan, E., Carter, C., Dougherty, M. R., & Tananbaum, G. (2024, July 31). A framework for values-based assessment in promotion, tenure, and other academic evaluations. https://doi.org/10.31219/osf.io/s4vc5


A framework for values-based assessment in promotion, tenure, and other academic evaluations

Erin C. McKiernan1,2, Caitlin Carter3, Michael R. Dougherty4, Greg Tananbaum2

1. Departamento de Física, Facultad de Ciencias, Universidad Nacional Autónoma de México
2. Open Research Community Accelerator (ORCA)
3. Higher Education Leadership Initiative for Open Scholarship (HELIOS Open)
4. Department of Psychology, University of Maryland

Originally published on August 27, 2024 at: 

Abstract

Recent years have seen a growing dissatisfaction with how academics and their scholarly work are evaluated, and a corresponding global proliferation of initiatives dedicated to assessment reform. A common theme across many of these initiatives is a call to center values, focusing on how incentives could be designed to better reward aspects like collaboration, equity, rigor, and transparency. While such values-based approaches have laid solid groundwork for academic institutions to think through and prioritize their values, we see a need for granular tools that can help institutions transform their values into actionable reforms. To that end, we present a framework, developed in part through workshops we ran at the 2023 Council of Graduate Departments of Psychology (COGDOP) Annual Meeting, the Association for Psychological Science (APS) Annual Convention, and the American Anthropological Association (AAA) Department Leaders Summer Institute. The framework includes 14 values (e.g. creativity, inclusivity, engagement, public good), and for each value, we outline some scoping considerations, representative academic activities or scholarly outputs, and possible behavioral indicators that could be incorporated into promotion and tenure evaluations. This framework is not exhaustive, and will likely vary depending on disciplinary or other contexts, but we hope it will serve as a starting point and encourage institutions to tackle assessment reforms with a values-based lens.

Introduction

“What’s really being called into question is the reward system and the key issue is this: what activities of the professoriate are most highly prized? After all, it’s futile to talk about improving the quality of teaching if, in the end, faculty are not given recognition for the time they spend with students.”

Those words were written by Ernest Boyer in 1990 in his seminal book, Scholarship Reconsidered: Priorities of the Professoriate [1]. Today, what is called the Boyer model is cited as the basis of many promotion and tenure criteria at institutions throughout the United States. Fundamental to Boyer’s model was the idea that if there are faculty behaviors that institutions want to see, or characteristics (e.g. quality of teaching) they would like to strengthen, then they must recognize and evaluate faculty on those elements. What Boyer was proposing was essentially a system in which institutions reward what they value. However, while Boyer’s work has been influential in broadening what is considered scholarship, some have argued that there are key parts of his model where current academic evaluation systems still fall short [2, 3].

“…while Boyer’s broadened view of scholarship has remained a significant part of the national conversation on higher education…it has perhaps done so more in spirit than in practice. Published research is still the dominant lens through which scholarship is understood. Academic disciplines are still siloed, and research that pushes the boundaries of traditional department structures is still discouraged.” [3]

Our intent is not to over-index on Boyer, but rather to point out that academia is not achieving in practice the very type of assessment it holds up as the ideal. This is not a new issue – in 1975, Steven Kerr wrote, “Society hopes that professors will not neglect their teaching responsibilities but rewards them almost entirely for research and publications”, and called this the “folly of rewarding A, while hoping for B” [4]. There is evidence that evaluation criteria are often in contradiction to purported institutional values [5, 6], and there is increasing recognition that current academic incentives are misaligned at best and perverse at worst [7, 8]. However, from this recognition have grown multiple global efforts to improve incentives.

Global initiatives in academic incentive reform

The past decade or so has seen a surge in initiatives dedicated to academic incentive reform. In 2013, the San Francisco Declaration on Research Assessment (DORA) was published, urging funders, institutions, publishers, and more to move away from journal-based metrics like Journal Impact Factor [9]. Moreover, DORA encouraged actors evaluating academic work to “consider the value and impact of all research outputs…and consider a broad range of impact measures including qualitative indicators of research impact, such as influence on policy and practice”. At present, more than 21K individuals and 3K organizations in 165 countries have signed DORA, signaling at least a conceptual alignment with these principles. Over the years, DORA has grown into a vibrant community, sharing emerging best practices in research assessment, and developing tools like the SPACE rubric [10] to help institutions rethink their academic incentives.

In 2015, Diana Hicks and colleagues published the Leiden Manifesto, urging less reliance on quantitative bibliometrics and highlighting the importance of accompanying “qualitative expert assessment” [11]. In addition, this framework proposes that evaluators “[m]easure performance against the research missions of the institution, group or researcher”. Since its publication, a number of universities in the U.S., Canada, and Europe have endorsed the principles within the Leiden Manifesto, and it has been held up as a guiding framework for assessment reform by international bodies such as the European Commission [12].

In 2020, David Moher and colleagues published the Hong Kong Principles (HKPs) for assessing researchers, “with a focus on rewarding behaviors that strengthen research integrity that have an emphasis on responsible research practices and the avoidance of detrimental research practice” [13]. As with DORA and Leiden, the HKPs outline problems with quantitative metrics like publication or citation counts, but also explain how these metrics are ill-suited for evaluating rigor or other aspects like public involvement in research. The HKPs instead propose valuing practices like transparent reporting, open science, and a “broad range of research and scholarship”.

The intersection between open science and academic incentive reform, as highlighted by the HKPs, is an area that has shown increasing momentum in recent years. In 2019, the National Academies established the Roundtable on Aligning Incentives for Open Science (now Open Scholarship) [14]. In 2021, the Roundtable published a toolkit, which includes a worksheet on ‘Reimagining Outputs’ designed to both enumerate the broad range of scholarly products that should be considered in academic assessments, and explain how they can be shared [15]. In 2022, the Higher Education Leadership Initiative for Open Scholarship (HELIOS Open) [16] emerged from the National Academies Roundtable. HELIOS Open is working with campus leaders (presidents, provosts, and vice presidents, along with their delegates) to modernize tenure, promotion, review, and hiring to explicitly reward open scholarship activities across member campuses [17]. Other initiatives have also highlighted the importance of reforming academic assessment to incentivize open science practices. For example, UNESCO’s 2021 Recommendation on Open Science includes as a major focus the need to “[reward] good open science practices” and “[encourage] responsible research and researcher evaluation and assessment practices, which incentivize quality science, recognizing the diversity of research outputs, activities and missions” [18].

Among the most recent advances was the 2022 launch of the Coalition for Advancing Research Assessment (CoARA) and their published agreement [19]. Signatories of the agreement commit to working toward research assessment reform in accordance with a number of guiding principles, which include considering a broad range of research outputs, recognizing diverse research contributions and roles, and focusing on research quality. On the latter, they write, “Quality implies that research is carried out through transparent research processes and methodologies and through research management allowing systematic re-use of previous results. Openness of research, and results that are verifiable and reproducible where applicable, strongly contribute to quality”. At present, over 700 organizations in dozens of countries have joined CoARA, and members have organized into working groups focused on ’Experiments in Research Assessment’, ’Responsible Metrics and Indicators’, and ’Inclusive Evaluation of Research’, among other topics.

The above is a non-exhaustive list of the numerous initiatives in academic or research assessment reform. For more in-depth information, we encourage readers to explore the many examples available via DORA’s Resource Library [20] and Reformscape [21], and also read [22–24].

Values-based academic assessment

Many of the reform initiatives discussed above, whether explicit or not, are to some extent values-based. For example, the HKPs are focused on incentivizing rigor and transparency, and CoARA names quality, impact, diversity, inclusivity, and collaboration as guiding principles in research assessment. However, some initiatives take an even more explicit values-based approach. For example, a 2023 report from Science Europe reads, “…there is growing consensus that our current systems do not always reflect our shared values (rewards systems promoting individualism rather than collaboration as an example)”, and proposes what they call their values framework for evaluation that emphasizes autonomy or freedom; care and collegiality; collaboration; equality, diversity, and inclusion; integrity and ethics; and openness and transparency [25].

Some of the recommendations included in initiatives like CoARA are based on important tools in the research assessment space like the SCOPE Framework, which is values-based at its core. SCOPE was developed in 2021 by the International Network of Research Management Societies (INORMS) Research Evaluation Group (REG). The ’S’ in SCOPE stands for “Start with what you value”, and encourages institutions to think about what values are important to them as the bedrock for any type of evaluation. A recent article outlines more details about SCOPE and includes four case studies on how organizations in Canada and the UK are using this framework in different contexts to encourage assessment reform [26].

One of the biggest proponents of values-based academic assessment has been the Humane Metrics in Humanities and Social Sciences (HuMetricsHSS) initiative, launched in 2017. In their own words, HuMetricsHSS began by asking the question, “What would it look like to start to measure what we value, rather than valuing only what we can readily measure?” [27]. They have outlined how current assessment criteria create perverse incentives [7], and their survey results show that faculty feel this misalignment [27]. In response, they have developed a high-level values framework [28], a workshop kit [29], and an interactive values sorter [30] to help faculty and institutions identify their values and work these into evaluative processes.

Initiatives like HuMetricsHSS have succeeded in setting the broad parameters for values-based assessment. However, for many, these ideas remain largely abstract. Our own experiences, and discussions with other academics, have shown us that it can be challenging to conceptualize how values like accessibility or quality translate into measurable behaviors which can be evaluated. In other words, there is a need to develop more actionable granularity that could help turn theoretical values frameworks into practical reforms. We began this process by workshopping ideas for values-based assessment with academics in the fields of psychology and anthropology.

Workshopping ideas for values-based assessment

Motivated to help spur change in academic assessment, in 2023 we ran a series of workshops, primarily for department chairs. We described these workshops in detail, including their structure and strategies, in a previous article [31]. We reproduce select and lightly modified portions of that text here under the terms of the Creative Commons Attribution (CC BY) license.

The workshops included two in-person sessions at the 2023 Council of Graduate Departments of Psychology (COGDOP) Annual Meeting in February; one in-person workshop at the Association for Psychological Science (APS) Annual Convention in May; and one online workshop as part of the American Anthropological Association (AAA) Department Leaders Summer Institute in June. We believe it was key to the success of these events that they were convened by leading professional societies because of the special role they play in identifying, articulating, and socializing appropriate norms within their disciplines.

The overall perspective of these workshops was inspired by our prior work on studying (e.g. [5, 32]) and re-aligning incentives [33], as well as the work of HuMetricsHSS, all of which seek to reframe evaluation processes and promote a values-based approach [27]. For all workshops, we implemented a mixture of short talks, guided discussions, and breakouts. We began with brief presentations to set the stage by highlighting research that shows problems with current academic incentives (i.e. ‘the system is broken’). Next, we placed the proposed reform work in a broader context by talking briefly about ongoing U.S. national and international reform efforts (i.e. ‘you are not alone’). Then, we showed an example of success from author MRD who, as Chair of the Department of Psychology at the University of Maryland, was able to shepherd reforms to their promotion and tenure guidelines in 2022 [34] (i.e. ‘you can do this’). Finally, we talked about disciplinary considerations, and the idea that departments can customize both the values they want to focus on and the behaviors or indicators to fit their needs (i.e. ‘enabling multiple pathways to success’). We then moved into breakouts, where participants discussed what each value meant to them, and how they might measure these values via behaviors.

For both guided discussions and breakouts, we took a subset of the proposed values from HuMetricsHSS, focusing on the ones deemed most relevant for both the discipline and, where applicable, the underlying theme of the conference or session we were running (more information in [31]). These values included accessibility, diversity, engagement, equity, inclusivity, openness, public good, quality, reproducibility, rigor, and transparency. From these workshops, we collected participants’ notes, as well as our own ideas and impressions from the discussions. We then used these to inform and build our larger framework.

Building the framework

To build our framework, we started with the values to be included, pulling from those we workshopped in the sessions described above, additional ones taken from the sorting cards in the HuMetricsHSS workshop kit [29] (e.g. leadership), and others that workshop participants said were important (e.g. mentorship). We ended up with 14 values in total, including: (1) accessibility, (2) advancing knowledge, (3) collaboration & partnership, (4) communication, (5) creativity & originality, (6) diversity & inclusivity, (7) engagement, (8) equity, (9) leadership, (10) mentorship, (11) openness & transparency, (12) public good, (13) quality & rigor, and (14) reproducibility & verifiability. We excluded some values like competition or collegiality because both our own work [35] and discussions with workshop participants indicated they were potentially problematic. Collegiality, for example, is often vaguely defined in promotion and tenure guidelines [35], and can be applied in ways that reinforce biases against women and faculty of color [36]. We also collapsed some values, like quality and rigor, because our discussions showed these to be largely conceptually overlapping. Further work could explore if it makes sense to split these, especially for certain disciplines or types of scholarly work. Also, this is not an exhaustive list of values. We encourage specific departments, disciplines, and institutions to work through whether these values are the ones that most resonate with them, removing some or adding others as fits their mission. The values included may also depend on the goals of the assessment, e.g. hiring versus tenure. In other words, this framework should not be viewed as set in stone, but rather as a starting point for others to modify and build upon. However, there were some general principles arising from the workshops (and other academic consultations) that guided us in fleshing out the framework.

Embrace the breadth of definitions and contexts for each value

As expected, something we saw in our workshops is that people can have very different ideas of what certain values mean or how they are embodied. As an example, when asked about accessibility, participants’ responses included concepts as wide-ranging as making educational resources accessible in different languages, public accessibility of research outputs like data, accessibility for those with physical disabilities through assistive technologies, and even access to opportunities through mentoring programs. Responses also varied depending on the context. For example, participants had different ideas about diversity and inclusivity when applied to research versus teaching. We see this variability as a feature, rather than a bug, and attempt to embrace it throughout the framework. To do this, we include framing questions (e.g. ’What types of accessibility are important?’ or ’What aspects of academic work should demonstrate quality?’), and also multiple activities or behaviors that could demonstrate each value. Thus, there is no one way of assessing a certain value, but rather a suite of indicators that academics can consider.

Move away from quantitative metrics and towards behavioral indicators

Throughout our workshop discussions, there was general agreement that commonly-used bibliometrics like Journal Impact Factor (JIF) or citation counts do not effectively capture values like engagement. In addition, we presented participants with some of the mounting evidence that bibliometrics like these are not only poor proxy measures of other values like quality [37, 38], but are also biased and potentially damaging [39–41], working in contrast to values like diversity and inclusivity. Thus, in the same vein as many existing reform initiatives, we designed our framework to discourage the use of such metrics. Instead, we focused on behavioral indicators by using action-oriented language, such as ’share’, ’use’, ’develop’, and ’participate’. For example, if we want to assess engagement, we can ask whether academics are developing participatory learning practices (engaging students), or leading citizen science projects (engaging the community). While behavioral indicators are not perfect, and may by their nature be more difficult to measure, they will ultimately be more nuanced and informative. Such an approach would also help to focus less on numbers of products (“performance as results”), and more on processes and outcomes of scholarly work (“performance as behaviors”) [33].

Leave the research, teaching, and service ’buckets’ behind

Many academic evaluation frameworks divide faculty activities into research, teaching, and service, and weigh each differently, with research highly valued and service often woefully undervalued [42, 43]. As we presented to participants, service falls disproportionately on the shoulders of women and faculty of color [44–46], which is both inequitable and a mission problem if institutions lose or drive out the people who embody values like diversity and inclusivity [47]. Participants confirmed this undervaluation of service work, and we encouraged them in the breakouts to think beyond the ’three-legged stool’ model, and instead explore how the fuller breadth of faculty activities contribute in different ways across multiple values. Admittedly, there are parts of the framework where it makes sense to break some behaviors into their research and teaching components (e.g. when looking at quality), but the focus is on the value and not on the ’buckets’ themselves. Furthermore, we do not use the concept of service, but rather incorporate such activities throughout the framework to emphasize how they can exemplify values like communication, engagement, mentorship, and public good. We see this as a way to both recover the real motivations behind much service work and also elevate its status in academic assessment.

Enable multiple pathways to success in academic evaluations

At first glance, our framework could seem overwhelming, with 14 values, multiple behaviors under each, and the potential to add more. However, a common theme articulated across the workshops was that certain values and behaviors may be more relevant for some disciplines than others. The idea is for departments, institutions, funders, or other evaluators to use this as a ’buffet menu’, pulling together unique combinations of values and behaviors that are relevant for their assessment needs, and thereby defining what success means for them. In addition, moving away from the research, teaching, and service buckets, and their uneven weighting, would facilitate recognition of other faculty profiles that do not depend primarily on research activity, but rather allow for multiple individual pathways to success. This could help departments specify the different profiles they are looking for, or even empower academics to develop their own unique profiles based on their interests or strengths. For example, one faculty member might demonstrate strong communication, transparency, and public good, whereas another might excel in accessibility, inclusivity, and mentorship. Recognizing more diverse contributions to the academic enterprise is key to several current incentive reform initiatives (e.g. CRediT Taxonomy [48] and CoARA [19]).

Putting the pieces together

We take the example of public good to illustrate in more detail how the framework is constructed (Table 1). In workshopping this value, a question discussed by participants was, ‘Who is the relevant public?’ There was recognition that answers can vary depending on institution type (e.g., land-grant, minority-serving), and where it is located or where faculty work is carried out (e.g. urban or rural areas, or low- and middle-income countries). Another question, exemplified by the range of participant responses, was, ‘What kinds of good are relevant?’ This could depend on the type of faculty work being evaluated, and the institution’s focus (e.g. research-intensive, teaching college). Thus, directly following the value, we pose two questions designed to prompt institutions to think through such considerations: (1) What population(s) does the academic institution serve?; and (2) Which aspects of academic work should benefit the public?

Potential answers to question two, guided also by light clustering of participant responses, gave us some broad categories to consider, i.e. education, outreach, and research (Table 1). In the next column, we list examples of faculty activities or behaviors that could fall under each, with some taken from participant responses, others from additional consultations, and some from our own experiences with evaluation. For example, if we consider education, sharing educational resources could be one way to generate public good, whereas for research, community-engaged projects could be another. Finally, the last column is more granular in terms of what an evaluator might look for. In community-engaged research, for example, the behavioral indicator is, “Academics ensure their research has relevance and engages or produces real solutions by involving community participants in co-design, data collection, translation of results, etc.” Who the relevant community participants are will depend on the answers to question one above.

How could the behavioral indicator for community-engaged research be measured? Consistent with our guiding principles, it will not lend itself well to quantitative metrics, but will instead require a suite of qualitative ‘measures’ that enable researchers to make visible those aspects of their work that have historically been invisible. Broadly, this could include self-report mechanisms like moving to a narrative [49, 50] or annotated [33] CV format that allows faculty to better document project participant roles, report how they engaged the community, and document the full scope of work products developed and publicly shared. To both complement and mitigate potential issues with self-reporting (e.g. ‘over-selling’, or inequities in the skills, time, and training necessary to craft compelling narratives [51]), evaluators could also look to other sources of information, such as whether community members were included as co-PIs or collaborators on faculty grants, or external letters from community collaborators. Importantly, we do not mean that every faculty member has to be involved in community-engaged research to be successful when assessed on public good. This may be highly relevant for some disciplines, and less so for others. The activities and indicators listed are also not the only ways that academics could think about public good. Again, our hope is that the framework is a starting point, and that the thinking outlined here will help others conceptualize how to add their own values, considerations, and behaviors.

Table 1: Excerpt from values-based framework

| Values | Considerations | Activities or behaviors | Indicators |
| --- | --- | --- | --- |
| Public good | What population(s) does the academic institution serve? Which aspects of academic work should benefit the public? | Education: sharing educational resources | Academics share educational materials, like class notes, authored textbooks, and videos, via public platforms where they can be freely accessed and used, e.g. by local schools. |
| | | Outreach: institution- or mission-aligned outreach | Academics participate in outreach activities, especially those that engage or inspire the populations they serve (e.g. as Minority-Serving Institutions or Land-grant universities) or that otherwise align with their institutional mission. |
| | | Research: applied research | Academics conduct research of significance to society, addressing problems of relevance to local, regional, or international communities. |
| | | Research: community-engaged research | Academics ensure their research has relevance and engages or produces real solutions by involving community participants in co-design, data collection, translation of results, etc. |
| | | Research: research sharing and communication with the public | Academics share their results with the public, communicating with understandable language that facilitates reuse of information. |

Our full framework is on Open Science Framework at https://doi.org/10.17605/OSF.IO/Z9XSR, and is also part of our larger Toolkit for Aligning Incentives at https://osf.io/na7dh/. It is openly licensed (CC BY) to allow others to remix and reuse. We also include a supplemental document (https://osf.io/n6jx2), which lists resources from other groups with more in-depth recommendations on how to assess elements of this framework, like mentorship or community-engaged scholarship.

Conclusions and future work

We have developed a detailed values-based framework that we hope will help academics see the many possibilities, and view academic incentive reform as a climbable hill. Rather than supplanting any existing reform initiatives, we have both built on and aim to complement ongoing efforts like DORA, CoARA, HuMetricsHSS, and others. In addition, we see this framework not as a static tool but one that can grow and change with community input. In particular, we envision a situation in which there may be multiple ‘working versions’ tailored and refined for different disciplines. To this end, our next steps include reaching out to professional societies as disciplinary leaders and standard-setting organizations in their respective fields. We believe an integrated approach – where societies offer field-specific context to inform which values are prioritized, and departments use both this guidance and their knowledge of specific institutional norms to customize guidelines and rubrics – will generate significant impact.

Acknowledgments

We thank all the participants of the 2023 AAA, APS, and COGDOP workshops for their time and insights, which led to the development of this framework. We also thank Chris Marcum for feedback on an earlier version of our framework.

Funding

This work was supported in part by a grant from the Templeton World Charity Foundation (Grant DOI: https://doi.org/10.54224/33356).

References

[1]   E.L. Boyer. Scholarship Reconsidered: Priorities of the Professoriate. The Carnegie Foundation for the Advancement of Teaching, 1990.

[2]   K. O’Meara. How Scholarship Reconsidered disrupted the promotion and tenure system. In E.L. Boyer, D. Moser, T.C. Ream, and J.M. Braxton, editors, Scholarship Reconsidered: Priorities of the Professoriate, Expanded Edition, pages 41–48. John Wiley & Sons, 2015.

[3]   D. Moser and T.C. Ream. Scholarship reconsidered: Past, present, and future. About Campus, 20(1):20–24, 2015. https://doi.org/10.1002/abc.21181.

[4]   S. Kerr. On the Folly of Rewarding A, While Hoping for B. Academy of Management Journal, 18(4):769–783, 1975. https://www.jstor.org/stable/255378.

[5]   J.P. Alperin, C. Muñoz Nieves, L.A. Schimanski, G.E. Fischman, M.T. Niles, and E.C. McKiernan. How significant are the public dimensions of faculty work in review, promotion and tenure documents? eLife, 8:e42254, 2019. https://doi.org/10.7554/eLife.42254.

[6]   N. Pontika, T. Klebel, A. Correia, H. Metzler, P. Knoth, and T. Ross-Hellauer. Indicators of research quality, quantity, openness, and responsibility in institutional review, promotion, and tenure policies across seven countries. Quantitative Science Studies, 3(4):888–911, 2022. https://doi.org/10.1162/qss_a_00224.

[7]   N. Agate, R. Kennison, S. Konkiel, C.P. Long, J. Rhody, S. Sacchi, and P. Weber. The transformative power of values-enacted scholarship. Humanities and Social Sciences Communications, 7(1):1–12, 2020. https://doi.org/10.1057/s41599-020-00647-z.

[8]   M.A. Edwards and S. Roy. Academic research in the 21st century: Maintaining scientific integrity in a climate of perverse incentives and hypercompetition. Environmental Engineering Science, 34(1):51–61, 2017. https://doi.org/10.1089/ees.2016.0223.

[9]   American Society for Cell Biology. San Francisco Declaration on Research Assessment (DORA), 2013. Available from https://sfdora.org/read/.

[10]   R. Schmidt, S. Curry, and A. Hatch. Creating SPACE to evolve academic assessment. eLife, 10:e70929, 2021. https://doi.org/10.7554/eLife.70929.

[11]   D. Hicks, P. Wouters, L. Waltman, S. De Rijcke, and I. Rafols. Bibliometrics: the Leiden Manifesto for research metrics. Nature, 520(7548):429–431, 2015. https://doi.org/10.1038/520429a.

[12]   European Commission, Directorate-General for Research and Innovation. Towards a reform of the research assessment system – Scoping report, 2021. https://data.europa.eu/doi/10.2777/707440.

[13]   D. Moher, L. Bouter, S. Kleinert, P. Glasziou, M.H. Sham, V. Barbour, A-M. Coriat, N. Foeger, and U. Dirnagl. The Hong Kong Principles for assessing researchers: Fostering research integrity. PLoS Biology, 18(7):e3000737, 2020. https://doi.org/10.1371/journal.pbio.3000737.

[14]   National Academies of Sciences, Engineering, and Medicine. Roundtable on Aligning Incentives for Open Scholarship. Available from https://www.nationalacademies.org/our-work/roundtable-on-aligning-incentives-for-open-science.

[15]   National Academies of Sciences, Engineering, and Medicine. Developing a toolkit for fostering open science practices: Proceedings of a Workshop. The National Academies Press, 2021. https://doi.org/10.17226/26308.

[16]   HELIOS Open. Higher Education Leadership Initiative for Open Scholarship. Available from https://www.heliosopen.org/.

[17]   HELIOS Open. Higher Education Leaders Convene to Explore Modernizing Hiring, Review, Promotion and Tenure to Explicitly Reward Open Scholarship. 2024. https://www.heliosopen.org/news/higher-education-leaders-convene-to-explore-modernizing-hiring-review-promotion-and-tenure-to-explicitly-reward-open-scholarship.

[18]   UNESCO. Recommendation on open science, 2021. https://doi.org/10.54677/MNMH8546.

[19]   CoARA.          Agreement    on    reforming    research                      assessment, 2022.                 Available    from https://coara.eu/app/uploads/2022/09/2022_07_19_rra_agreement_final.pdf.

[20]   DORA. Resource Library. Available from https://sfdora.org/resource-library/.

[21]   DORA. Reformscape. Available from https://sfdora.org/reformscape/.

[22]   S. Curry, S. De Rijcke, A. Hatch, D.G. Pillay, I. Van der Weijden, and J. Wilsdon. The changing role of funders in responsible research assessment: progress, obstacles and the way ahead. Research on Research Institute (RoRI) Working Paper, 2020. https://doi.org/10.6084/m9.figshare.13227914.v1.

[23]   B. Kramer and J. Bosman. Recognition and rewards in academia – recent trends in assessment. In M. Thunnissen and P. Boselie, editors, Talent Management in Higher Education, pages 55–75. Emerald Publishing Limited, 2024.

[24]   D. Moher, F. Naudet, I.A. Cristea, F. Miedema, J.P.A. Ioannidis, and S.N. Goodman.  Assessing scientists for hiring, promotion, and tenure. PLoS Biology, 16(3):e2004089, 2018. https://doi.org/10.1371/journal.pbio.2004089.

[25]   Science Europe. Recognising what we value: Recommendations on recognition systems, 2023. https://doi.org/10.5281/zenodo.7858100.

[26]   L. Himanen, E. Conte, M. Gauffriau, T. Strøm, B. Wolf, and E. Gadd.  The SCOPE framework – implementing the ideals of responsible research assessment. F1000Research, 12:1241, 2023. https://doi.org/10.12688/f1000research.140810.2.

[27]   N. Agate, C.P. Long, B. Russell, R. Kennison, P. Weber, S. Sacchi, J. Rhody, and B.T. Dill. Walking the talk: Toward a values-aligned academy. Humanities Commons, 2022. https://doi.org/10.17613/06sf-ad45.

[28]   HuMetricsHSS. Values Framework. Available from https://humetricshss.org/our-work/values/.

[29]   HuMetricsHSS. Workshop Kit. Available from https://humetricshss.org/your-work/workshop-kit/.

[30]   HuMetricsHSS. The Values Sorter. Available from https://humetricshss.org/values-sorter/.

[31]   C. Carter, M.R. Dougherty, E.C. McKiernan, and G. Tananbaum. Promoting values-based assessment in review, promotion, and tenure processes. Commonplace, Series 3.2 Recognition & Rewards, 2023. https://doi.org/10.21428/6ffd8432.9eadd603.

[32]   M.T. Niles, L.A. Schimanski, E.C. McKiernan, and J.P. Alperin. Why we publish where we do: Faculty publishing values and their relationship to review, promotion and tenure expectations. PLOS ONE, 15(3):e0228914, 2020. https://doi.org/10.1371/journal.pone.0228914.

[33]   M.R. Dougherty, L.R. Slevc, and J.A. Grand. Making research evaluation more transparent: Aligning research philosophy, institutional values, and reporting. Perspectives on Psychological Science, 14(3):361–375, 2019. https://doi.org/10.1177/1745691618810693.

[34]   University of Maryland. Department of Psychology. Department Policies and Initiatives, 2022. Available from https://psyc.umd.edu/about-us/department-policies-and-initiatives.

[35]   D. Dawson, E. Morales, E.C. McKiernan, L.A. Schimanski, M.T. Niles, and J.P. Alperin. The role of collegiality in academic review, promotion, and tenure. PLOS ONE, 17(4):e0265506, 2022. https://doi.org/10.1371/journal.pone.0265506.

[36]   L.W.M. Ward, L.M. Cate, and K.S. Ford. Culture of hegemonic collegiality: Pre-tenure women faculty experiences with the “fourth bucket”. The Review of Higher Education, 47(2):217–243, 2023. https://doi.org/10.1353/rhe.2024.a914961.

[37]   B. Brembs, K. Button, and M. Munafò.  Deep impact: unintended consequences of journal rank. Frontiers in Human Neuroscience, 7:45406, 2013. https://doi.org/10.3389/fnhum.2013.00291.

[38]   M.R. Dougherty and Z. Horne. Citation counts and journal impact factors do not capture some indicators of research quality in the behavioural and brain sciences. Royal Society Open Science, 9(8):220334, 2022. https://doi.org/10.1098/rsos.220334.

[39]   American Psychological Association, APA Task Force on Inequities in Academic Tenure and Promotion. APA Task Force Report on Promotion, Tenure, and Retention of Faculty of Color in Psychology, 2023. Available from https://www.apa.org/pubs/reports/inequities-academic-tenure-promotion.pdf.

[40]   E.G. Teich, J.Z. Kim, C.W. Lynn, S.C. Simon, A.A. Klishin, K.P. Szymula, P. Srivastava, L.C. Bassett, P. Zurn, J.D. Dworkin, and D.S. Bassett. Citation inequity and gendered citation practices in contemporary physics. Nature Physics, 18(10):1161–1170, 2022. https://doi.org/10.1038/s41567-022-01770-1.

[41]   C.A. Chapman, J.C. Bicca-Marques, S. Calvignac-Spencer, P. Fan, P.J. Fashing, J. Gogarten, S. Guo, C.A. Hemingway, F. Leendertz, B. Li, I. Matsuda, R. Hou, J.C. Serio-Silva, and N.C. Stenseth. Games academics play and their consequences: How authorship, h-index and journal impact factors are shaping the future of academia. Proceedings of the Royal Society B, 286(1916):20192047, 2019. https://doi.org/10.1098/rspb.2019.2047.

[42]   L.A. Schimanski and J.P. Alperin. The evaluation of scholarship in academic promotion and tenure processes: Past, present, and future. F1000Research, 7:1605, 2018. https://doi.org/10.12688/f1000research.16493.1.

[43]   D. Harley, S.K. Acord, S. Earl-Novell, S. Lawrence, and C.J. King. Assessing the future landscape of scholarly communication: An exploration of faculty values and needs in seven disciplines. UC Berkeley: Center for Studies in Higher Education, 2010. https://escholarship.org/uc/item/15x7385g.

[44]   C.R. Domingo, N.C. Gerber, D. Harris, L. Mamo, S.G. Pasion, R.D. Rebanal, and S.V. Rosser. More service or more advancement: Institutional barriers to academic success for women and women of color faculty at a large public comprehensive minority-serving state university. Journal of Diversity in Higher Education, 15(3):365, 2022. https://doi.org/10.1037/dhe0000292.

[45]   C.M. Guarino and V.M.H. Borden. Faculty service loads and gender: Are women taking care of the academic family? Research in Higher Education, 58:672–694, 2017. https://doi.org/10.1007/s11162-017-9454-2.

[46]   L.K. Hanasono, E.M. Broido, M.M. Yacobucci, K.V. Root, S. Peña, and D.A. O’Neil. Secret service: Revealing gender biases in the visibility and value of faculty service. Journal of Diversity in Higher Education, 12(1):85, 2019. https://doi.org/10.1037/dhe0000081.

[47]   M.F. Jimenez, T.M. Laverty, S.P. Bombaci, K. Wilkins, D.E. Bennett, and L. Pejchar. Underrepresented faculty play a disproportionate role in advancing diversity and inclusion. Nature Ecology & Evolution, 3(7):1030–1033, 2019. https://doi.org/10.1038/s41559-019-0911-5.

[48]   L. Allen, A. O’Connell, and V. Kiermer. How can we ensure visibility and diversity in research contributions? How the Contributor Role Taxonomy (CRediT) is helping the shift from authorship to contributorship. Learned Publishing, 32(1), 2019. https://doi.org/10.1002/leap.1210.

[49]   F. Bordignon, L. Chaignon, and D. Egret. Promoting narrative CVs to improve research evaluation? A review of opinion pieces and experiments. Research Evaluation, 32(2):313–320, 2023. https://doi.org/10.1093/reseval/rvad013.

[50]   M. Pietilä, J. Kekäle, and K. Rintamäki. Broadening the conception of ‘what counts’ – example of a narrative CV in a university alliance. In 27th International Conference on Science, Technology and Innovation Indicators (STI 2023). International Conference on Science, Technology and Innovation Indicators, 2023. https://doi.org/10.55835/644192caf38f9678c0feaff0.

[51]   W. Kaltenbrunner, T. Haven, A. Algra, R. Akse, F. Arici, Z. Bakk, J. Bandola, T.T. Chan, R. Costas Comesana, C. Coopmans, A. Csiszar, C. de Bordes, J. Dudek, M. Etheridge, K. Gossink-Melenhorst, J. Hamann, B. Hammarfelt, M. Hoffmann, Z. Ipema, S. de Rijcke, A. Rushforth, S. Sapcariu, L. Simmonds, M. Strinzel, C. Tatum, I. van der Weijden, and P. Wouters. Narrative CVs: A new challenge and research agenda. Leiden Madtrics, 2023. Available from https://www.leidenmadtrics.nl/articles/narrative-cvs-a-new-challenge-and-research-agenda.

Editors

Kathryn Zeiler
Editor-in-Chief

Kathryn Zeiler
Handling Editor

Editorial Assessment

by Kathryn Zeiler

DOI: 10.70744/MetaROR.43.1.ea

The authors present a novel method—developed through a series of workshops—for assessing academics. The proposed reform aims to move evaluation away from the traditional focus on scholarship, teaching, and service and toward a more nuanced and flexible set of values. Two scholars reviewed the article. They note that the authors’ summary of their findings is comprehensive and that adding concreteness to the widely accepted but abstract reform idea is useful. The reviewers also offered suggestions for improvement. Both reviewers suggested adding details related to the methodology of moving from the workshop data to the proposal. The reviewers also agreed with the authors that implementing the framework might seem daunting and suggested ways to help potential adopters handle the complexities. The reviewers and I agree that the article is a valuable contribution to the literature related to assessing academics.

Competing interests: None.

Peer Review 1

Louise Bezuidenhout

DOI: 10.70744/MetaROR.43.1.rv1

I enjoyed reading this paper and seeing the further development of the framework for values-based assessment in academia. Overall, I feel that the authors provide a comprehensive overview of their workshop approach and findings. Minor suggestions would be to include a little more information about the workshop structure and strategies within the text rather than relying entirely on reference 31. For example, instead of asking the reader to locate details regarding the methodology in a different paper, I would suggest either adding an appendix with more detailed methodology or adding some broad demographic information and a bit more about recruitment of participants. I would also be interested in how differences of opinion and consensus building were navigated within the workshops. It is also possible that some kind of visual of the workshop structure might be useful for readers, including a bit more detail as to the topics discussed and how the audience was engaged.

In the suggestions for further use of the framework, it would be useful to have some reflection on what is needed for institutions and departments to make use of it. As the authors point out, “our framework could seem overwhelming, with 14 values, multiple behaviors under each, and the potential to add more”. Providing more reflection on this would be helpful, particularly around instigating value-focused discussions and mediating consensus building.

In relation to the future work, I would be very interested in whether the authors were expanding their work to include data stewards and other key support staff communities. In the Netherlands there is a strong push to decrease the distinction between academic and academic support staff. In particular, this is in recognition of their expertise and the key roles that they play in successful research – and Open Science. I think that it would be useful to include perspectives from these communities as well in further iterations of the framework.

Competing interests: None.

Peer Review 2

Ruth Schmidt

DOI: 10.70744/MetaROR.43.1.rv2

The argument presented by McKiernan et al. reminds us that while research assessment reform is increasingly the subject of discussion and proposals for change, concreteness around how to convert this widespread agreement into action remains somewhat lacking. Developed with the input of a series of workshops, the authors present a values-based framework with the hope that it will help institutions and individuals across diverse disciplines gain ground and move the needle toward actionable change.

This paper offers a valuable perspective and well-resourced recommendations that pull from both published works and on-the-ground insights to address a recognized challenge; as such, the suggestions below are largely focused on potential challenges to adoption, in the interest of driving uptake and to increase the chances the framework can generate individual and institutional benefit.

Purely from a usability standpoint, 14 values is a lot to process (as the authors themselves recognize). While the article makes it clear that there is no expectation to use all 14, with encouragement for “specific departments, disciplines, and institutions to work through whether these values are the ones that resonate most with them,” the sheer number does risk overwhelming potential users from the jump or inadvertently scaring away folks who may feel paralyzed by the need to winnow them down. This, in combination with facing a wholly new system that is intentionally designed to reduce the security blanket of research/teaching/service, may feel like too many changes at once, which subsequently risks reducing potential uptake. Given that one common refrain from those starting on the research assessment reform journey is that simply figuring out how to start can be a challenge, there may therefore be a strong benefit to providing conceptual on-ramps to make this framework as approachable as possible. This could take a range of forms—e.g., via clustering values into higher-level categories; employing structures or prompts to assist with processing options or selecting a place to start; proposing potential prioritization strategies—that would neither limit the content nor enforce a strict regimen, but which might make initial entry less daunting. Note that this is not a suggestion to reduce the set, as all seem useful and relevant, but simply to provide some strategies to make the task feel more initially manageable in order to overcome early potential barriers to adoption.

One known and recurrent challenge with more qualitative assessments can be that they are often seen as more time-consuming than using quantitative measures, which tend to be much easier to scan, digest, and compare (see e.g., Ma, L. (2021) ‘Metrics as Time-Saving Devices’, in: F. Vostal (ed.) Inquiring into Academic Timescapes. pp:123-133. Emerald Publishing Limited. https://doi.org/10.1108/978-1-78973-911-420211011, and Rushforth, A. & De Rijcke, S. (2024) Practicing responsible research assessment: Qualitative study of faculty hiring, promotion, and tenure assessments in the United States, Research Evaluation 33, rvae007, https://doi.org/10.1093/reseval/rvae007). A second—perhaps less articulated, but equally critical—issue is that reviewers may not always feel equipped or trained to assess more qualitative outputs. This makes the inclusion of example activities and indicators extremely valuable in their ability to provide a useful on-ramp to assessment activities. At the same time, the authors note that values can be interpreted broadly; this is both a blessing (in that they can accommodate a wide potential range of instances) and a curse (in that it may be more difficult for new users to feel confident in how they are being applied), especially as the indicators essentially read as slightly more detailed versions of the activities or behaviors. While I recognize this is intentional, to support a wider variety of potential use cases, it may be useful to explicitly prompt potential users to consider how moving beyond the general can supply an extra layer of specificity appropriate to the case at hand, which in turn can help concretize what ‘good’ looks like in their specific instances. This may be especially useful or important in fields where value may be more qualitative or more difficult to capture and in situations like tenure processes, which rely on communicating or translating accomplishments across disciplines.

The authors make it clear that this approach should not be seen just as an alternative set of measures for what has been used historically, but as an opportunity to interrogate the notion that there is a single way to signal or demonstrate scholarly success, and further to reinforce that there is not one correct pathway or model for building an academic career. This strikes me as an important point, and one that perhaps deserves more attention. It is well recognized, for example, that the traditional hierarchy often used in academic assessment (i.e., research > teaching > service) creates a perverse set of incentives, where some activities are rewarded and legitimized more than others. Despite the fact that the paper mentions that responses during the working sessions were open-minded about contextualizing these values across different scholarly activities, the longevity of that mental model may make it hard to dislodge, and may result in individuals using the proposed framework in such a way that values seen as being more research-aligned are prioritized, or used in ways that continue to promote research above teaching (e.g., mentoring), and so on. This suggests the value of reiterating how the framework’s values can show up across the traditional triad of research/teaching/service for those who might be new to the idea.

Secondly, the valuable insight that different disciplines will likely reflect these values in different ways suggests that the framework might benefit from further consideration regarding how this proposed approach might play across different career stages (e.g., early career vs. advanced professionals) or career paths (e.g., alternatives to the traditional arc that presumes moving in a linear, unbroken progression from undergrad to graduate to post-graduate academic positions). With regard to career stages, while many of the values included in this model are surely important at any point along a career trajectory, the ways in which they manifest may vary quite a bit (most specifically, for example, in cases like leadership and mentorship that may naturally take on a different tenor as one advances; qualities such as “collaboration and partnership” or criteria for “advancing knowledge” may also look quite different with increased seniority). This suggests there is potential value in proposing that institutions consider how each dimension might take on natural progressions in behaviors and indicators.

Further, it might be interesting to consider how the values themselves might provide an inspirational frame or structure that helps academics—perhaps especially early in their career—see how focusing on constellations of values can help them envision and carve out a scholarly identity.  This suggestion is prompted in part by the notion of trying on different ‘shapes’ of academic identities that appears in Building Blocks for Impact (https://doi.org/10.5281/zenodo.7249187), which took a different tack toward expanding what matters with regard to scholarship but was equally interested in helping scholars envision centers of gravity and trajectories that were not solely grounded in traditional milestones such as tenure. [In full disclosure, I worked on this model as a part of a DORA-led, grant-funded effort called Project TARA; this point is offered not as a bid for a citation or as self-promotion but because that model grew from similar motivations as the ideas presented in this paper].

Recognizing movement from or into careers outside of academia or developed in practice-based settings may also help the framework encompass the reality of non-traditional career arcs (e.g., moving to or from industry). Given that the framework is already quite substantial, this is not a suggestion to add more layers of content; rather—as with many of the suggestions above—this might entail supplementing the framework with guidance to prompt institutions or committees to consider a different kind of scholarly diversity. Along similar lines, there might be a benefit in recognizing how career variability does not change the values themselves, but that it might impact how those values play out. For example, moving back and forth from industry (or other non-academic arenas) to academia might offer new types of relationships and opportunities but might also limit open sharing to some degree if one is constrained by institutional requirements such as non-disclosure agreements (NDAs).

Finally, it might be worth reflecting on potential ways to capture feedback about the framework’s use or cases that describe how the framework was employed or implemented. While that data collection effort obviously expands beyond the framework itself, the authors’ position that the model is a starting point suggests that proposing ways to learn from collective use will not only help provide guidance for others but also ensure that the framework and overall approach gets better and more robust over time.

Competing interests: None.
