
Reading Robert K Merton: Evaluation in Science

date: 2013-10-28 11:09

Some notes on reading “Institutionalized Patterns of Evaluation in Science” in The Sociology of Science: Theoretical and Empirical Investigations 1) by Robert K. Merton.

Note: The referee's role in the institution of science:

The referee is thus an example of status judges who are charged with evaluating the role-performance in a social system (p. 460).

And the relationship between status judges and social control:

Status judges are integral to any system of social control through their evaluation of role-performance and their allocation of rewards for that performance (p. 460).

Other status judges or judgments include:

  • book reviewers
  • review articles

Reflection: Would blogs about scientific reports and the like function in this way today? Perhaps this is what altmetrics and similar proposals suggest. What they need is research showing that this is the case, that alternate publishing can function in this way.

And, quoting footnote 1:

1. J. M. Ziman, Public Knowledge: The Social Dimension of Science (Cambridge: At the University Press, 1966), p. 148

Merton writes:

“The referee is the lynchpin about which the whole business of Science is pivoted” (p. 461).

So much power in this role. This is why there is so much at stake and why the debate over peer review can become political.

And then Merton writes about what referees are tasked to do:

The chemist, Professor Leonard K. Nash, describes the “editors and referees of scientific journals” as “the main defenders of scientific 'good taste'” [emphasis added] (p. 461).

Here he is quoting from:

2. Leonard K. Nash, The Nature of the Natural Sciences (Boston: Little, Brown, and Co., 1963), p. 305.

Note: Merton gives himself four tasks in this piece:

  1. Survey history of peer review
  2. Explore implications of rates of rejection
  3. Analyze peer review decision making
  4. Discuss the significance of the issue

Historical overview

Note: The early organizations:

These organizations provided the structure of authority which transformed the mere printing of scientific work into its publication (p. 462).

Note the difference (and Merton's emphases) between printing and publishing. Of course, the digital version of printing today would be blogging.

This is also centered around a “structure of authority” of organizations. Some of the recent scholarly communication trends seem to suggest crowd sourcing this authority. It's certainly a more democratic notion, but is it better?

Note: The origin of peer review stems from the Royal Society's desire to control what it published in its new journal, Philosophical Transactions.

What is perhaps most significant here is that the council, as sponsor of the Transactions, was involved with its fate and wanted to have a measure of control over its contents. These adaptive decisions provided the basis for the referee system (p. 463).

Authority thus begins with a governing play.

Note: See footnote 6 on bottom of page 463 for more historical reading. I.e.: J. R. Porter.

Note: The role of printing:

With the advent of printing, however, findings could be permanently secured, errors in the transmission of precise knowledge greatly reduced, and intellectual property rights registered in print. Printing thus provided a technological basis for the emergence of that component of the ethos of science which has been described as “communism”: the norm which prescribes the open communication of findings to other scientists and correlatively proscribing secrecy (p. 464).

Does digital printing provide a technological basis for the emergence of a new ethos? Or simply the amplification of the communism component of the longstanding ethos?

Merton continues:

But it appears that this norm did not fully develop in response to the new technology of printing; ancillary institutional inventions served to facilitate the shift from motivated secrecy to motivated public disclosure (p. 464).

Reflection: I see that Merton cites Elizabeth Eisenstein's work in the history of printing. When I submitted my dissertation for review, it was suggested that my citing of Eisenstein was a bit of a stretch (similar topic to this piece). Perhaps I didn't make the connection well enough in my work, but Merton's use of Eisenstein does validate my use of her in a similar vein.

Device 1

One such invention that incents public disclosure, Merton notes, is the “institutional device” of time stamping received manuscripts in order to show “priority of discovery”:

The Society would officially establish priority of discovery by recording the date on which communications were first received (p. 464).

But today there seems to be an additional way to time stamp a communication: blogging. As I've noted earlier, though, without organizational backup and with the technological shiftiness inherent in a moving web, this may not be a very trustworthy device.

That is, in a digital landscape, dating can be fudged. Third party action is necessary in order to validate the dating. Also, the action has a goal: “to facilitate the shift …” (p. 464) that is related to the communism component of the ethos of science.
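
To make the third-party point concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Registry` class, its method names, and the dates are my own illustration, not anything Merton describes and not an existing service. The idea is only that the receiving body, not the author, records the date, and that it keeps a fingerprint of the text so that later edits are detectable.

```python
import hashlib
from typing import Optional


def fingerprint(text: str) -> str:
    """Return a SHA-256 digest of the text; any later edit changes it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Registry:
    """Hypothetical third party that records when it first received a text."""

    def __init__(self) -> None:
        # Maps a text's fingerprint to the date of first receipt.
        self._first_received = {}

    def deposit(self, text: str, received_on: str) -> str:
        """Record a receipt date (set by the registry, not the author).

        Only the first receipt counts, so a later deposit cannot move the date.
        Returns the date on record for this exact text.
        """
        return self._first_received.setdefault(fingerprint(text), received_on)

    def received_on(self, text: str) -> Optional[str]:
        """Return the recorded date, or None if this exact text was never deposited."""
        return self._first_received.get(fingerprint(text))


registry = Registry()
registry.deposit("my finding", "2013-10-28")
registry.deposit("my finding", "2010-01-01")  # attempt to backdate: ignored

print(registry.received_on("my finding"))          # the first receipt date
print(registry.received_on("my finding, edited"))  # altered text has no record
```

A blog post's own date field offers neither guarantee: the author controls both the text and the date.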

Device 2

Note: The second institutional device used to facilitate communism is “prompt printing” (p. 465). This would help protect against “'philosophical robbery'” (p. 465).

Device 3

Note: The third is the role of the journal to archive science:

Boyle did report, however, another motive for contributing to the newly invented journal. Almost in so many words, he saw this as a way for the scientist to have his work permanently secured in the archives of science… (p. 466).

And the end of same paragraph:

In any case, we find in Boyle's remarks an early intimation of the scientific journal as a scientific archive.

The journal, it should be noted, is not just for communicating to present day peers but also to a future audience. (This has relevance to my LQ Reading Study.)

But also, the need to refer to the past is an important criterion for any use and assessment of social media to communicate science. Not only should that past be safeguarded (and unchanging) but it must also be accessible. Today, social media and digital publishing/printing do not guarantee this kind of archiving (although the Internet Archive does provide some protection).

Device 4

Note: Property rights. And there it is.

Another motive could be harnessed to the developing innovation of a scientific journal. Property rights in discovery were sought after by scientists primarily as individuals but occasionally also as nationals (p. 466).

This is an important consideration today.

Note: And these *devices*, and others, work together to not only create scientific communication but to create science itself.

Together these and kindred institutional devices, the new scientific society and the new scientific journal persuaded men of science to replace their attachment to secrecy and limited forms of communication with a willingness to disclose their newly found knowledge. But institutionalization is more than a matter of changing values; it also involves their incorporation into authoritatively defined roles (p. 467).

But it is these devices that are under the microscope today. What does that mean for the future of science when the future of the communication of science will be altogether different? The journal was the channel for this to take place, and an important component of that was that there was an authority behind the journal. What might that authority be in the future?

And so:

They repeatedly express an awareness that to retain the confidence of scientists they must arrange for the critical sifting of materials which in effect carry the imprimatur of the Society (p. 468).


In the course of establishing its legitimacy as an authoritative scientific body, the Royal Society was gradually developing both norms and social arrangements for authenticating the substance of scientific work (p. 468).

Perhaps, then, what is being examined or challenged today is not the norms but the social arrangements of authentication?

The legitimacy of a journal's authority should include its faults.

From the beginning, interestingly, branding was an essential part of the authoritative role of effective and good science. This highlights an underlying incentive to ensure a quality brand. Thus, it seems to me that any so-called impact metric (either of the journal or the author) should be less a function of a count of citations or even mentions and more a function of the costs incurred as a result of bad scientific behavior. For example, retracting a paper should act as a cost to the journal for allowing the paper to be published. This requires a lot more thought, but the actual cost of the journal, either in the toll sense or the gold sense with the APC, should take that hit too. The goal here: what's a fair price?

In essence, journals have an incentive to protect their brand (their imprimatur), so any calculation of impact should include a hit to that value (a cost). The cost should be “heavy,” of course, but not game-able. This is a very unsophisticated first line of thought, but something along these lines should provide a greater incentive to maintain scientific rigor.

Note: The editor:

Through the emergence of the role of editor and the incipient arrangements for having manuscripts assessed by others in addition to the editor, the journal gave a more institutionalized form for the application of standards of scientific work (p. 469).

Implications of rate of rejection

An important reason Merton notes for rejection – it's not simply quality, but shared knowledge of what constitutes good science within the norms of a field of study:

This suggests that these fields of learning are not greatly institutionalized in the reasonably precise sense that editors and referees on the one side and would-be contributors on the other almost always share norms of what constitutes adequate scholarship (p. 472).

For example, PeerJ's editorial criteria 2) (“soundness of the science, not its importance”) are an attempt to be explicit about what “constitutes adequate scholarship.” That is why those stated criteria are so important: they are an attempt, for good or ill, to influence what scholarship should look like.

It should also be noted that, following Merton's line of reasoning, high rejection rates will also signify a disequilibrium between what editors believe is adequate scholarship and what contributors believe is adequate scholarship. If this were to happen in a publication platform such as PeerJ, it would be easier to identify this as a disequilibrium because PeerJ does not have the space limitations of a print journal or of an online journal that operates like a print journal. Having multiple editorships as part of its process is also an essential function for this to be so.

Also: a high rejection rate is an essential characteristic of a good quality print-mode journal because the rejection rate will be a function of space in the journal. This will hold true for any digital issue of a journal, even if the issue is completely online, if it continues to operate in print-mode (e.g., issue based).

An acceptance/rejection rate will also be a function of how many manuscripts are submitted. The more submissions, the higher the rejection rate, given the space limitations of a print or print-mode online journal.
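
The arithmetic behind this can be sketched in a few lines of Python. This is my own simplification, not Merton's model: it assumes space is the only constraint, so a journal accepts as many papers as it has room for and rejects the rest. The function name and the numbers are made up for illustration.

```python
def rejection_rate(submissions: int, slots: int) -> float:
    """Share of submissions rejected when at most `slots` papers fit in the journal.

    Assumes (for illustration only) that space is the sole reason for rejection.
    """
    if submissions <= 0:
        raise ValueError("submissions must be positive")
    if slots < 0:
        raise ValueError("slots cannot be negative")
    accepted = min(submissions, slots)
    return 1 - accepted / submissions


print(rejection_rate(100, 50))  # 0.5
print(rejection_rate(200, 50))  # 0.75
print(rejection_rate(40, 50))   # 0.0 (room for everything submitted)
```

With space fixed at 50 papers, doubling submissions from 100 to 200 pushes the rejection rate from 50% to 75% without any change in editorial standards.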

In light of this, it should be noted that PeerJ is aiming for an acceptance rate of around 70% 3).

Note: Kinds of peer review errors:

There are intimations in the data also that the editors and referees of journals with markedly different rates of rejection tend to adopt different decision-rules and so are subject, when errors of judgment occur, to different kinds of error (p. 473).

Here's his summary of the difference (bottom of paragraph):

Put in terms reminiscent of another institutional sphere, the decision-rule in high-rejection journals seems to be: when in doubt, reject; in low-rejection journals, when in doubt, accept (p. 473).

And this is because, for low-rejection journals (middle of paragraph):

The editorial staff of low-rejection journals, where external evidence suggests that the decisions of scientists to submit papers are based on standards widely shared in the field, apparently prefer to risk errors, if errors there must be, of the second kind: occasionally to publish papers that do not measure up rather than to overlook work that may turn out to be original and significant (p. 473).

Note: Merton takes note of print space and scarcity as they relate to cost on pages 474-475.

Peer review decision making

Note: In an interesting comment about variability, author count per article, and acceptance rate in a study Merton is about to describe:

Substantively, it turns out that papers with more than one author, largely reporting experimental results, have so high an acceptance rate (over 95 percent) that they can exhibit little variability in evaluations of the kind we want to investigate (p. 478).

Then he writes:

Procedurally, it is the case that the rank of the single author can be unambiguously and realistically identified. But not so in the case of papers by several hands, with their varying numbers of authors, often of differing rank (p. 478).

Note: Beginning to investigate why there is a disequilibrium in publication output between eminent and non-eminent scientists:

How does it happen that physicists from the minor departments who are submitting almost as many single-author papers as their counterparts in the major departments end up by having significantly fewer of them published? The question is critical because the gross empirical finding lends itself to sharply different kinds of interpretation (p. 480).

One possible reason:

Related to the stratification system. Eminent scientists get more leeway because of their reputation.

Second possible reason:

Eminent scientists produce better manuscripts because they do better science.

Neither reason excludes the other. That is, both may explain the varying publication success between the two stratification levels.

Note: Models of referee allocation:

  • “'oligarchical'” model
  • “'populist'” model
  • “'egalitarian'” model
  • “expertise” model (p. 484)

And then, based on his study:

This suggests, although it does not demonstrate, that expertise and competence were the principal criteria adopted in matching papers and referees (p. 485).

Note: Relationship between author and reviewer:

Perhaps it is not the status of the author as such but his status relative to that of the referee which systematically influences appraisals of his manuscripts. Such biases in judgement might take various forms, depending on the pattern of relative status (p. 489).

Types of relationships:

  • “When referees and authors are status peers, an hypothesis of status solidarity would have it that referees typically give preferential treatment to manuscripts, as a …
  • “counter-hypothesis of *status competition* would have it that under the safeguard of anonymity, referees tend to undercut their rivals by unjustifiably severe judgements” (p. 489).
  • “When the authors outrank referees, an hypothesis of *status deference* would hold that the referees give preferential treatment to the work of physicists they respect or hold in awe …
  • “just as a counter-example of *status envy* would have them be more exacting of the work of superiors” (pp. 489-490).
  • “And when referees outrank authors, an hypothesis of status patronage or sponsorship would maintain that referees are unduly kind and undemanding …
  • “while a counter-hypothesis of status subordination would have them overly demanding” (p. 490).

And then:

Differing in other respects, these six hypotheses are alike in one: they all assume that the relative status of referee and author significantly biases judgements by referees, either in favor of the author or at his expense. More concretely, all assume that the rates of acceptance for each stratum of authors will differ according to the rank of the referees making the judgements (p. 490).

All six hypotheses were rejected based on his study of Physical Review (just that one journal).


Note: Such optimism then, but it seems like everyone is claiming the sky is falling today:

With all its imperfections, old and new, the developing institution of the referee system provides for a warranted faith that what appears in the archives of science can generally be relied upon (p. 494).

Here's a key point: the reader is the ultimate judge, but not just any reader, a scientist (or any scholar, in particular a scholar who publishes and thus cites):

When a scientist is working on a problem treated in a published article, he can serve as his own referee. He may, in fact, be better qualified to assess its worth than the official referee who helped usher it into print (p. 494).

He then writes, and this is extremely important:

It is not so much the fellow specialist as the others making use of published results in fields tangential to their own who particularly depend on the referee system (p. 494).
  • categories:
    • lit review
    • peer review
    • peerj
    • research
    • merton
blog/reading-robert-k-merton-evaluation-in-science.txt · Last modified: 2017/03/07 09:07 by seanburns