David C Berliner, Regents’ Professor of Education Emeritus, Arizona State University, USA

For many teachers, research in education has a bad name. There are at least two reasons for this. Firstly, it doesn’t replicate well from school site to school site. Site variation is inevitably enormous, and thus research findings are affected by factors such as the experience level of teachers, the school’s demographics and the quality of leadership at the site. A research finding of potential import based on work at a particular school site may show smaller effects, or not work at all, at another school. Secondly, research in education is hard for teachers and administrators to understand, filled as it often is with measurement formulas and statistics.

This arcane language of researchers can render the useful and interesting parts of a research study indecipherable. In journals, researchers frequently report ‘significant findings’. But the term ‘significant’ relates specifically to the statistics used. It means that, under the conditions in which a study was run and for a given sample size, the findings favouring one group or one technique over another are unlikely to be due to chance. But the actual difference between an experimental group and a control group on an achievement test, for example, may have been quite small. Thus, when trying out the research in the classroom – say, teaching fractions in a certain way – the expected effects may not be easily noticed in a class of 25, although they appeared ‘statistically significant’ in a study with 250 students.
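The dependence of ‘significance’ on sample size can be illustrated with a rough sketch. The effect size of 0.3 standard deviations, the even split into two groups and the use of a normal approximation (rather than a t-test) are all assumptions made for illustration; none of these numbers comes from a particular study.

```python
import math


def two_sided_p_value(effect_size: float, n_group_a: int, n_group_b: int) -> float:
    """Approximate two-sided p-value for a difference between two group means.

    effect_size is the difference expressed in standard deviation units.
    A normal approximation is used here for simplicity (an assumption);
    a real analysis would usually rely on a t-test.
    """
    standard_error = math.sqrt(1 / n_group_a + 1 / n_group_b)
    z = effect_size / standard_error
    # erfc gives the two-tailed area under the standard normal curve.
    return math.erfc(abs(z) / math.sqrt(2))


ASSUMED_EFFECT = 0.3  # hypothetical small effect, in standard deviations

# A study with 250 students, split into two groups of 125.
p_study = two_sided_p_value(ASSUMED_EFFECT, 125, 125)

# A single class of 25, split into groups of 13 and 12.
p_class = two_sided_p_value(ASSUMED_EFFECT, 13, 12)

print(f"250 students: p = {p_study:.3f}")
print(f" 25 students: p = {p_class:.3f}")
```

Under these assumptions the same underlying effect falls below the conventional 0.05 threshold in the larger sample but not in the single class, which is one reason a published result may be hard to see in one classroom.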

But these are not reasons to ignore educational research. They are simply reasons not to expect automatic replications or large effects when research studies are decoded and put into practice. The results and application of research findings from journals need to be filtered through teachers’ and administrators’ experience and their understanding of their students, community and curriculum.

Designing a programme to support teachers’ use of research evidence

We have sought to design professional development activities that use research that might change classroom practice and might affect student outcomes in positive ways. And even if our professional development programme did not affect practice or educational outcomes, it might still be worth teachers’ engagement in these activities for the sense of competence and community that our programme develops.

In designing our programme, we took seriously William James’s insightful statement (James, 1983, p. 15):

“you make a great, a very great mistake, if you think that psychology, being the science of the mind’s laws, is something from which you can deduce definite programmes and schemes and methods of instruction for immediate school-room use. Psychology is a science, and teaching is an art; and sciences never generate arts directly out of themselves. An intermediate inventive mind must make that application, by using its originality.”

We wanted to help teachers to be these ‘intermediate inventive minds’ more frequently. We also thought about Dewey’s concern for teachers. He too was sceptical of the direct connection between research and practice. He noted in a major speech (1900) that those, like himself, who were promoting a scientific approach to education needed to recognise that teachers lived in a concrete social world, and were clearly not inhabitants of a scientific world, dependent as it often is on abstraction. Under such conditions, different world views about how to solve problems should be expected.

We think we found a way to honour the thoughts of both James and Dewey and promote those ‘intermediate inventive minds’. Our method requires three things: a small group of teachers who want to study a problem, an interpreter of research findings, and a place at which to meet.

We started interested teachers off with an introduction to the class size controversy – every teacher’s concern – and helped them to understand the researchers’ coded language (Casanova et al., 1991). Simultaneously, we asked teachers at a school site about what else concerned them. From their concerns we picked topics for which we thought there was a research base. We would then search for articles addressing the issues they thought important, picking, say, 10–15 such articles. Then we would reprint each article on the left side of a wide page, and ask questions or make comments about it in the additional space on the right. Thus, research articles were annotated so that they could be read and understood by teachers. For example, if teachers wanted to know about cheating in high-stakes testing environments, we would put together a collection of well-designed research papers on that subject. One of those research papers had this paragraph (Amrein-Beardsley et al., 2010, p. 9):

“To measure the instrument’s levels of internal reliability, researchers collapsed the nineteen binary items into three constructs, as aligned with the taxonomy and defined in degrees of first, second, and third degree cheating. They categorized the first seven items in the first degree, the next seven items in the second degree, and the last five items in the third degree. Researchers then measured internal consistency reliability using Cronbach’s alpha. Each of the combined sections had a moderately high alpha coefficient (first = 0.70; second = 0.67; third = 0.83) indicating that the survey was reliable. The analysis resulted in alpha values close to or higher than the generally accepted 0.70 level (Cronbach, 1951). But it should also be noted that the instrument included only dichotomous, binary variables, which yielded lower bound, or underestimates of the instrument’s reliability.”

This is, of course, total nonsense to many classroom teachers. But in the extra space to the right of this paragraph from the article we would say things like:

“Researchers talk of their questionnaires as instruments. And they worry that their instruments will not yield the same results if they gave them more than once. They need instruments that will give pretty much the same results from one administration to another. That is, they seek instruments that are reliable. These researchers used a standard procedure for finding that out. It’s a statistical analysis called Cronbach’s alpha, which can be used to assess reliability. Alpha varies from 0 to 1.00. Here they found that each of the three sections of their questionnaire is reliable. The data obtained is trustworthy enough to interpret the findings that follow. Move to that section now.”
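For readers who want to see what lies behind the coefficient, the following is a minimal sketch of the standard Cronbach’s alpha calculation for a block of yes/no items. The response data are invented for illustration and are not taken from the cheating survey; the function name is likewise hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[int]]) -> float:
    """Cronbach's alpha for a table of responses.

    Each inner list is one respondent's answers to the same k items.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose rows (respondents) into columns (items).
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]
    total_scores = [sum(row) for row in responses]
    return k / (k - 1) * (1 - sum(item_variances) / variance(total_scores))


# Invented data: six respondents answering four binary (0/1) items.
sample_responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]

alpha = cronbach_alpha(sample_responses)
print(f"alpha = {alpha:.2f}")
```

The more the items rise and fall together across respondents, the closer alpha gets to 1.00; when every respondent answers all items identically, the formula returns exactly 1.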

In another research report on the effects of different review procedures on test performance, the authors wrote (Yu and Berliner, 1981, p. 11): “The experiment featured a 4 (encoding) x 2 (after-lecture review) x 2 (before-test review) between-subject design. Thus, the design yielded 16 independent cells, each of which ultimately contained six subjects that had been randomly assigned.”

On the right of the page, we often explained such technical descriptions this way:

“These comments are for other researchers so they can replicate the findings, should they want to do so. Science depends on replications to gain surety about their results. You can skip this and the next few technical paragraphs and move on to the next section of the report, where findings are reported. The experimental design used here is reasonable.”
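For anyone who does want to unpack the design description, the arithmetic is simple: 4 x 2 x 2 gives 16 cells, and six subjects per cell gives 96 subjects in all. The sketch below lays this out. The condition labels, the subject IDs and the assignment procedure are hypothetical placeholders, since the quoted passage gives only the number of levels of each factor.

```python
import itertools
import random

# Hypothetical labels; the source states only how many levels each factor has.
ENCODING = ["encoding_1", "encoding_2", "encoding_3", "encoding_4"]
AFTER_LECTURE_REVIEW = ["review", "no_review"]
BEFORE_TEST_REVIEW = ["review", "no_review"]
SUBJECTS_PER_CELL = 6


def build_cells() -> list[tuple[str, str, str]]:
    """Every combination of the three factors: 4 x 2 x 2 = 16 cells."""
    return list(
        itertools.product(ENCODING, AFTER_LECTURE_REVIEW, BEFORE_TEST_REVIEW)
    )


def assign_randomly(subject_ids, cells, per_cell, seed=0):
    """Shuffle the subjects and deal them out so each cell gets per_cell of them."""
    if len(subject_ids) != len(cells) * per_cell:
        raise ValueError("number of subjects must equal cells x per_cell")
    shuffled = list(subject_ids)
    random.Random(seed).shuffle(shuffled)
    return {
        cell: shuffled[i * per_cell:(i + 1) * per_cell]
        for i, cell in enumerate(cells)
    }


cells = build_cells()
subjects = list(range(1, len(cells) * SUBJECTS_PER_CELL + 1))
assignment = assign_randomly(subjects, cells, SUBJECTS_PER_CELL)

print(len(cells), "cells,", len(subjects), "subjects")
```

‘Between-subject’ means that each subject appears in exactly one cell, which is why the dictionary above never repeats a subject ID.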

So, a major requirement for this form of professional development is a translator of research reports on the topics that teachers want to know about. The other major requirement is a small group of teachers (we recommend five to eight) who specify what they want to know more about – say, homework policy – and are willing to meet regularly to discuss the research in that domain. The teachers we worked with used the translator-annotated research to interpret the articles in their areas of interest, using their teaching experience to make interpretations about the researchers’ data and conclusions, and judging the pertinence of the findings to their particular school and classroom situations.

The third requirement is a schedule for meeting, and a place to do so. In our experience this has been at different teachers’ homes every other week over four to five sessions, or every other Friday after school in a nearby pub, until all the articles in a research area have been discussed. We also requested that only teachers be invited to participate in these discussion groups, thinking that administrators could constrain the conversations. The conversations we monitored were often about family and work, which brings the teachers at a school site closer together. But they always also turned to what actions might be taken given the new information the teachers had acquired (see Powell et al., 1992).

Our goal was to use teacher wisdom and experience to interpret the articles in research areas that concerned a range of teachers. We sought to present the best research in some areas in a comprehensible form to teachers, and to have teachers decide what was worth promoting and what was not. One of the teacher groups we worked with met over a few weeks on the topic of homework. They digested and debated the annotated research we provided, and then they went to their school board and convinced them to change the homework policy in their school district, based on research evidence. Another group of teachers worked with their school board on class size – basing local policy on the findings from research.

There was one other finding from our enquiries into this form of professional development: teachers enjoyed the process. They met, talked, traded jokes, all whilst working on educational issues. This form of professional development has the potential to improve both teachers’ personal and professional lives, at costs that are far less than most other forms of professional development.


Amrein-Beardsley A, Berliner DC and Rideau S (2010) Cheating in the first, second, and third degree: Educators’ responses to high-stakes testing. Education Policy Analysis Archives 18(14).

Casanova U, Berliner DC, Placier P et al. (1991) Readings in educational research: Class size. Washington DC: National Education Association. Available at: www.nea.org/assets/docs/NEA-Readings-in-Educational Research.pdf (accessed 23 September 2019).

Dewey J (1900) The School and Society. Chicago: University of Chicago.

James W (1983) Talks to Teachers on Psychology and to Students on Some of Life’s Ideals. Cambridge, MA: Harvard University Press.

Powell JH, Berliner DC and Casanova U (1992) Empowerment through collegial study groups. Contemporary Education 63: 281–284.

Yu HK and Berliner DC (1981) Encoding and retrieval of information from a lecture. In: 65th American Educational Research Association Annual Meeting, Los Angeles, USA 13–17 April 1981.