Frequently Asked Questions

Teacher Observation Scale (TOS)™

General Questions

What is the Teacher Observation Scale (TOS) ™?

The Teacher Observation Scale (TOS)™ is an instrument used for the objective analysis and appraisal of teacher classroom performance, created specifically for teachers by teachers. It uses a format that represents a major departure from common methods of teacher evaluation. The TOS employs a behavioral checklist, consisting of a variety of statements describing relevant aspects - both positive and negative - of teacher classroom performance. The observer completing the TOS indicates on the scale which statements reflect the observed teacher’s performance, and which do not. This information is then analyzed against predetermined weights or calibrations (not shown to the rater) made by an independent panel of experienced educators, to yield a comprehensive, objective assessment of the teacher’s classroom performance.

What can the Teacher Observation Scale (TOS) ™ do for us?

The Teacher Observation Scale (TOS) ™ represents a significant advancement in the assessment and appraisal of class instruction. It minimizes the chances of unfair, ambiguous, subjective, and distorted appraisals that can often occur when using traditional teacher evaluation methods. The Teacher Observation Scale (TOS) ™ provides a concrete assessment of teacher classroom performance independent of the personal biases or idiosyncrasies of the observer, and independent of the academic performance of students.

How does the Teacher Observation Scale (TOS) ™ work?

The TOS contains 50 statements describing specific aspects of classroom instructional activity. Examples of some of these statements include:

“The teacher's questions required the students to apply the concepts of the lesson.”
“The teacher assisted the students in formulating the general concept by analyzing specific data.”
“The teacher failed to use a motivation to start the lesson.”

The rater observes the classroom performance of the teacher, and indicates whether or not the behavior described in each item was exhibited, using the following scale:

  • A) The teacher performed this behavior
  • B) The teacher did not perform this behavior when given the opportunity to do so
  • C) The teacher did not have the opportunity to perform this behavior

The completed TOS surveys are sent to National Measurement and Testing (NMT) for analysis, and are scored against the predetermined ratings of an independent panel of experienced educators to yield a comprehensive assessment of the observed teacher’s performance.
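
As a rough illustration of the three-way response format, the sketch below tallies a completed form. It assumes that only A and B responses count as rated items and that C responses are set aside. The function name, the response encoding, and the `min_rated` threshold are hypothetical. The item weights used by NMT are not disclosed to raters, so no scoring is modeled here.

```python
from collections import Counter

def summarize_responses(responses, min_rated):
    """Tally a completed form given as a list of 'A', 'B', or 'C' codes.

    A = behavior performed
    B = behavior not performed despite the opportunity
    C = no opportunity (assumed here to be excluded from the rated count)
    """
    invalid = [r for r in responses if r not in ("A", "B", "C")]
    if invalid:
        raise ValueError(f"unexpected response codes: {invalid}")

    counts = Counter(responses)
    rated = counts["A"] + counts["B"]
    return {
        "performed": counts["A"],
        "not_performed": counts["B"],
        "no_opportunity": counts["C"],
        "rated": rated,
        "percent_rated": 100.0 * rated / len(responses) if responses else 0.0,
        # min_rated is a placeholder; use the threshold stated on the score report.
        "enough_items": rated >= min_rated,
    }

# Illustrative 50-item form: 30 performed, 10 not performed, 10 no opportunity.
form = ["A"] * 30 + ["B"] * 10 + ["C"] * 10
summary = summarize_responses(form, min_rated=25)
```

A tally like this only helps an observer check that enough items were rated before mailing the form; the score itself is still produced by NMT.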

For what types of situations can the Teacher Observation Scale (TOS) ™ be used?

The Teacher Observation Scale (TOS) ™ is designed to be used during regular observations of teacher classroom performance. It can also be used in the appraisal of job applicants performing a trial lesson, and for the evaluation of student teachers.

What are the content areas or dimensions of teacher classroom performance that the TOS measures?

The items in the Teacher Observation Scale (TOS) ™ address the following aspects of teacher performance: use of a motivation, elicitation of aims, development of the lesson, formulation of questions and dealing with answers, use of teaching aids (board, technology, demonstrations), development of lesson content and summaries, development of content applications, encouragement of student involvement and student-to-student exchange, establishment of rapport, management of the class, maintenance of student interest, demonstration of scholarship, display of poise and vitality, and demonstration of good voice quality and pronunciation.

Are scores on the Teacher Observation Scale (TOS) ™ based on student performance?

No. The Teacher Observation Scale (TOS) ™ is an instrument used solely for the assessment of teacher classroom activities. The scores on the Teacher Observation Scale (TOS) ™ are not in any way linked to, correlated with, or paired with either individual or collective student academic performance. There are numerous factors besides teacher performance (e.g. study habits, school environment, educational resources, parental influence) that relate to student achievement. For this reason, student performance is in no way used in the determination of scores on the Teacher Observation Scale (TOS) ™. Only activities and behaviors within the teacher’s control are described within the TOS items.

Can the Teacher Observation Scale (TOS) ™ be used to assess teacher activities outside the classroom?

No. The content of the Teacher Observation Scale (TOS) ™ is exclusively geared toward the assessment of teacher classroom activities. Tutoring, grading, parent consultations, administrative duties, and other tasks performed outside of classroom instruction are not addressed by the TOS, and therefore it would be inappropriate to use this instrument to evaluate such activities.

Is the Teacher Observation Scale (TOS) ™ meant to be used by students?

No. The content of the Teacher Observation Scale (TOS) ™ assumes the rater has a thorough knowledge of the principles of teaching that only a trained and experienced educator would possess. It would be inappropriate to use the TOS as an instrument for student evaluations of teacher performance.

What advantages does the Teacher Observation Scale (TOS) ™ have over other observation methods and rating systems?

Instruments used to evaluate teacher classroom performance often employ “rating scales” that present the observer with a list of traits or characteristics on which the teacher is to be evaluated. These characteristics are typically rated along a range of numbers, adjectives, or descriptions representing different levels or degrees of performance (e.g., “superior”, “very good”, “fair”, “poor”, and “satisfactory”). Rating scales of this type are prone to numerous sources of distortion and error, often leading to invalid and inaccurate appraisals. Such scales may not provide sufficient time to evaluate the observed teacher on all traits listed, leading to superficial judgments. Some of the traits or characteristics may not be clearly defined, and could mean different things to different raters (e.g., “Ability to maintain student interest”). Scales of this nature are often vague and ambiguous as to the distinctions between various levels of performance. Some are unclear about how total scores are to be determined, and are subject to the manipulations of an observer who has decided beforehand to give a teacher a certain overall rating without considering all of the elements of performance. Finally, many such scales are prone to the rater's tendency to disregard specific aspects of performance in favor of general impressions, a problem referred to as “halo error”.

The Teacher Observation Scale (TOS) ™ minimizes or eliminates these sources of error by avoiding vague trait descriptions and ambiguous rating values. Instead, the TOS employs a list of statements describing typical classroom activities. The observer indicates which of these activities are exhibited by the teacher giving the lesson and which are not, without any superficial ratings of degree. This information is then analyzed against predetermined calibrations to produce an objective and fair appraisal of teacher performance. Another distinct advantage of the Teacher Observation Scale (TOS) ™ is the overall score, which provides a standardized, universal yardstick of teacher classroom performance, allowing for meaningful comparisons of performance between teachers or teaching job candidates.

How was the Teacher Observation Scale (TOS) ™ constructed?

The Teacher Observation Scale (TOS) ™ items consist of statements that were written by educators with many years of experience in observing other teachers, and were drawn from or based upon typical reports of teacher performance. Each of these statements was then reviewed by a nationwide panel of approximately 260 experienced educators, sampled from public and private schools, at the elementary, middle school/junior high, and high school levels. The panel rated each statement as to whether the behavior described constituted good or poor teaching performance, and to what degree. These ratings were analyzed using a psychometric method known as Rasch scaling to produce calibrations or “weights” along a continuum of good versus poor teacher performance.

A technical report detailing the various properties of the TOS instrument is available upon request. Contact National Measurement and Testing for more details.
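
For readers unfamiliar with Rasch scaling, the simplest (dichotomous) form of the model expresses the probability of a positive response as a function of the difference between a person parameter and an item calibration. This is the textbook form only. This FAQ does not say which variant of the model was applied to the panel's ratings, so the symbols below are illustrative assumptions rather than the TOS specification.

```latex
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}
```

Here $\theta_n$ is the location of respondent $n$ and $b_i$ is the calibration of item $i$, both expressed on the same continuum.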

How does the Teacher Observation Scale (TOS) ™ maximize objectivity?

An important aspect of the TOS is that the observer is not informed of the scale values for the items used in the determination of the candidate’s score. The observer only reports on what he/she sees the teacher under observation do or fail to do. The observer does not calculate any scores. Instead, the completed scale forms are returned to National Measurement and Testing, which scores the instrument based on the pre-determined calibrations for each item. In this manner, the person conducting the observation cannot manipulate or distort an observed teacher’s overall score on the TOS. Thus, any potential rater bias in TOS scoring is greatly minimized, if not eliminated.

Administering the TOS

For how long should the teacher be observed?

The content and scoring of the Teacher Observation Scale (TOS) ™ assume that the teacher under observation had adequate opportunity to, among other things, use a motivation, develop the lesson, employ demonstrations, pose questions, deal with answers, and summarize the lesson. Therefore, in order to make a fair and valid appraisal, the rater must conduct the observation long enough to see the teacher perform - or at least have sufficient opportunity to perform - all of the functions measured in the scale. Usually, this will entail the observer remaining in the classroom for the entire class period, which can range from around 40 minutes to over 60 minutes, depending upon the school. However, for the purposes of the TOS, if the teacher appears to have completed the lesson and covered all the major functions addressed in the TOS before the class period actually expires, it may be possible to complete the observation at that point.

Should the teacher be allowed to see the content of the Teacher Observation Scale (TOS) ™ before actually being observed?

There are two points of view on this matter. Revealing the TOS content beforehand to the teacher to be observed could impel that teacher to alter or direct his/her usual classroom performance merely to satisfy the perceived standards of the scale. On the other hand, it could be argued that if the teacher is performing in a manner that is consistent with what the scale delineates as good instructional skills, he/she is by definition performing well as a teacher, which of course is the primary objective of class observation. From a practical standpoint, the length and comprehensiveness of the Teacher Observation Scale (TOS) ™ make it unlikely that an instructor with chronically weak teaching skills will be able to “fake” performance in order to attain a high rating. Also, as noted above, neither the observer nor the teacher being observed is informed of the scale values for the items used to determine the teacher’s score. Thus, even if the teacher to be observed were shown the content of the TOS, it would be difficult, if not impossible, for that teacher to manipulate performance in order to artificially increase his/her score.

Are alternate forms of the Teacher Observation Scale (TOS)™ available?

Yes. We currently provide two alternate forms of the Teacher Observation Scale (TOS) ™: Form A and Form B. These forms cover the same dimensions of performance with statistical equivalence, and will therefore yield comparable results.

How do we prepare the Teacher Observation Scale (TOS) ™ survey forms for scoring?

The TOS surveys are mailed to National Measurement and Testing (NMT) along with a completed Analysis Order Form, which is enclosed within each packet of TOS survey forms.

How are the scores on the Teacher Observation Scale (TOS) ™ reported?

Upon receipt of the completed survey forms and Analysis Order Form, NMT analyzes the results and produces individual score reports for the submitted surveys. Two copies of each score report are provided: one for the teacher, and one for the school’s records (the information appearing in both copies of the score report is exactly the same). The report includes an overall score of the observed teacher’s performance, expressed on a scale ranging from 100 to 700, with a passing score of 400. Schools are permitted to employ their own passing scores if desired. The report indicates the number and percentage of items on which the score is based.

Note that if the score is based on fewer than 25 items, the scores may not be reliable. The report also provides subscale information, indicating the areas in which the teacher performed well, and which areas need improvement.

When do we receive the results?

Regular Service processing time is within seven business days of the receipt of payment or receipt of the survey forms - whichever is later. Rush Service items are processed upon receipt of payment or receipt of the survey forms (whichever is later), and are mailed via overnight service no later than 1 business day after scanning and processing. Scores cannot be released until payment is received.

Technical Questions

What is the validity evidence for the Teacher Observation Scale (TOS) ™?

The validity of a scale refers to the extent to which it measures what it is supposed to measure. Validity is generally considered the most critical aspect of a measurement, for if a scale is not valid for its intended purpose, its usage in that context is inappropriate.

The evidence for the validity of the Teacher Observation Scale (TOS) ™ stems from the procedures used to determine the scale’s content. A group of subject matter experts, composed of experienced teachers and educational administrators, determined the primary aspects or dimensions of classroom teaching. This committee identified a total of 22 content categories or aspects of class instruction that the scale should address. These categories were then weighted for relative importance; the weights were used to determine the distribution of items on the scale. The content categories and the numbers of items associated with each of the categories are listed here.

The scale items were written by subject matter experts and designed to measure each of the content categories. Each item describes a specific teacher activity related to one of the scale categories, written in a manner that can be applied to any lesson. The items were then rated for relative importance by 270 experienced educators sampled from across the country. These ratings of importance serve as the calibrations or “weights” for the TOS scale items.

What is the reliability of the Teacher Observation Scale (TOS) ™?

The reliability of a scale refers to the accuracy of its measurement, or the extent of agreement between scores obtained from two or more administrations of the scale to the same subjects. Like validity, reliability is a critical aspect of a measurement that must be considered in interpreting scores, as well as differences between scores.

The estimated Cronbach Alpha (KR-20) reliability for the Teacher Observation Scale (TOS) ™ is .86. This figure is an approximation, as the scale allows for the omission of items deemed unrelated to the lesson by the rater, and omitted items are not used in the determination of scores. Note that if fewer than 30 TOS items are rated, the score is considered unreliable, and under such circumstances, should be interpreted with caution. This information will be indicated on the TOS score report in the form of a caveat in the event that fewer than 30 items are rated.
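
For reference, KR-20 is the special case of Cronbach's alpha for items scored 0 or 1. The sketch below computes it from a complete matrix of 0/1 item scores. It is a generic textbook calculation, not NMT's procedure: the function name and the sample data are made up for illustration, and it assumes no omitted items, whereas the TOS figure above is an approximation precisely because items can be omitted.

```python
def kr20(item_scores):
    """Kuder-Richardson 20 for a matrix of 0/1 item scores.

    item_scores: list of rows, one row per observed subject,
    each row a list of 0/1 values of the same length k.
    """
    n = len(item_scores)
    k = len(item_scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 items")

    # Proportion scoring 1 on each item, then the sum of p * (1 - p).
    p = [sum(row[j] for row in item_scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)

    # Population variance of the total scores.
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Made-up data: 5 subjects, 4 items.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(sample)
```

Values closer to 1 indicate that the items rank the observed subjects more consistently.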

How were the standards of performance (i.e., “pass versus fail”) determined for the Teacher Observation Scale (TOS) ™?

The passing standard for the Teacher Observation Scale (TOS) ™ was determined by a panel of experienced educators, who indicated which of the behaviors described in the scale were likely to be exhibited by a minimally competent (or “borderline”) teacher while giving a typical lesson. These judgments were collected and averaged for the panel to produce a passing score. Further information on the standard setting procedure used for the TOS scale can be obtained from the technical report available from NMT.

Can we employ our own standards of performance on the Teacher Observation Scale (TOS) ™, rather than those determined by National Measurement and Testing?

Yes. If you wish to employ your own standards for passing and failing, you may indicate this on the Analysis Order Form. NMT will score your results according to the revised standard. However, in light of the sensitivity of such information, it is highly recommended that you employ revised standards only under careful consideration and with adequate justification. Please be advised that NMT will in no way defend or justify revisions to the TOS cut scores; the parties employing revised standards do so entirely at their own risk.

Can we employ the Teacher Observation Scale (TOS) ™ merely to compare and analyze performance, without using any “pass versus fail” standard?

Yes. You can specify on the Analysis Order Form that you do not want any pass/fail analysis performed on the scores. NMT will accordingly provide you with the overall and subscale scores without any pass/fail rating.

Additional Inquiries

Where can we get more information?

Questions about the Teacher Observation Scale (TOS) ™ can be directed to National Measurement and Testing (24/7) toll-free at 866-724-9600.
