In response to the rapidly changing field of education, PSA has just released v2.1 of its Management Practices Assessor. The Assessor is a key instrument for measuring progress in the establishment of new school-leadership practices through structured training and coaching.
At the start of an assignment, PSA coaches conduct the assessment at every school that is part of the improvement initiative. The Assessor is structured according to themes, which are weighted by the number of practices that each covers – the Curriculum Management theme, for example, contains the greatest number of practices and hence is the most heavily weighted.
Each practice has four levels – Level 1 indicates that the practice is not in place, through to Level 4, which indicates that best practice has been achieved. After training has taken place, coaches support school leaders to implement the new practices they have learned, with a view to achieving the highest level they can.
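As a rough illustration of the weighting described above, the sketch below derives theme weights directly from practice counts. It is a minimal sketch, assuming weights are simply proportional to the number of practices per theme; the counts, and all theme names except Curriculum Management, are hypothetical, since PSA's actual weighting scheme is not detailed here.

```python
# Illustrative only: theme weights proportional to practice counts.
# Counts and all theme names except Curriculum Management are invented.
practice_counts = {
    "Curriculum Management": 12,  # most practices -> most heavily weighted
    "People Management": 8,
    "Infrastructure": 6,
}

total_practices = sum(practice_counts.values())
theme_weights = {theme: count / total_practices
                 for theme, count in practice_counts.items()}

for theme, weight in theme_weights.items():
    print(f"{theme}: {weight:.2f}")
```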
The number of themes in the v2.1 Assessor has been increased from seven to nine, partly to take into account the significant advances that have taken place in digital (Education 4.0) and the new world of work. While the structure of the new Assessor remains essentially the same – it is based on the DBE's Standards of Principalship – a key new element has been the introduction of core 21st-century skills, including communication, creativity, collaboration and critical thinking.
The new themes are indicated in the graphic below:
During the course of an assessment, school leaders answer yes or no to a set of statements linked to each practice; the statements build on each other to provide a maturity score for each theme. The data, which is stored on PSA's cloud-based server (AWS), can then be presented in multiple visualisations – by individual school, circuit, district, etc. – and, most importantly, each time a new assessment is conducted (typically once per year) it can be compared to the baseline in order to determine progress in practices (practice maturity).
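To make the scoring mechanics concrete, here is a minimal sketch of how a level could be derived from cumulative yes/no statements and compared to a baseline. The function names and answer patterns are hypothetical, and PSA's actual scoring rules may differ; the sketch assumes a practice's level is the highest consecutive "yes" counting up from the Level 1 statement.

```python
# Hypothetical scoring sketch: statements build on each other, so a
# practice's level is the highest consecutive "yes" from Level 1 upward.
def practice_level(answers):
    """answers: booleans for the Level 1..4 statements, in order."""
    level = 0
    for answer in answers:
        if not answer:
            break
        level += 1
    return level

def theme_maturity(practices):
    """Mean achieved level across a theme's practices (0-4 scale)."""
    return sum(practice_level(a) for a in practices) / len(practices)

baseline = theme_maturity([[True, True, False, False],    # Level 2
                           [True, False, False, False]])  # Level 1
follow_up = theme_maturity([[True, True, True, False],    # Level 3
                            [True, True, False, False]])  # Level 2
print(f"practice-maturity progress: {follow_up - baseline:+.2f} levels")
```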
As part of the process of developing the new v2.1 Assessor, PSA subjected the older version of the Assessor to rigorous testing by a senior independent psychometrics specialist. This was done as part of PSA's internal quality standards, which form part of its ISO 9001 accreditation, and in order to ensure that the new v2.1 Assessor was built on a very solid foundation. The last time the Assessor was independently tested was in 2011, through the University of Cape Town.
Assessments of this nature focus on the validity and reliability of instruments like PSA's Assessor.
Validity is “the adequacy and appropriateness of interpretations and actions based on test scores” (Messick, 1989), and the types of evidence typically sought include data based on instrument content, response processes, internal structure, relationships with other variables and consequences of using the instrument.
Reliability addresses “the extent to which the results are free from measurement error” (Gronlund, 1998) and speaks to the consistency of scores obtained by the same individuals when they are asked to complete the assessment on different occasions (Anastasi & Urbina, 1997). Reliability is important because unless results are stable, they cannot be expected to be valid. Internal consistency is a prerequisite for construct (theme) validity, where one would expect a high item-total correlation, since items (questions) measuring the same construct (theme) contribute to the total score of an instrument (Kline, 1993).
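The sketch below illustrates the two internal-consistency checks just mentioned – corrected item-total correlations and Cronbach's alpha – on a small, made-up score matrix. It is not PSA's analysis, only a worked example of the statistics themselves.

```python
import numpy as np

# Invented score matrix: rows = respondents, columns = items that are
# meant to measure the same theme.
scores = np.array([[3, 4, 3, 4],
                   [2, 2, 1, 2],
                   [4, 4, 4, 3],
                   [1, 2, 1, 1]], dtype=float)

# Corrected item-total correlation: each item against the sum of the rest.
for i in range(scores.shape[1]):
    rest = scores.sum(axis=1) - scores[:, i]
    r = np.corrcoef(scores[:, i], rest)[0, 1]
    print(f"item {i + 1} item-total r = {r:.2f}")

# Cronbach's alpha as an overall internal-consistency summary.
k = scores.shape[1]
alpha = k / (k - 1) * (1 - scores.var(axis=0, ddof=1).sum()
                       / scores.sum(axis=1).var(ddof=1))
print(f"Cronbach's alpha = {alpha:.2f}")
```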
In the review of the older version of the Assessor, Rasch's probabilistic model of measurement was used to empirically substantiate the validity argument, i.e. Rasch analysis was used to establish how well the data fitted the expected statistical model of behaviour.
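For readers unfamiliar with the model: in its dichotomous form, the Rasch model gives the probability of a positive response as a logistic function of the gap between person ability and item difficulty. A minimal sketch, with purely illustrative values:

```python
import math

# Dichotomous Rasch model: the probability that a respondent with
# ability theta endorses a statement of difficulty b (both in logits) is
#   P(X = 1) = exp(theta - b) / (1 + exp(theta - b)).
def rasch_probability(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(rasch_probability(1.0, 0.0))   # able school, easy item  -> ~0.73
print(rasch_probability(-1.0, 1.0))  # weaker school, hard item -> ~0.12
```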
These were the results of the review:
Within the Rasch framework, fit statistics provide a quality-control mechanism to establish how well the data conform to the Rasch model – if the data do not fit the model, questions are raised about whether the same construct is being measured (Boone, 2016). Ideally, fit statistics should fall between 0.7 and 1.3 to be considered sufficient for measurement (Linacre, 2005). In this review, mean squares (MNSQ) were used, yielding results of 0.96–1.02 for the Assessor, thus falling well inside these parameters.
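As an illustration of how MNSQ fit statistics are computed, the sketch below derives outfit and infit mean squares from observed responses and model-expected probabilities. The data are invented and the calculation is simplified to a single item.

```python
import numpy as np

# Simplified single-item MNSQ sketch: outfit is the unweighted mean of
# squared standardised residuals; infit weights them by model variance.
observed = np.array([1, 0, 1, 1, 0], dtype=float)
expected = np.array([0.8, 0.3, 0.6, 0.9, 0.2])  # Rasch-model P(X = 1)

variance = expected * (1 - expected)
residual = observed - expected

outfit_mnsq = np.mean(residual**2 / variance)
infit_mnsq = np.sum(residual**2) / np.sum(variance)

print(f"outfit MNSQ = {outfit_mnsq:.2f}, infit MNSQ = {infit_mnsq:.2f}")
# Values between roughly 0.7 and 1.3 are conventionally taken as
# sufficient for measurement.
```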
The person reliability index in Rasch analysis indicates the replicability of the order of persons (or schools) on the person-item map if this sample of persons (or schools) were given a parallel set of items measuring the same construct. The item reliability indicates the replicability of the order of items if the same items were given to a different sample of respondents of the same size who behaved in the same way. A high item reliability, on a scale of 0 to 1, indicates that the instrument has sufficient items along the pathway from easy to difficult and that we can expect consistent inferences about easy and difficult items. In the case of the Assessor, item reliabilities of 0.84 and 0.85 were achieved, indicating good reliability.
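One common way to compute such a reliability index is as the share of observed variance in the estimated measures that is not measurement error. The sketch below shows this calculation for item reliability; the difficulty estimates and standard errors are invented, not taken from the Assessor review.

```python
import numpy as np

# Invented item difficulty estimates (logits) and their standard errors.
item_difficulty = np.array([-1.2, -0.4, 0.1, 0.7, 1.5])
standard_error = np.array([0.20, 0.18, 0.17, 0.19, 0.22])

# Reliability = (observed variance - error variance) / observed variance.
observed_variance = item_difficulty.var(ddof=1)
error_variance = np.mean(standard_error**2)
item_reliability = (observed_variance - error_variance) / observed_variance

print(f"item reliability = {item_reliability:.2f}")  # 0-1 scale
```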
The item-person map provides a visual representation of the spread of items on one hand and person (or school) ability on the other: the items included in the analysis and the participating schools are placed on the same scale. On the right-hand side of the continuum are the items, with the schools displayed on the left. Ideally, the schools should form a standard normal curve, as one would expect schools of high and low ability to be at the ends, with the majority of schools in the middle. Here the assessment found that there were 16 statements in the Assessor which did not conform to the normal curve.
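To visualise what such a map looks like, here is a rough sketch that plots simulated school measures and item difficulties back to back on a shared logit scale. The data are random and purely for illustration; this is not the map from the review.

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulated item-person (Wright) map: school measures on the left,
# item difficulties on the right, on a shared logit scale.
rng = np.random.default_rng(0)
schools = rng.normal(0.0, 1.0, 200)   # ideally roughly normal
items = np.linspace(-2.5, 2.5, 30)    # easy -> difficult

fig, (left, right) = plt.subplots(1, 2, sharey=True, figsize=(6, 5))
left.hist(schools, bins=20, orientation="horizontal")
left.invert_xaxis()
left.set_title("Schools")
left.set_ylabel("Logit scale")
right.plot(np.zeros_like(items), items, "k_", markersize=12)
right.set_title("Items")
right.set_xticks([])
plt.show()
```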
Since the data provided for this analysis came from schools that had been through three years of intensive support, the conclusion was that the instrument did not differentiate sufficiently between schools that reach Level 4. To improve this differentiation, it was recommended that either a fifth level be introduced or a higher bar be set in the Level 4 statements to achieve the standard normal curve.
In response to this assessment, PSA has decided to stay with the four levels in its v2.1 Assessor but has raised the bar for reaching Level 4.
After extensive beta testing to incorporate these changes, Version 2.1 of the Assessor will now be applied to PSA's new clients and will be carefully monitored over the next two years.