Alemi, M., & Tajeddin, Z. (2013). Pragmatic rating of L2 refusal: Criteria of native and nonnative English teachers. TESL Canada Journal, 30(7), 65-83.
Bachman, L. F., Lynch, B. K., & Mason, M. (1995). Investigating variability in tasks and rater judgments in a performance test of foreign language speaking. Language Testing, 12(2), 238-257.
Bardovi-Harlig, K. (2001). Evaluating the empirical evidence: Grounds for instruction in pragmatics. In K. R. Rose & G. Kasper (Eds.), Pragmatics in language teaching (pp. 13-32). Cambridge: Cambridge University Press.
Barnwell, D. (1989). “Naive” native speakers and judgments of oral proficiency in Spanish. Language Testing, 6(2), 152-163.
Bergman, M. L., & Kasper, G. (1993). Perception and performance in native and nonnative apology. In G. Kasper & S. Blum-Kulka (Eds.), Interlanguage pragmatics (pp. 82-107). New York: Oxford University Press.
Billmyer, K., & Varghese, M. (2000). Investigating instrument-based pragmatic variability: Effects of enhancing discourse completion tests. Applied Linguistics, 21(4), 517-552.
Blum-Kulka, S., & Olshtain, E. (1984). Requests and apologies: A cross-cultural study of speech act realization patterns. Applied Linguistics, 5(3), 196-213.
Brown, A. (1995). The effect of rater variables in the development of an occupation-specific language performance test. Language Testing, 12(1), 1-15.
Brown, P., & Levinson, S. C. (1987). Politeness: Some universals in language usage. Cambridge: Cambridge University Press.
Caban, H. L. (2003). Rater group bias in the speaking assessment of four L1 Japanese ESL students. Working Papers in Second Language Studies, 21(3), 1-44.
Chau, J. (2005). Effects of collaborative assessment on language development and learning. The Language Learning Journal, 32(1), 27-37.
Cohen, A. D., & Olshtain, E. (1981). Developing a measure of sociocultural competence: The case of apology. Language Learning, 31(1), 113-134.
Cohen, A. D., & Shively, R. L. (2007). Acquisition of requests and apologies in Spanish and French: Impact of study abroad and strategy-building intervention. The Modern Language Journal, 91(2), 189-212.
Dickinson, L. (1988). Collaborative assessment: An interim account. In H. Holec (Ed.), Autonomy and self-directed learning: Present fields of application (pp. 121-128). Strasbourg, France: Council of Europe.
Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2(3), 197-221.
Eckes, T. (2008). Rater types in writing performance assessments: A classification approach to rater variability. Language Testing, 25(2), 155-185.
Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater response to an online training program for L2 writing assessment. Language Testing, 24(1), 37-64.
Engelhard, G. Jr., & Myford, C. M. (2003). Monitoring faculty consultant performance in the Advanced Placement English Literature and Composition program with a many-faceted Rasch model (College Board Research Report No. 2003-1). New York: College Entrance Examination Board.
Ervin-Tripp, S. (1976). Is Sybil there? The structure of some American English directives. Language in Society, 5(1), 25-66.
Galloway, V. B. (1980). Perceptions of the communicative efforts of American students of Spanish. Modern Language Journal, 64(4), 428-433.
Holmes, J. (1990). Apologies in New Zealand English. Language in Society, 19(2), 155-199.
Hudson, T. (2001). Indicators for pragmatic instruction. In K. R. Rose & G. Kasper (Eds.), Pragmatics in language teaching (pp. 283-300). Cambridge: Cambridge University Press.
Hudson, T., Detmer, E., & Brown, J. D. (1995). Developing prototypic measures of cross-cultural pragmatics. Honolulu, Hawai’i: University of Hawai’i, Second Language Teaching and Curriculum Center.
Johnson, J. S., & Lim, G. S. (2009). The influence of rater language background on writing performance assessment. Language Testing, 26(4), 485-505.
Kasper, G., & Roever, C. (2005). Pragmatics in second language learning. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 317-334). Mahwah, NJ: Lawrence Erlbaum Associates.
Kim, Y.-H. (2009). An investigation into native and nonnative teachers’ judgments of oral English performance: A mixed methods approach. Language Testing, 26(2), 187-217.
Knoch, U., Read, J., & von Randow, J. (2007). Retraining writing raters online: How does it compare with face-to-face training? Assessing Writing, 12(1), 26-43.
Kondo-Brown, K. (2002). A FACETS analysis of rater bias in measuring Japanese L2 writing performance. Language Testing, 19(1), 3-31.
Lee, H. K. (2009). Native and nonnative rater behavior in grading Korean students’ English essays. Asia-Pacific Education Review, 10(3), 387-397.
Leech, G. (1983). Principles of pragmatics. London: Longman.
Li, D. (2000). The pragmatics of making requests in the L2 workplace: A case study of language socialization. The Canadian Modern Language Review, 57(1), 58-87.
Lim, G. S. (2011). The development and maintenance of rating quality in performance writing assessment: A longitudinal study of new and experienced raters. Language Testing, 28(4), 543-560.
Liu, J. (2006). Assessing EFL learners’ interlanguage pragmatic knowledge: Implications for testers and teachers. Reflections on English Language Teaching, 5(1), 1-22.
Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54-71.
Matsumura, S. (2007). Exploring the aftereffects of study abroad on interlanguage pragmatic development. Intercultural Pragmatics, 4(2), 167-192.
McNamara, T. F. (1996). Measuring second language performance. Harlow: Longman.
McNamara, T., & Roever, C. (2006). Language testing: The social dimension. Malden, MA & Oxford: Blackwell.
Olshtain, E., & Cohen, A. D. (1983). Apology: A speech act set. In N. Wolfson & E. Judd (Eds.), Sociolinguistics and language acquisition (pp. 18-35). Rowley, MA: Newbury House.
Plough, I. C., Briggs, S. L., & van Bonn, S. (2010). A multi-method analysis of evaluation criteria used to assess the speaking proficiency of graduate student instructors. Language Testing, 27(2), 235-260.
Roever, C. (2001). A Web-based test of interlanguage pragmalinguistic knowledge: Speech acts, routines, and implicatures. Unpublished doctoral dissertation, University of Hawai’i, Honolulu, Hawai’i.
Rose, K. R., & Kasper, G. (Eds.). (2001). Pragmatics in language teaching. Cambridge: Cambridge University Press.
Schaefer, E. (2008). Rater bias patterns in an EFL writing assessment. Language Testing, 25(4), 465-493.
Schoonen, R. (2005). Generalizability of writing scores: An application of structural equation modeling. Language Testing, 22(1), 1-30.
Shohamy, E., Gordon, C., & Kraemer, R. (1992). The effect of raters’ background and training on the reliability of direct writing tests. Modern Language Journal, 76(1), 27-33.
Taguchi, N. (2006). Analysis of appropriateness in a speech act of request in L2 English. Pragmatics, 16(4), 513-535.
Taguchi, N. (2010). Longitudinal studies in interlanguage pragmatics. In A. Trosborg (Ed.), Pragmatics across languages and cultures (pp. 333-361). Berlin: Mouton de Gruyter.
Taguchi, N. (2011). Rater variation in the assessment of speech acts. Pragmatics, 21(3), 453-471.
Tajeddin, Z., & Alemi, M. (2014). Criteria and bias in native English teachers’ assessment of L2 pragmatic appropriacy: Content and FACETS analyses. The Asia-Pacific Education Researcher, 23(3), 425-434.
Thomas, J. (1995). Meaning in interaction: An introduction to pragmatics. London: Longman.
Weigle, S. C. (1998). Using FACETS to model rater training effects. Language Testing, 15(2), 263-287.
Wigglesworth, G. (1993). Exploring bias analysis as a tool for improving rater consistency in assessing oral interaction. Language Testing, 10(3), 305-335.
Wigglesworth, G. (1994). Patterns of rater behaviour in the assessment of an oral interaction test. Australian Review of Applied Linguistics, 17(2), 77-103.
Winke, P., Gass, S., & Myford, C. (2012). Raters’ L2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231-252.
Youn, S. J. (2007). Rater bias in assessing the pragmatics of KFL learners using facets analysis. Second Language Studies, 26(1), 85-163.
Zhang, Y., & Elder, C. (2011). Judgments of oral proficiency by nonnative and native English-speaking teacher raters: Competing or complementary constructs? Language Testing, 28(1), 31-50.