Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

Document Type: Research Article


Kazan Federal University, Kremlyovskaya, Russia.



In this study we analyze texts used in the Russian Unified State Exam (USE) in English. The texts, which form two small research corpora, were retrieved from two resources: the official USE database, taken as a reference point, and “Neznaika”, a popular website used by pupils for USE training. The two corpora are balanced in size: USE has 11934 tokens and “Neznaika” has 11918 tokens. We share Biber’s view that linguistic tendencies are quite stable with ten text samples per genre or register (Biber 2007); we retrieved 20 texts from each resource. The research addresses syntactic complexity, and its main subject is the syntactic type of the sentence. The study focuses on two research questions: RQ1: What pattern of sentence types is typical for USE texts? RQ2: Are the materials of the training sites reliable and valid? The methods employed are the identification and manual counting of sentence types and the calculation of absolute and normalized frequencies. While analyzing the texts, we observed a greater range of tokens per text (tpt) in the unofficial training texts: for the “Neznaika” database the range was 490–790 tpt, while the official USE database texts showed a narrower range of 539–686 tpt. The number of sentences in “Neznaika” (664) and in the official USE database (670) is almost equal. The number of sentence types in “Neznaika” and in the official USE database likewise does not exceed correlation limits.
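The normalized frequency calculation named among the methods can be illustrated with a minimal sketch. It uses only the token and sentence totals reported above. The normalization base of 1,000 tokens and the function name `normalized_frequency` are assumptions made for illustration; the abstract does not state which base the authors used.

```python
def normalized_frequency(count: int, corpus_tokens: int, base: int = 1000) -> float:
    """Return the number of occurrences per `base` tokens.

    The base of 1,000 tokens is an assumed convention, not taken from the study.
    """
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return count / corpus_tokens * base


# Absolute figures reported in the abstract for each corpus.
corpora = {
    "USE": {"tokens": 11934, "sentences": 670},
    "Neznaika": {"tokens": 11918, "sentences": 664},
}

for name, figures in corpora.items():
    per_1000 = normalized_frequency(figures["sentences"], figures["tokens"])
    mean_length = figures["tokens"] / figures["sentences"]
    print(f"{name}: {per_1000:.2f} sentences per 1,000 tokens, "
          f"{mean_length:.2f} tokens per sentence")
```

The same function could be applied to the manual count of each sentence type, which would make the two corpora comparable even if their sizes differed slightly.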