A Data-Driven English Classroom in Action: A Reflective Analysis
Thomas J. Bailey
In a previous essay, "Toward a Data-Driven English Classroom," I reflected on how I collect data, what types of data I collect, and how I use that data to drive instruction. I outlined five essential questions that drive a data-driven classroom as well as specific data sets that could be used to build a fairly accurate profile of a student's abilities in the English content area. The following essay is a reflection on those data sets and on the results from a cohort of students in block English 12 classes at Elmhurst High School in the Fall of 2009.
Specifically, the prior article reviewed relevant data, essential questions guiding data usage, and strategies to drive instruction based on the data. Most of the data mentioned in the previous article focused on reading and writing scores and included Lexile data, Acuity test data, language conventions test data, and ETS Criterion data. The first two assessments are, for the most part, dedicated to assessing reading, and the second two measure writing and grammar. The essential questions on data usage are as follows:
1. What do I want my students to be able to do as a result of experiencing and learning from my class?
2. How will I measure that?
3. How does that align with state and local standards and prepare them for observable outcomes on exams?
4. Where are the students now?
5. How will I get them to the answers of questions 1 through 3 above?
Along the lines of those questions and leading into instruction, I outlined that teaching and resources need to be dedicated to the explicit learning goals of the content area and that improvement needs to be the focus on a student-by-student basis.
The premise of this study, then, is to review student data for two English classes that I taught in the Fall of 2009 and to reflect on learning results based on the student data gathered therein. It must be mentioned that Acuity testing was not administered to these students: Acuity is reserved, at this time, for English 9 and English 10, the classes that either prepare for or are subject to the ISTEP 10 End of Course Assessment, so no Acuity tests are designed to assess progress in 12th grade English courses.
Therefore, the assessment data sets come from the following assessments: the Language Conventions Test (my own local assessment), the ETS Criterion essay assessment, and SRI (Lexile) testing. This study reflects on the pretest and post-test results from these assessments and what those results yield. The discussion of each data area closes with the learning experiences within the classroom and how they relate to the data at hand. The research here supports the claim that measurable learning outcomes are predictable when a teacher has relevant student data, clear desired outcomes, effective teaching practice, and reflection on results.
Language Conventions Test
As an overview, the language conventions test used to assess students' understanding of language conventions and sentence structures has a format similar to that of the Test of Standard Written English. There are 25 multiple-choice questions, each a complete sentence with four parts underlined and keyed to answer choices A, B, C, and D, along with an E option indicating no error. The objective for the student is to read the sentence, evaluate its structure and punctuation, detect whether an error has occurred, and identify the part of the sentence containing the error. This is a very difficult assessment for most high school students because grammar is rarely taught in the primary and secondary grades, and it is rarely taught in the context of creating and correcting sentences. Rather, most grammar instruction in this district, in my observation, has centered on identifying parts of speech and phrases rather than on creating, constructing, and evaluating sentences. The pretest and the post-test differed only slightly: the pretest had a typographical error in one sentence, and one question had two possible error answers; the post-test was corrected accordingly. The pretest was administered on September 9, 2009, and the post-test was administered on January 13, 2010.
According to the pretest data, students in both English classes generally scored very low. On the pretests, the lowest score was 3 correct of 25 and the highest was 21 of 25. For full student-by-student pretest and post-test raw scores in both classes, see Figures 1 and 2. Comparing the pretest and post-test data, most students significantly increased their scores. For student-by-student increases or decreases, see Figures 3 and 4. Between the two classes, a total of 5 students' scores decreased or remained the same; the rest all showed increases in correct answers. In English 12 Period 2, students' raw scores increased, on average, by 5.412 answers and their percentages increased on average by 21.67%. In English 12 Period 3, students' raw scores increased on average by 4.67 answers and their percentages increased on average by 18.42% (see Figure 5). As reflected in Figures 3 and 4, there is a great deal of variance among students: many of the increases were dramatic while some were modest. Overall, the increases in scores reflect an overall increase in student understanding of the concepts and rules that guide sentence structures and punctuation.
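To make the arithmetic behind these averages explicit, the following minimal Python sketch shows how the raw-score and percentage gains would be computed from 25-question pretest/post-test pairs; the score pairs in the sketch are hypothetical placeholders, not the actual values from Figures 1 through 4.

```python
# Minimal sketch of the gain calculations; score pairs are hypothetical,
# the real values come from the pretest/post-test data in Figures 1-4.
TOTAL_QUESTIONS = 25

# (pretest correct, post-test correct) for each student -- placeholder data
score_pairs = [(3, 12), (10, 17), (21, 22), (8, 11)]

raw_gains = [post - pre for pre, post in score_pairs]
pct_gains = [(post - pre) / TOTAL_QUESTIONS * 100 for pre, post in score_pairs]

avg_raw_gain = sum(raw_gains) / len(raw_gains)
avg_pct_gain = sum(pct_gains) / len(pct_gains)

print(f"Average raw-score gain: {avg_raw_gain:.3f} answers")
print(f"Average percentage gain: {avg_pct_gain:.2f}%")
```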
In all honesty, the increases in these statistical categories are the result of the following factors: students' lack of experience with this level of grammar coming into the class, practice in the rules for writing and evaluating sentence structures, and application of those rules in direct lessons on grammar, in short writings such as quickwrites and responses to reading, and in formal writing assignments (full-fledged essays on various topics).
First and foremost, it is difficult for a student to succeed on a test, especially one of a style they have never faced and for which they have not been equipped with the skills necessary to answer the questions correctly. Therefore, their scores were disproportionately low compared to anecdotal writing data, i.e., what they could actually write at the time they took this test, and compared to their normal achievement data, i.e., the grades they would normally earn on assessments and as class grades. Yet the abysmal scores were a great place to start, serving as both a source of motivation and a teachable moment.
Secondly and most importantly, the skills needed to evaluate the sentences on this test were taught in a progressive, reflexive, and comprehensive manner through practice and reflection sessions. Early in the semester, the material was taught starting with comma usage, focusing first on one basic comma skill and then moving on to another. When moving on to the second skill, practice and reflection were also done on the first. This progression continued until a total of 14 skills had been mastered in a progressive way. For example, the semester started with commas separating items in a list of particulars of three or more. The second skill was putting a comma after a transitional phrase, but on that day, the students also practiced putting commas in a list of particulars of three or more. This progression took place throughout the entire 18 weeks, at a high frequency early on and then gradually easing off to promote long-term recall. The skills taught can be summarized as follows:
- commas in a list of particulars of three or more
- commas after transitional phrases
- commas surrounding appositive phrases
- commas in complex sentences (subordinate clause starting sentence)
- commas with independent clauses linked with “and”
- semicolon use
- colon use similar to semicolon use
- colon use leading into a quote
- colon use leading into a list
- comma usage leading into a quote
- citations at the ends of sentences (APA and MLA formats)
- who versus whom
- either/neither as a subject of a sentence
- apostrophe use
During the practice sessions, too, the concepts were presented with a combination of abstract "recipes," as I called them, coupled with concrete examples. The sessions always included the creation of sentences based on the concepts as well as the correction of incorrect sentences. In addition, when students corrected sentences during a lesson, they had to name the rule being applied. Because this was done using an LCD projector and students' own papers and writing, feedback was immediate and informative.
Lastly, the increases should also be attributed to the explicit requirements, in short and long assigned writings throughout the course, for specific and measurable uses of the rules in student writing. For example, when students wrote the compare/contrast essay on Anglo-Saxon and Medieval literature and heroes, they were explicitly instructed to use a certain number of semicolons, complex sentences, lists, and transitional phrases. Sometimes the writings were done in class as quickwrites and journal entries, but even those usually carried a set of specific language conventions-based requirements. For example, students wrote a quick paragraph on the King Arthur stories but had to include four transitional phrases in that paragraph.
All in all, the results here reflect that learning takes place through repetition, through spacing that repetition out over time, and through pairing abstract rules with concrete examples. The results also reflect that by requiring the same skill sets across a variety of outlets, i.e., practice sessions, formal writings, projects, and other assignments, a skill is reinforced and learned for the long term.
ETS Criterion Writing Data
In Fort Wayne Community Schools, English teachers have access to an invaluable writing resource in ETS Criterion, a web-based online writing assessment tool. ETS Criterion has writing prompts for all grade levels, all the way up to the "College Second Year" level. For an overview of how Criterion assesses student writing, the following webpage is available: http://www.ets.org/criteriontour.html . The administration and data of the tests are discussed next.
In August, students were assigned in Criterion to write for a prompt called "Multi-Media Teaching," and the grading rubric was set to the "College Second Year" level. This was done for two reasons. First, results from this test had, generally speaking, been reliable throughout the years in reflecting an accurate snapshot of students' ability to write and formulate essays. Usually, what was reflected in the written responses to the prompts showed up in students' daily and formal writing. These issues varied greatly but included, among others: the ability to start an essay at all, organization issues, answering the prompt directly, language conventions errors, homophone errors, and speed of writing. So the data in this type of report informs about students on an individual basis, giving room for differentiation. The data is also fairly accurate in giving holistic scores on the six-point rubric option. The second reason for setting the rubric to the "College Second Year" level is my objective that even the most advanced students receive scores that force them to improve. If a student gets a 6 on the initial writing test, they may develop a false sense of superiority or mastery of the content, which could lead them not to focus on the instruction that leads to improvement. Therefore, I set the level high to force advanced students to advance.
There are some issues with the data sets about to be presented. First, the data is incomplete: some students did not post-test, and therefore a second data report was generated using a combination of post-test data and most-recent-assignment data (the latter being a Shakespeare Essay prompt, also set to "College Second Year"). In addition, a handful of students moved into alternative education programs or transferred to other schools, so their data appears in the initial data sets but not the final ones (these were not the students who scored 1 or 2 on the pretest, though). Lastly, Criterion sometimes does not assign a score to an essay because its algorithms are not equipped to compare the essay at hand to its database of essays. Anecdotally, having evaluated the non-scored essays, I would propose that including that data would actually skew the improvement in scores upward compared to the hard data currently at hand. These cases explain why the second sets of data do not have the same total number of essay scores as the initial pretest. Therefore, even with these issues, the data is compelling and informative about student improvement overall, though not conclusive.
Data Sets and ETS Criterion
For both English 12 classes, the Criterion essay "Multi-Media Teaching" was administered on September 1, 2009, and the final exam prompt on "Emailing and Texting" was administered on January 13, 2010. Both tests had a three-day window to allow extra time for students with individual education plans that required it. In addition, both prompts were set to "College Second Year" as the default evaluation level.
That being said, three sets of data are presented here to give a more comprehensive picture than the pretest and post-test data alone. Due to absences and other academic circumstances, some individuals were not able to complete the final exam but were able to complete the assignment just before the exam, an essay about a Shakespearean play for which the rubric was also set to "College Second Year." Therefore, the data sets presented are meant to show a trend rather than a specific, student-by-student correlation, although the overall data is fairly compelling.
Figures 6, 7, and 8 present the English 12 Period 2 pretest, post-test, and most-recent-assignment data. The graphs show the frequency of scores at the six levels that Criterion assesses. Levels 1 through 3 represent shades of performance below the acceptable grade-level standard. Levels 4, 5, and 6 represent the different levels of passable essays, ranging from at the standard to excellent. As shown in Figure 6, the frequencies of level 1 and level 2 essays are 2 each, levels 3 and 4 occurred 18 times, and level 5 occurred 10 times in the English 12 Period 2 class. A similar distribution of scores is reflected in Figure 9 for the English 12 Period 3 class. These figures (6 and 9) represent the initial writing scores for both periods.
In comparison, the post-test Final Exam scores are reflected in Figures 7 and 10 respectively. The issue with this data, though, is that the number of students who completed this assignment is significantly smaller than the number who pretested, because many students were making up other parts of the exam or were absent on the day of the assignment. Nonetheless, the frequencies of scores of 4 and 5 are significantly higher on a per-student basis than on the initial pretest. This raises the question of whether the data is skewed upward: the students who completed the prompt were able to accomplish the task, while those who may have been behind, and whose data is not present, would presumably have scored lower. In this case, though, the students who did not take this prompt were balanced between those who normally achieve at high levels and those who normally achieve at or below grade level. In fact, some students who would be considered low achievers wrote for this prompt instead of completing the major project; conventional wisdom would hold that their scores would have been lower. While the comparison of Figures 6 and 7, as well as Figures 9 and 10, suggests an upward shift in student writing abilities, the data is still inconclusive.
To get a more accurate reflection of improvement, comparing Figures 6 versus 8 and Figures 9 versus 11 is more appropriate. The data in Figures 8 and 11 represent composite score reports drawn from either the Final Exam or the Shakespeare Essay assignment; in both cases, the grading rubric was set to "College Second Year." If students completed the Final Exam, that data was included; if they did not, the Shakespeare Essay was used as the determining data set for that student.
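As a sketch of how the composite data set was assembled (Final Exam score when available, otherwise the Shakespeare Essay score), the hypothetical Python fragment below illustrates the selection rule; the student IDs and scores are invented for illustration only.

```python
# Sketch of the composite-score rule: use the Final Exam score when present,
# otherwise fall back to the Shakespeare Essay score. All data is hypothetical.
final_exam = {"s01": 5, "s02": None, "s03": 4}   # None = did not complete the exam
shakespeare = {"s01": 4, "s02": 3, "s03": 5}

composite = {
    sid: final_exam[sid] if final_exam.get(sid) is not None else shakespeare[sid]
    for sid in shakespeare
}

print(composite)  # {'s01': 5, 's02': 3, 's03': 4}
```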
Comparing the pretest data for both class periods (Figures 6 and 9) with the composite data for both class periods (Figures 8 and 11) reveals a general upward shift in the distribution of scores. While this comparison probably would not pass a test of statistical significance, coupled with the upward trend in the language conventions test data as well as anecdotal evidence from student projects and in-class writing assignments, this data does point to improved writing for these students.
In fairness, in future attempts to track and record this data, more accuracy has to be a focus. Nonetheless, the composite work of the students in this class, their test data, and the work in their projects and writing throughout the course certainly reflect significant improvement in student skills in their writing despite the holes in the data. Yet as a professional, I am still disappointed with the lack of scores of 6.
In retrospect, the use of the Criterion data was effective in driving individual instruction and served as an accurate representation of student abilities. This serves as a positive and effective teaching practice for evaluating and driving instruction towards improving students’ writing.
From an anecdotal perspective, the pretest essays, across the board and regardless of score, were significantly inferior to the essays written after the instruction was administered, whichever assignment was reflected in the report. For example, students' final essays were significantly longer than their first. Paragraphs were longer and contained more sentences, both complex and simple. Sentences showed greater variety of structure, with transitions, appositive phrases, semicolons, colons, lists of particulars, and the integration of research and citations (a requirement for the Final Exam prompt as well).
As it stands, the improvement in writing skills can be directly attributed to quality instruction and student engagement. The exercises used to improve the language conventions test results were also directly tied to creating and writing varied and complex sentences. The added dimension of paragraphing, and the instruction dedicated to organizing a paragraph and an essay accordingly, led to improved writing skills. For example, body paragraphs and stand-alone paragraphs were taught with the mnemonic that "paragraphs are sandwiches to be served with bread, meat, cheese, and bread," which students learned effectively and logically. Additionally, formulas for introductions (attention-getting device, relating step, preview step, thesis) and conclusions (restated thesis, recapitulation of main points, and closing statement) also positively affected student learning because of the practice and repetition. While some may question the narrow focus of these formulas, two pedagogical considerations apply: you have to know the rules to break the rules, and it is better to master a few things perfectly than to do many things poorly.
Lexile Data Results
Fort Wayne Community Schools requires all language arts teachers to administer the SRI (Scholastic Reading Inventory) test to all students enrolled in an English class at the beginning of every new year and new semester. The SRI is a computer-based test in which students read passages and answer questions that require them to choose answers based on their understanding of the reading vocabulary. The test yields a score from 0 to 1700+ that reflects the student's reading vocabulary level. While the scores generally represent a range and are only loosely tied to grade levels, they do a fair job of reflecting a student's current reading vocabulary level. There have been exceptions when students have scored significantly too high or too low because of a bad test day (several factors could cause this), but the scores are certainly important for driving instruction, determining developmentally appropriate texts, and identifying students who will need reading support and strategies when faced with various reading materials.
Fort Wayne Community Schools does not, on the other hand, require a second administration of the test toward the end of a course, but in my English classes I re-administer the test to track student growth or maintenance. Therefore, my students tested in August 2009 and January 2010, at the respective beginning and end of the semester. Due to transfers and absences, four students across the two class periods did not test a second time, and their data is therefore omitted. Also, in the interest of privacy, students' names are not listed with the data sets.
Before discussing the data, some points need to be made about Lexile data and how this data has been disaggregated. For the sake of this study, an increase or decrease of more than 50 points is considered significant; changes within that 50-point range represent no significant change. Finally, students who pretest at a level of 1200 or higher are considered high-level pre-testers. Consequently, students who already test at high levels may show disproportionate drops in growth because of the high initial score, especially if another early test sample is not taken.
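To make the disaggregation rules concrete, here is a minimal Python sketch of the classification just described; the Lexile score pairs are hypothetical and only illustrate how each student's change would be labeled.

```python
# Sketch of the Lexile disaggregation rules described above; scores are hypothetical.
SIGNIFICANT_CHANGE = 50    # points beyond which a change counts as significant
HIGH_PRETEST = 1200        # pretest level considered "high"

def classify(pre: int, post: int) -> str:
    """Label a student's Lexile change using the study's thresholds."""
    change = post - pre
    if change > SIGNIFICANT_CHANGE:
        label = "significant increase"
    elif change < -SIGNIFICANT_CHANGE:
        label = "significant decrease"
    else:
        label = "within range"
    if pre >= HIGH_PRETEST:
        label += " (high pre-tester)"
    return label

print(classify(1250, 1150))  # significant decrease (high pre-tester)
print(classify(900, 980))    # significant increase
print(classify(1000, 1030))  # within range
```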
The results are as follows. In English 12 Period 2, 13 students increased significantly, 11 stayed within their range, and 4 decreased significantly; of those 4, 3 are considered high-level pre-testers (see Figure 12). In English 12 Period 3, 9 students increased significantly and 16 stayed within their range (see Figure 13). Only one student's score decreased significantly, and that student was a high-level pre-tester.
The increases in Lexile scores in both classes are significant. In addition, the high percentages of students who stayed within their range, as well as of those who pretested at a high score, reflect that both classes had significant groups of students who were already reading at a high level. The greater improvement in Lexile scores in English 12 Period 2 versus English 12 Period 3 also reflects that Period 3 came in with stronger Lexile scores and therefore had less room for improvement. The same pattern appeared in the Language Conventions Test data (see Figures 1 through 5).
The underlying argument is that student achievement and learning improve because of effective, data-driven instruction. If teachers can get an accurate profile of students' abilities and they know what the final reflection of students' learning should look like, they can effectively create the roadmap for those students' improvement. The data presented here on student writing, understanding of language conventions, and reading reflects this relationship.
Lastly, I want to mention some important realizations that come with having experienced this type of success. First, this type of success is rare, and I humbly credit the students for their willingness to buy into what I was trying to teach them and for maintaining a positive attitude and rapport with me. I am thankful that they listened to me and learned from me. Secondly, at Fort Wayne Community Schools we have some resources that other schools simply do not have, and in this case those resources were allocated and completely focused toward specific learning goals. In addition, this experience also reflects three years of working, reflecting, gathering resources, figuring out how best to use those resources, narrowing and maximizing the classroom focus and experience, and putting together a cohesive plan of action to give these kids a great English 12 classroom learning experience. Finally, I would like to thank the administration at Elmhurst High School for giving me the resources and the pedagogical freedom to teach with courage, work creatively, and think outside the box.
On a final note, I do regret that my discussion of actual teaching practice was as brief as it was. Teaching to help these kids succeed at these levels involved far more effort and complexity than one would guess from the scant writing here on the classroom experiences. It was about knowing every single one of my students from a relational standpoint as well as a professional one. I also regret that I did not discuss the role of Quadrant D lessons, as they played a part in producing these achievement results. Between the Quadrant D lessons and the daily norms and routines of practice and perfecting, there is enough material to serve as its own written reflection. This analysis was specific to the before-and-after effects and data of an effective classroom, not the roadmap for getting there. Finally, if the essay "Toward a Data-Driven English Classroom" was an overview of what I was doing or what I was going to do, the outcomes in this essay reflect the results of doing those things.