FROM DIAGNOSIS TOWARD ACADEMIC SUPPORT: DEVELOPING A DISCIPLINARY, ESP-BASED WRITING TASK AND RUBRIC TO IDENTIFY THE NEEDS OF ENTERING UNDERGRADUATE ENGINEERING STUDENTS

This paper reports on the central role of disciplinary (engineering) criteria in the development of an ESP-based diagnostic writing task and rubric, used to identify entering undergraduate engineering students in need of academic support. In this mixed methods study, Phase 1 investigated the usefulness of a generic writing task and analytic rubric used for the diagnosis. Phase 2, informed by the results of Phase 1, focused on the development of an engineering writing task. The outcomes of the two phases were merged to develop an engineering ESP-based writing task and rubric, informed by a) the collaboration of language/writing experts and engineering stakeholders, and b) criteria indigenously drawn from the engineering community of practice. The study supports an academic literacies approach in diagnostic assessment (rather than a generic, one-size-fits-all, ‘academic literacy’ approach), and suggests that the demands of university study are best viewed as the practices of disciplinary communities of practice. The paper provides evidence of the increased meaningfulness and usefulness of a disciplinary, ESP-based approach in diagnosing need for academic support.


INTRODUCTION
The first-year experience has become the focus of international research on attrition and retention in tertiary education, as significant numbers of students continue to drop out of their first-year undergraduate programs (e.g. Browne & Doyle, 2010; Fox, Haggerty, & Artemeva, 2016). Early intervention and support for students who are struggling with the demands of a first-year program have been shown to make a meaningful difference in their ultimate academic success (Read, 2016a). An increasing number of universities offer such support, for example, in academic success or writing centers; however, the generic nature of these pedagogical initiatives has been questioned, especially in the Canadian university context (e.g. Fox, Haggerty et al., 2016; Fox, von Randow, & Volkov, 2016).
Vol. 5(2)(2017): 148-171
In Canadian universities, students tend to select their majors at entry and begin their degree programs at the outset of their first year. For example, in professional programs such as engineering, students begin courses specific to their degree immediately. Thus, in this context, requirements for disciplinary literacies (e.g. Lillis & Scott, 2007; Street, 1999, 2010) become particularly pertinent, necessitating pedagogical responses that are informed by both disciplinary and language/literacy expertise.
Further, given Canada's two official languages (English and French), a large population of speakers of other languages due to high numbers of immigrants throughout Canada's history, and a growing number of international students, English-medium university classrooms are extraordinarily diverse: culturally, educationally, and linguistically (e.g. Artemeva & Myles, 2015; Fox, Cheng, & Zumbo, 2014).
The research literature suggests that disciplinary literacy and academic resources are key variables in retention and program completion (Fox, 2005; Meyer & Land, 2003). English as a Second Language (ESL), English for Academic Purposes (EAP), and English for Specific Purposes (ESP) courses in colleges and universities are designed to increase the language proficiency of students who speak English as an additional language (EAL), and who do not meet English language requirements. ESP courses (unlike the more general language proficiency approaches of ESL, or the general academic language proficiency/academic literacy approaches of EAP) have typically addressed the disciplinary and professional language needs of EAL students whose goals are to enter and participate in their respective disciplinary communities of practice (Wenger, 1998), or CoPs. But, even though these courses provide useful support, Feak (2016: 493) laments that they "target lower-level students" only (in terms of language proficiency), and that the students who are "deemed to have a high or a high enough level of English proficiency" − and, we may add, speakers of English as a first language − "may have limited, or perhaps no, access to […] support". In response to these concerns, ESP has been expanded to a consideration of "the specific communicative needs and practices of particular social groups" (Hyland, 2007: 391), thus addressing the needs of not only EAL students but also their first-language (L1) counterparts, who must also develop specialized disciplinary and professional varieties of English (cf. Douglas, 2013; Feak, 2016) in order to participate in the literate activities of their CoPs. As Prior (1998: 32) observes, such literate activities are "central to disciplinary enculturation [...] for foregrounding representations of disciplinarity, and for negotiating trajectories of participation in communities of practice". However, the literature on writing assessment practices at admission (e.g. Read, 2016a) suggests that disciplinary dimensions are rarely considered when it comes to creating tasks and rubrics. Rather, the emphasis is placed on general academic writing proficiency/academic literacy (e.g. language control, organization, broad rhetorical patterns).
In this paper, we draw on the expanded view of ESP in our consideration of a diagnostic writing assessment, developed in response to the needs of all first-year students (regardless of language background) in a Canadian undergraduate

As with other PELA initiatives, diagnosis and intervention have been primarily focused on increasing a student's overall facility with academic language by targeting "reading, writing, listening skills, but often [...] [including] measures of language knowledge (vocabulary or grammar items [...]) which are seen as adding diagnostic value to the assessment" (Read, 2016b: 6). The emphasis on generic academic literacy/language is also clear in guidelines for raters, which define scores or levels in order to characterize a test-taker/student's performance, behavior, or work (Cheng & Fox, 2017: 123-128) (i.e. PELA rubrics/rating scales). For example, the DELNA rating scale, which is applied to the task presented in Figure 1, includes fluency (defined by criteria relating to organization, paragraphing, length, general academic style); content (defined by criteria relating to trends in the graph, their detailed interpretation, and implications); and form (defined by criteria relating to grammar, syntax, vocabulary, and spelling). The criteria correspond to a scale from four to nine, with students scoring at four

However, as Feak (2016: 493) has noted, "the link between language proficiency and academic success is tenuous"; after all, retention is an issue in the first year of undergraduate study, regardless of a student's language background (Browne & Doyle, 2010; Fox, Haggerty et al., 2016). Read (2016b: 6) acknowledges Feak's concern, in the DELNA/New Zealand context, remarking that given the diverse linguistic, social, and cultural backgrounds of post-secondary students, we can no longer assume that all first-year students "have an acceptable level of academic literacy". A number of researchers (cf. Fox et al., 2014; Noceti, Chacón, Chiarella, & Erbetta, 2017) have argued that the type of specialized academic support available to EAL students in EAP and ESP courses should be available to all students. For example, in the context of undergraduate engineering, which presents "extremely challenging" disciplinary issues to entering students, Noceti et al. (2017: 1) point out that "academic literacy in the mother tongue is similar to learning a foreign language as it involves immersion in a new culture". If we are to address issues of retention and academic success, we should develop a better understanding of the needs of all entering students and develop diagnostic assessment procedures that yield specific information regarding key indicators of risk (e.g. academic aptitude/readiness [Strayhorn, 2013], threshold concepts [Meyer, 2010]). We argue that such indicators should reflect practices, expectations, and academic literacies that are specific to a student's target discipline (in the case of the present study, the first year of undergraduate engineering). As Jacoby and McNamara (1999: 224) put it, "the development of professional" − and we add, disciplinary − "competence is but a specialized form of socialization, a general social and interactional process long recognized as the vehicle through which culturally specific knowledge, language, discourse, cognition, skills, and practice are transmitted and developed".
In other words, diagnostic assessment should consider communication as "an indigenous problem for members of some specific culture to grapple with when confronted with anyone's performance (novice or experienced, native speaker or non-native speaker)" (Jacoby & McNamara, 1999: 224). It follows that in order to assess student readiness to engage in the work of a discipline, it is necessary to identify criteria specific to the disciplinary domain, or "indigenous" criteria. These criteria will be most useful if they are drawn from the disciplinary (indigenous) CoP, and include not only "the specialized vocabularies, concepts and knowledge" of the academic discipline but also "accepted and valued patterns of meaning-making activity (genres, rhetorical structures, argument formulations, narrative devices, etc.) and ways of contesting meaning" (Murray, 2010: 351).
As Alderson (2007: 38) points out, "much clearer thinking is needed [...] to define what we need to know in order to be able to provide adequate diagnoses for learners, on which useful feedback and advice can be given". Relationships between analytical rubrics and useful feedback in support of learning are well-documented in the literature (e.g. Hack, 2015; Sluijsmans, Brand-Gruwel, & van Merrienboer, 2002). In the context of a diagnostic assessment like the one considered here, it is essential to develop and validate rubrics, which provide substantive analytic descriptors of key indicators of test-taker academic potential. These descriptors are the primary source of feedback for subsequent academic support. Below, we report on the increased meaningfulness and usefulness of a disciplinary, ESP-based approach in diagnosing risk through the development of an indigenously drawn diagnostic writing task and rubric.

RESEARCH DESIGN
The research reported here is part of a longitudinal mixed methods study with a multistage-evaluation design (Creswell, 2015: 47). It is multistage in that each stage is comprised of one or more phases, which nonetheless share a common purpose. In this paper, we present the foundational stage of the study, comprised of two phases, which led to the development of an indigenously drawn (cf. Jacoby & McNamara, 1999) ESP-based diagnostic writing task and rubric. In Phase 1, we investigated the meaningfulness of the generic DELNA writing task and analytic rubric as a source of feedback to support the learning of individual entering undergraduate engineering students. Informed by the results of Phase 1, in Phase 2 we developed a disciplinary engineering writing task but continued to evaluate responses with the generic analytic rubric. Subsequently, we merged findings from Phases 1 and 2 to design and evaluate the usefulness of an indigenously drawn ESP-based writing task and rubric.

Phase One
Having received approval from a university Research Ethics Board, we first examined the usefulness of the DELNA (generic) task and analytic rubric in identifying entering undergraduate engineering students in need of additional academic support.

Participants
Twenty-eight participant-raters (Figure 2) were recruited for Phase 1 of the study:
• Group 1: 17 raters had language/writing studies backgrounds (including three engineering communications course instructors) and were trained online and certified as DELNA raters (see www.delna.auckland.ac.nz/ and [Elder et al., 2007]);
• Group 2: 11 raters had engineering backgrounds, representing different engineering-related CoPs, including two practicing engineers, one Associate Dean of the Faculty of Engineering, five engineering professors, and three engineering communications instructors with engineering backgrounds. These 11 indigenously drawn raters were also trained to use the generic DELNA rubric.

Instruments and procedures
The participant-raters were asked to use the generic DELNA rubric in their assessment of ten samples, which had been randomly selected from 103 test responses of entering first-year engineering students to the generic DELNA writing task. When the participant-raters had finished assessing the samples, they were interviewed individually in semi-structured interviews, which asked them to explain their assessment of each sample, identify qualities they observed in the student writing as strengths and/or weaknesses, and relate these to the DELNA analytic rubric.

Analysis
Interviews with the participant-raters were audio recorded, transcribed, and analyzed through qualitative thematic coding using a constant comparison method (e.g. Saldaña, 2016). In exploring the responses of the participant-raters to the ten writing samples, we were interested in what repeated and what differed (cf. Paré & Smart, 1994) across the responses of raters with non-indigenous, language/writing studies backgrounds and those with indigenous, engineering backgrounds.
To evaluate the meaningfulness of the assessment results, a case study approach (Yin, 2009) was used to investigate how students' performances on the generic DELNA writing task related to their subsequent academic performance in engineering courses. In other words, we asked the question: were the inferences drawn from the student performances on the writing task indicative of academic performance in the first term of undergraduate study in engineering?
After the ten students, whose sample responses to the generic DELNA writing task had been previously assessed by the participant-raters, had completed their first-term courses and their final marks had been submitted, we undertook another series of semi-structured interviews, this time with their engineering communications course instructors. In the interviews, we asked the instructors to describe the students' actual course performance. The interviews allowed us to evaluate the quality of inferences drawn from a diagnosis of at-risk (or not-at-risk) in writing for engineering purposes in relation to actual performance in the communications course. In addition, we examined the course outcomes (e.g. grades, withdrawals, failures) in all of the courses they took during their first year of study (i.e. mathematics, introduction to engineering, physics, chemistry, electives). We focused on the following questions: Were key indicators well identified by the generic diagnostic writing test results? What was the relationship between rater background, test results, and student course performance?

Phase One findings
Analysis of the responses provided by the raters across Groups 1 and 2 indicated a satisfactory level of agreement between the groups, regardless of the participant-rater's background, with exact agreement of 80%, and classification agreement (i.e. at-risk or not-at-risk) of 90%. However, as McNamara (2000: 37) points out, in assessment, "there is as much variation among raters as there is variation between candidates". In spite of the overall consistency of raters' responses, we suspected there might be varying interpretation of the analytic criteria in relation to a participant-rater's disciplinary background and expectations. In order to explore what differed across the two groups, we interviewed each of the 28 raters about their assessment decisions. Table 1 provides an overview of the responses of the two groups of raters to the ten cases.
The semi-structured interviews revealed significant differences between the two groups of raters. The raters in Group 1 (with language/writing background) focused on the specific marking criteria in the rubric and referred frequently to it in order to approximate the generic DELNA rubric's benchmark levels from four to nine. For example, they paid explicit attention to expectations for paragraphing (i.e. one paragraph for each of the three prompts identified in the task directions as in Figure 1), and awarded higher marks for length requirements (200-250 words, as spelled out in the task instructions and the rubric). They expected (as the rubric indicated they should) that writers would use or discuss the figures in their responses and awarded higher marks for accuracy of detailed descriptions of the information in the graph. They did not refer to the target domain, namely, undergraduate engineering; rather, when context of use was taken into account, it was a general academic literacy/language context or an academic writing context, as evidenced by the comments of two raters below:
I've had students like this, you know. They work so hard but they really need more help in writing and language than we can possibly give them. I'm working with over 150 students in three sections [classes]. It's just impossible.
When I look at this writing, it reminds me so much of a student in my class who has exactly this same set of language issues.
The comments of the raters with language/writing backgrounds differed dramatically from those with the indigenous (engineering) backgrounds (Group 2).
The engineering raters interpreted all of the analytic criteria in relation to their expectations of writing for engineering purposes in undergraduate courses (cf. Curry, 2014; Winsor, 1996). For example, although they made passing references to language features and took them into account, they rejected the rubric's advice regarding paragraphing and length because the rubric contradicted effective writing practices in engineering (cf. Artemeva, Logie, & St. Martin, 1999). One of the engineering professors commented:
The paper that I thought was written by an extremely proficient [writer] is the shortest paper in the whole lot − only one page. It's not the greatest paper in the world, but it's coherent . . . the content is presented very succinctly. The second task is not paragraphed and it all makes sense.
Another engineering professor found the generic DELNA rubric did not sufficiently account for a key underlying problem in the response of an at-risk student:
Well, one very common thing that really hit me very hard is that this person . . . is so confused. This is really not a language thing. I'm getting some of this with my third-year students. This person . . . uh . . . they would confuse what the coordinate is and what the value is and those are, well . . . probably not a writing thing . . . probably a thinking thing . . . [the student] just [lacks] the ability to get information; to get information in one form and process it in another form. They process and arrange it but it is not logical.
An engineer who was teaching the engineering communications course at the time of the study pointed out the same issue, stating, "This isn't a writing problem; this is a thinking problem".
In sum, the most striking difference between the responses of the raters with language/writing backgrounds and the raters with indigenous backgrounds was their valuing of domain-specific requirements for engineering writing. The two groups of raters had differing understandings of what was appropriate and what would work in terms of content, organization, disciplinary rhetorical expectations, emphasis, logic, and so on. In other words, their interpretations of how writing is structured or shaped in response to an engineering context of use differed sharply. Because the indigenous raters drew directly on their experience with disciplinary engineering writing (the target domain) (cf. Artemeva, 2008; Artemeva & Fox, 2010; Fox & Artemeva, 2011; Hyland, 2012), they had difficulty applying some of the analytic criteria defined in the generic DELNA rubric. They were also critical of the generic DELNA task. One engineering professor, relating the DELNA task to those undertaken by engineering students in the first-year classroom, spoke from the perspective of the engineering CoP (see the use of "we" in the excerpt below) in criticizing the task:
I think the way the first task is presented . . . "describe the information in the graph" is not helpful at all . . . because nobody describes graphs. This is not how we read a graph. So
The Associate Dean of the Faculty of Engineering made a similar comment responding to a histogram included in the generic DELNA writing task, also speaking as a member of the engineering CoP: "we don't . . . I mean what can you say about this graph? It's so simple . . . yes, I know some students get it wrong, but there's really so little to say about this. It's . . . it's just not complex enough." Yet another engineering professor noted, again in reference to the expectations of engineering writing, "We don't do histograms!" Further, a number of raters with engineering backgrounds objected to rewarding extended length and multiple paragraphs as evidence of effective writing (as defined by the DELNA rubric). As one of these raters remarked, "concise, clear, to the point − that's what we want in engineering − not a bunch of details".
Elder and McNamara (2016: 153) point out that incorporating "insights from domain experts into how they view communication in real world settings is recognized as an important authenticity consideration in the development of criteria to assess language proficiency for specific academic or occupational purposes" (cf. Curry, 2014; Gimenez, 2014; Winsor, 1996). The results of the study support Elder and McNamara's observation. However, had we relied on the high level of overall agreement alone, and not systematically investigated the raters' responses to the students' writing through one-on-one semi-structured interviews, we might have missed key underlying differences between the groups of raters with language/writing backgrounds and those with engineering backgrounds.
Subsequently, through the analysis of the ten randomly selected student cases (Table 2), we were able to evaluate the meaningfulness and appropriateness of the "at-risk" designation.
After the end of the first term, we interviewed four engineering communications course instructors who had taught one or more of the ten students.We were particularly interested in how these ten students, some of whom both groups of raters had deemed to be at-risk, fared in the first term of their engineering program.Two of the students identified as at-risk (cases 9 and 10, Table 2) dropped the program without completing their first courses, one of them after two months (case 9) and the other after only a month of instruction (case 10).
The engineering communications course instructors who were interviewed noted that both students had challenges meeting the demands of the courses, either needing so much individual help that the instructor could not respond intensively enough to support the student, or leaving, as one of the instructors noted, "after failing the first assignment".
Further, other cases (for example, case 3, Table 2) allowed us to focus our attention on key variables that were not measured by the writing task, but clearly had an impact on a student's academic success. For example, although both groups of raters assigned a six to case 3's writing (i.e. weak writer), they did not consider the student to be at-risk, and yet, as Table 2 reports, this student failed. The interviews with the student's engineering communications instructor suggested that this student had issues other than writing alone, namely, a lack of motivation (e.g. "no effort"; "missed many classes") and poor comprehension of what was expected and how to meet those expectations (e.g. "comprehension very poor") − once again indicating that it is not language/writing proficiency alone that affects student success at university. As an engineering professor had observed, "students often fail because they do not understand what is being asked of them". Subsequently, close examination of the communications instructors' comments confirmed that in addition to language/writing proficiency and understanding academic expectations, a student's motivation and persistence were also key variables in a student's ultimate success.
In a discussion of other students who were struggling to meet the expectations set by the engineering communications course instructors, one of the instructors, who had been teaching in the program for three years at the time of the study, described students who "just didn't get it"; who needed "lots of intervention" (as in case 7, Table 2).The instructor lamented that "they need so much help" and complained that, given the size of the classes she was teaching, and the number of classes she taught concurrently, she simply did not have the resources necessary to help such students.She noted that students fail every year, because, after exhausting the intensive support she could provide for such students, she "had no place to send them; they had no place to go" as supplementary access to domain-specific, engineering-relevant support was limited at the time.
In the section below we present Phase 2 of the study, which focused on the development of an engineering writing task, which, we expected, would provide a better, domain-relevant alternative to the generic DELNA writing task.

Phase Two
In Phase 2 of the study, we collaborated with engineering stakeholders and developed an engineering writing task.To evaluate the usefulness of the engineering writing task, we administered it alongside the generic DELNA writing task and applied the generic rubric to both.We analyzed the outcomes of the writing tasks by eliciting stakeholder comments.

Participants
Our participants in Phase 2 were drawn from differing engineering-related CoPs at play in the context of university engineering:
• Sixty-three engineering students at the end of their first year of study
• Two engineering communications course instructors with language/writing backgrounds

Instruments and procedures
First, in order to develop the domain-specific engineering writing task, we asked the engineering practitioners to provide us with a topic and corresponding graph for an engineering task based on the DELNA model (i.e. written interpretation of information presented in a graph or graphs). The task proposed by the engineering practitioners required students to explain the relationship between forward-directed force (thrust, measured in newtons) and speed (measured in meters per second) in an experimental car, and interpret a graphical representation of a function of speed vs. time. Second, to assess whether the engineering writing task would provide more useful information than the generic DELNA task, we administered both tasks to 63 engineering student participants, enrolled in two sections (classes) of the engineering communications course at the end of their first year. It should be noted that it is common practice in test development to "field test" new versions of tests/tasks with a range of relevant stakeholders. Only when tests/tasks are refined through this process are they administered to the target group of test takers.
The student participants in the Phase 2 field test were randomly assigned to two groups: 31 to the generic task group and 32 to the engineering task group.Test developers administered the tests, recorded field notes during administration, and elicited student responses to the tasks, after the test, by showing PowerPoint slides of the two tasks and asking students to comment on their experiences responding to them.
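The random assignment described above can be illustrated with a minimal sketch. The study does not report how the randomization was carried out, so the procedure below (shuffling a list of participant identifiers and splitting it) is only one plausible approach; the identifiers, the function name `assign_groups`, and the seed are hypothetical and serve illustration only.

```python
import random


def assign_groups(participant_ids, generic_size, seed=None):
    """Randomly split participants into a generic-task and an engineering-task group.

    Hypothetical helper: the shuffle-and-split procedure is an assumption,
    not the procedure reported in the study.
    """
    rng = random.Random(seed)          # a seeded generator keeps the split reproducible
    shuffled = list(participant_ids)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return {
        "generic": shuffled[:generic_size],
        "engineering": shuffled[generic_size:],
    }


# Hypothetical identifiers for the 63 Phase 2 student participants.
students = [f"S{n:02d}" for n in range(1, 64)]
groups = assign_groups(students, generic_size=31, seed=2017)
print(len(groups["generic"]), len(groups["engineering"]))
```

With 63 participants and 31 assigned to the generic task, the remaining 32 fall into the engineering task group, matching the group sizes reported for the field test.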
Third, we interviewed the two engineering communications course instructors, asking them to comment on the two tasks in relation to their experience with engineering students in their classes.

Analysis
Differences in overall scores by group were evaluated in relation to test administrator field notes, and student and instructor feedback on the difficulty of the two tasks.

Phase Two findings
Analysis of student writing on the two tasks (generic DELNA and engineering) suggested that neither task was perfectly suitable for diagnosing needs for academic support.On the one hand, as described above, the generic DELNA task elicited detailed descriptive writing about a histogram, which did not represent writing for engineering purposes.On the other hand, the engineering writing task was too content-intensive, eliciting more information about students' understanding of the relationship between thrust and speed in an experimental car than their ability to write academically for engineering purposes.
Students in the generic task group generally finished early, tending to respond quickly, but filling the page with descriptive details about the histogram. In contrast, students in the engineering task group complained to test administrators that they did not have enough time to finish the writing task. Their responses tended to be incomplete, and much shorter than those of the generic task group. The test administrators also noted that students in the engineering task group took much more time to think about and interpret the graph.
Analysis of the writing produced in response to the two tasks demonstrated that the generic task group performed at a significantly higher level, but students' comments and interviews with their engineering communications course instructors indicated that the task was much easier and did not reflect the target domain (writing for engineering purposes).In contrast, the student responses to the engineering writing task were either very technical or incomplete and confused.
Further, both tasks were parachuted into an engineering communications course without providing students with any background preparation for the tasks or their responses.
These findings prompted us to look for a middle ground by developing a new writing task and rubric that would take into account useful information obtained from both tasks, namely, the relevant language-related information elicited by the generic DELNA task and the domain-specific information elicited by the engineering writing task. Merging findings from Phases 1 and 2, we developed a new disciplinary (engineering) ESP-based writing task and rubric.

that is, it sets out to measure general academic writing. The task (Figure 1) is not intended to be disciplinary. In fact, in a number of respects, the DELNA task elicited responses that appeared to contradict engineering practices, within both academic and professional domains (cf. Artemeva & Fox, 2010; Curry, 2014). For example, as noted earlier, indigenous engineering raters in Phase 1 observed that histograms are not generally used in engineering, and short, concise interpretations of graphs (rather than extensive descriptive details) are germane to engineering writing. Phase 1 findings highlighted: 1) the role of raters' backgrounds in their application of the generic analytic rubric, and 2) the inappropriacy of some generic academic writing criteria in this engineering disciplinary context. Drawing on the comments of the engineering raters, we reconsidered the generic DELNA emphasis on general academic writing proficiency, and revised and expanded the criteria to include specific engineering expectations. That is, in this target domain, student writing was expected to be a) accurate in content; b) attentive to overall trends in the graph(s) (rather than presenting descriptive details); c) succinct (e.g. minimized length, paragraphing); d) logical; and e) appropriately formatted to highlight main points (e.g. point form, lists). This rhetorical representation of accurate, domain-specific content is at the core of the engineering disciplinary community's ways of being, doing, and thinking (cf. Gee, 1999) that serve as "the intellectual scaffolds on which community-based knowledge is constructed" (Berkenkotter & Huckin, 1995: 24). As such, it was clear that a diagnosis of needs for academic support in engineering writing must take indigenously drawn criteria into account.
In Phase 2, in order to develop a domain-specific writing task, we asked engineering practitioners to provide us with an engineering topic and a corresponding graph (cf. Curry, 2014; Noceti et al., 2017). Although the topic was identified and deemed to be at a level of difficulty appropriate for most entering undergraduate students, the subsequent comparison of engineering students' responses to the generic DELNA and engineering writing tasks demonstrated that: 1) the generic task was too easy and the information derived from the application of the rubric was insufficient and, at times, irrelevant in informing the diagnosis and subsequent useful feedback (Alderson, 2007) for pedagogical support; 2) the engineering task was too difficult, and the complexity of technical content undermined and distorted students' writing performance at the end of their first year of study, indicating that the task would be even more challenging for entering students. In other words, because of the content demands, information appropriate and sufficient for a substantive diagnosis was not elicited by the engineering task. Another issue identified in Phase 2 was that both the DELNA generic writing task and the engineering writing task were introduced suddenly into a communications course, without any context.
Taking into account that "students need to develop skills and strategies for communicating in an academic context according to the particular demands of their discipline and those of the profession into which they eventually hope to enter" (Murray, 2010: 352), we proceeded to develop an engineering ESP-based writing task to increase the meaningfulness of the diagnosis and the potential effectiveness of the individualized academic support it would occasion. We asked a professor teaching a first-year introductory engineering course required of all entering students to suggest potential topics for the new task. He recommended the topic of his first lecture, namely, engineering innovations, which address particular problems. In collaboration with the professor, the specific topic of the new engineering ESP-based task was identified.
Having identified the topic, we designed the new writing task informed by the findings of Phases 1 and 2. The task asked students to interpret two graphs that presented information about the selected engineering innovation. To evaluate and refine the new disciplinary ESP-based writing task, we showed a short video on the topic and administered the new task to a group of five engineering students who had previously participated in Phase 2 (2 of these students had responded to the generic DELNA task and 3 to the engineering writing task). In all respects, the new disciplinary (engineering) ESP-based writing task elicited more meaningful and useful writing for the purposes of the diagnostic assessment.² Using an engaging topic of a contemporary engineering innovation and foreshadowing the topic of the new engineering ESP-based writing task by providing students with a video allowed for better contextualization of student writing within engineering practice. As well, the focus group provided enthusiastic feedback on the topic and the students were eager to discuss the graphs with task administrators.
We next presented the findings from the comparison of the three writing tasks (generic DELNA, engineering, and the new disciplinary ESP-based task) to the Faculty of Engineering. In response to our presentation, engineering professors recognized the importance of embedding the diagnostic task in a discipline-specific engineering context. They agreed that the topic of the new engineering ESP-based writing task should be introduced in the first lecture of the required first-year course, along with the video, and that students would be informed that in the follow-up laboratory class, they would be asked to write about the topic. Then, in the first laboratory class, the diagnostic task would be administered to all students enrolled in the course.
Once the procedure was established and approved by the Faculty of Engineering, the new task was administered to the cohort of 1,500 entering first-year engineering students. The subsequent analysis of the students' responses to the new engineering ESP-based writing task demonstrated that situating it within a required first-year course increased the meaningfulness of the entire diagnostic assessment procedure:
• Embedding the topic in the mandatory engineering course, rather than presenting it as an ad hoc add-on unrelated to the students' degree program interests, increased its meaningfulness for students. By situating the task within the context of the required engineering course, playing a related video, providing additional information on the topic, and announcing that students would need to write about the topic in their first laboratory class, the engineering professor legitimized the task for the students. They took the task more seriously, and the resulting writing was more interpretable for diagnostic purposes because it was an instantiation of disciplinary engineering practice;
• The new procedure incorporated in the diagnosis an academic listening component, arising from the professor's lecture and the video, and an indirect measure of students' comprehension and academic study/information search skills. For example, well-prepared, academically "savvy" students (Schryer, Lingard, & Spafford, 2005: 234) would typically access additional information in advance of the subsequent laboratory class, because they knew that they would be required to write about the topic;
• When students arrived in the laboratory class, they received a diagnostic assessment booklet that refreshed their memory of the lecture, video, and other resources by providing a brief reading about the topic. Students were then asked to compare and interpret two graphs, which represented additional information on the topic. Thus, drawing on the background provided within the context of the lecture, students responded in writing, much as they would do in any engineering academic course.
In sum, findings from Phases 1 and 2 informed the development of a disciplinary ESP-based writing task and analytic rubric. The rubric included specific engineering-related criteria for accurate content, engineering disciplinary rhetorical expectations, and logic. At the same time, the rubric retained useful criteria relating to language (e.g. grammar, spelling) from the generic DELNA rubric (see Fox, von Randow et al., 2016). Incorporating indigenous criteria specific to this disciplinary context increased the usefulness and meaningfulness of feedback from the diagnosis in moving toward individual academic support for the entering undergraduate engineering students considered in this study.

CONCLUSIONS AND IMPLICATIONS
In this paper, we have reported on part of an ongoing longitudinal mixed methods study of the design and implementation of a diagnostic writing assessment task and rubric in an undergraduate engineering program. Contemporary ESP-based approaches consider disciplinary English as a specialized variety of academic English that all students, regardless of their language backgrounds, need to develop in engaging with a discipline (cf. Artemeva & Fox, 2010; Artemeva et al., 1999; Conrad, 2017; Noceti et al., 2017). The disciplinary, ESP approach described here is of particular importance to entering undergraduate students, especially in

Figure 1. Example of the DELNA writing task (from the diagnostic test)

You have 30 minutes to do this task. You should write between 200 and 250 words (approximately 1½ to 2 pages). All sections are of equal importance.

Tourism in New Zealand

The graph below shows the number of tourists arriving in New Zealand from 1983 to 2007. Write an academic essay in which you will:
• Describe the information given in the graph.
THEN
• Suggest reasons for the trends.
AND
Either
• Discuss the impact of tourism on the economy and the environment in New Zealand.
Or
• Discuss the impact of tourism on the economy and environment in your own country.

NZ Government Statistics

Some interesting facts:
1991: First year of the New Zealand Tourism Board (an organisation to promote tourism in New Zealand)
1997: Asian financial crisis
1999: Clean Green New Zealand promotion by the New Zealand Tourism Board
2004: Fast economic growth in China

considered at-risk and those scoring above seven considered particularly adept in academic writing.

Figure 2. Rater background in relation to Communities of Practice (CoP)


Seven engineering practitioners
• one practicing engineer
• three engineering professors
• two engineering teaching assistants
• one fourth-year undergraduate engineering student

4. MERGING FINDINGS FROM PHASES 1 AND 2: DEVELOPING AN INDIGENOUSLY DRAWN ESP-BASED WRITING TASK AND RUBRIC

As noted above, in most contexts, DELNA diagnostic writing tasks are only administered to those students who test below a cut-off on the computer-based screening test. The generic DELNA writing task operationalizes a PELA construct,

Table 1. What repeats across the two rater groups: Decision by CoPs (N=28)

, for example, writer number 4 . . . I had a really good impression . . . this is somebody who already understands what graphs are about, and what it is you look for in a graph. Rather than describing every single detail, the main trend is picked up right away and presented in a very appropriate way. Again, trends and reasons for trends are very well presented in the first sentence. This is what I would be interested in if I hadn't seen the graph, as someone who has experienced reading graphs. (emphasis added to reflect the original comment)

Table 2. Score decisions by CoP in relation to course outcomes at the end of the first semester