Developing Online Learning Assessment Instrument for English Sentence Structure Course during Covid-19 Pandemic

the test was valid and reliable, making it suitable for use in the English department. In addition, students responded positively to the use of the test in measuring their English grammar mastery. Despite this, we found that students' scores in the tryout phase were low, affected by their lack of test preparation, inappropriate test timing, and ineffective teaching and learning enactment. The paper ends with recommendations for future researchers.


Introduction
In recent years, language assessment has attracted sustained attention among scholars (see Furaidah et al., 2015; Galikya et al., 2019; Johnson & Shaw, 2019; Xu & Liu, 2019). In particular, with the onset of the Covid-19 pandemic, during which teaching and learning in higher education worldwide switched to online mode, online assessment has become a fundamental process that teachers need to incorporate. For this reason, research on online assessment for learners of English as a foreign language (EFL) is worth pursuing.
Technological advancement has been very supportive of online English learning at the tertiary level. During the pandemic, technology has played a very important role in supporting the smooth running of distance learning in emergency settings (Amin, 2020; Nguyen & Linh, 2021). Assessment during the Covid-19 period has also changed, shifting from paper-and-pencil tests to online assessments. Online assessment offers several advantages over traditional paper-and-pencil testing formats. First, online evaluation allows for variable testing periods during which the test can be administered (Spivey & McMillan, 2014). Second, internet tools can be configured to automatically randomize the order of questions as well as the set of answers for multiple-choice and matching questions. Third, varying degrees of feedback can be given, such as the test score, the test score with correct responses, or the test score with comprehensive solutions. Fourth, teachers have control over when feedback is given (e.g., immediately, or at a set date and time after all questions are completed). Fifth, online testing systems can be set up to include clues or prompts indicating where in the text or course notes support for answering a question can be found.
Spivey and McMillan (2014) also report that neither study effort nor course performance was influenced by the testing procedure, although they found a strong positive relationship between students' effort and their performance in the course. In the same vein, Mohamadi (2018) found that combining technology with appropriate assessment strategies is a powerful way of making learning efficient. Furthermore, Fu (2013) showed that integrating ICT into educational settings brings several benefits: (1) assisting students in accessing digital information efficiently and effectively, (2) supporting student-centered and self-directed learning, (3) producing a creative learning environment, (4) promoting collaborative learning in a distance-learning environment, (5) offering more opportunities to develop critical (higher-order) thinking skills, (6) improving teaching and learning quality, and (7) supporting teaching by facilitating access to course content.
Due to the Covid-19 pandemic, all teaching and learning activities in Indonesian universities have been held online, and online assessment has been chosen to assess students' competence. Situated in a public university in East Java, Indonesia, we observed that the English department offers a grammar course called English Sentence Structure (ESS). In our observation, the current online assessment method in the ESS course still falls short of expectations, especially in terms of how well the assessment reflects test takers' actual performance. A lack of supervisory control during the test is one of the important factors affecting students' test results, leaving the test unable to reflect students' actual competence. Teachers still observe that students' results on online tests differ considerably from their everyday performance, unlike in classroom tests, where supervisors can keep an eye on test takers while they work.
To solve these problems, we developed an online test platform for the ESS course that encourages students to remain honest while taking the test. It is therefore important that a meaningful English Sentence Structure assessment instrument integrate the values of character education, such as honesty. Character education here has a broader meaning than moral education because it not only teaches what is right and what is wrong but also instills good habits, so that students understand what is good and what is not (pedagogic domain), can feel it (affective domain), and can act on it (psychomotor domain). Students in higher education need knowledge and abilities to advance their academic results. However, without adequate character, students may pursue their academic goals in an unethical manner. Understanding and talents, together with character, are interdependent domains that students must possess in order to be successful, productive, energetic, and effective at school and in society. Character building is therefore essential for them (Berkowitz, 2011).
In terms of character building, the value of honesty is emphasized here because, during the Covid-19 pandemic, honesty takes high priority in the online assessment model, one reason being the lack of supervision when assessments are conducted.
With regard to character building, this study focuses on honesty as its central value. Integrating the value of honesty into the assessment instrument for the English Sentence Structure (ESS) course makes the assessment more complete, because it emphasizes not only pedagogical aspects but also other aspects of assessment. Honesty-laden character education integrated into the design of assessment instruments is much needed during online learning in the current Covid-19 pandemic era or in other emergencies, because internalizing the value of honesty through the online assessment instrument equips students with an understanding of how important honesty is and with a character of honesty that they can apply when facing assessments in various formats.
Numerous previous studies have examined online assessment in grammar classes, focusing on the effect of online platforms on students' grammar proficiency (e.g., Yusof & Saadon, 2012; Nguyen & Linh, 2021; Windsor, 2021) and on students' perceptions of online test platforms (Dembsey, 2017; Jazil et al., 2021). However, the development of online test instruments that accommodate honesty-based character education remains unexplored. Therefore, there is a need to develop a design model for an online assessment instrument that incorporates honesty character education, which is well suited to online learning such as during the current pandemic.
The researchers designed a model for developing an online assessment instrument that can be used by English teachers in higher education, especially those who teach English Sentence Structure courses. This study seeks to develop a blueprint for an online grammar assessment instrument (English Sentence Structure) in the pandemic era, so that the final product can be used as a grammar teaching guide.
The format used for this test is multiple choice, which is widely used in testing and evaluation. Studies on the measurement of grammar competence have found that multiple-choice testing has an important place because it helps guarantee the content validity of achievement tests (Adisutrisno, 2008). In addition, the multiple-choice format has been found to yield equivalent reliability and validity in a shorter amount of test-taking time (Bacon, 2003). Time is an important consideration because the test is conducted online during the pandemic, so multiple choice is regarded as the most appropriate form of test. Last, multiple choice is believed to accommodate a large number of test takers and the need for fast scoring, and it offers convenience and reliability (David, 2007; Currie & Chiramanee, 2010).
This study is critical in circumstances where new types of evaluation arrangements are being implemented for emergency teaching and learning during the Covid-19 pandemic. It attempts to address a gap left by earlier studies by providing a suitable form of online English Sentence Structure test, and it focuses on developing that test for university-level English department students.
To meet these needs, an online assessment platform is required that can accommodate the characteristics of a grammar test for second-semester students of the English Language and Literature department, particularly for the language component course Grammar II, also called English Sentence Structure (ESS). The name of the course may differ from one university to another; some universities name it English Grammar II. This means that the online test developed here can also be used in other universities that offer the same subject. Learning English Sentence Structure, an early stage of learning English grammar, is necessary to support success in other areas of English learning, because grammar is the basis for the next stages of language learning.

Method
The primary purpose of this study was to develop an English Sentence Structure (ESS) test as an assessment product for Indonesian EFL learners studying grammar in the second semester. To this end, Design-based Research (DBR) informed by Cavallaro and Sembiante (2021) was employed. DBR is used to understand learning through the design of a product in situations that theory and practice alone cannot document comprehensively. DBR comprises four phases. The first phase, Pre-Implementation, aims at familiarizing the researchers with the practices, norms, and behaviors of the studied setting. In Phase 2, Design, lessons or instructional practices are created with regard to the students' learning. Phase 3, Implementation & Revision, evaluates the lessons. Phase 4, Reflection & Evaluation, measures the overall study results and reflects on the practices (Cavallaro & Sembiante, 2021).
The purpose of the English Sentence Structure course is to provide second-semester EFL students with a variety of sentence patterns by examining the relationships between ideas within a single sentence. In addition, students are expected to identify errors in a short conversation text. The course focuses on students' ability to use more complex sentence patterns in context. Thus, the tasks involve identifying and analyzing sentences in English texts on various themes and writing various sentence patterns in accordance with correct language rules. The instructional objectives are for students to a) distinguish between questions and noun clauses, b) form direct and indirect speech, c) form conditional clauses, and d) choose suitable conjunctions.
Based on the goal, the instructional objectives, and the criteria/indicators of ESS, the ESS blueprint was formulated as shown in Table 1.
The researchers developed the ESS test as the product based on the blueprint. The objective test consists of 50 items, each with four options. The ESS test scores are 2 for right answers and 0 for incorrect answers. Each correct response is awarded one point, while each erroneous response is worth none. First, a preliminary study was carried out to establish the significance of developing an ESS test. It was followed by the development of the ESS test, which involved several procedures: a) developing the test by referring to the course description, b) identifying the goal based on the course description and formulating the instructional objectives/indicators/course learning outcomes, c) developing the test items, d) checking and rechecking the test items, and e) validating the test items (expert validation).
Following that, an online try-out test was conducted via Zoom through the test platform https://fib.ub.ac.id/ESS/. In the try-out phase, 100 students were recruited as participants, and five lecturers teaching ESS courses took part in the validation process during the peer debriefing stage. The student participants were enrolled in the English department of a public university in Malang in the 2020/2021 academic year. Next, the difficulty level of the multiple-choice test items and the reliability of the test were assessed. Finally, the product was tested for its effectiveness. Figure 1 depicts the procedures.
The researchers also examined the results of the ESS test for validity and reliability. Item validity was assessed using the point-biserial correlation, which measures how strongly a correct or incorrect response to a single item is associated with the total test score. In addition to item validity, the researchers examined item difficulty (facility) and item discrimination. The reliability of the test, that is, the extent to which it yields consistent results (Brown, 2004), was then assessed using the Kuder-Richardson Formula 20 (KR-20).

Findings
The present study develops an English Sentence Structure (ESS) test for Indonesian EFL learners. The findings are presented systematically, following the procedures described above. They concern the detailed description of the product, its validity and reliability, and its effectiveness.

The Difficulty Level of Test Item
The assessment of the difficulty level of the questions is based on Arikunto (1999), who provided the following formula for estimating the difficulty index: P = B / J, where P is the difficulty index, B is the number of participants who answered the item correctly, and J is the total number of participants who took the test. The difficulty index is classified as shown in Table 2, and the results of the calculation are presented in Table 3.
Difficult: 10 items (20.0%)
Moderate: 21 items (42.0%), including items 5, 6, 11, 13, 16, 19, 22, 23, 24, 25, 26, 28, 29, 31, 32, 33, 43, 44, 45, 46
Easy (P > 0.70): 19 items (38.0%), items 2, 3, 4, 7, 8, 9, 10, 14, 15, 17, 18, 21, 34, 35, 36, 37, 38, 39, 40
According to Table 3, the bulk of the items have a moderate level of difficulty. A total of 21 questions, or 42 percent of all questions, are of moderate difficulty. Meanwhile, 19 questions are easy and the remaining 10 questions are difficult. Table 4 presents example test items representing each level of difficulty.
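The difficulty index described above can be sketched in Python as follows. This is a minimal illustration, not the researchers' own procedure: the function names and the sample responses are hypothetical, and only the "easy" cut-off (P > 0.70) is stated in the text. The lower cut-off of 0.30 between "moderate" and "difficult" is an assumption based on the commonly used classification.

```python
from typing import List


def difficulty_index(item_responses: List[int]) -> float:
    """Return P = B / J for one item.

    item_responses holds 1 for a correct answer and 0 for an incorrect one,
    so B is the sum of the list and J is its length.
    """
    if not item_responses:
        raise ValueError("at least one response is required")
    return sum(item_responses) / len(item_responses)


def classify_difficulty(p: float, lower: float = 0.30, upper: float = 0.70) -> str:
    """Classify a difficulty index.

    upper = 0.70 follows the text (easy if P > 0.70).
    lower = 0.30 is an ASSUMED cut-off, not taken from the paper.
    """
    if p > upper:
        return "easy"
    if p >= lower:
        return "moderate"
    return "difficult"


# Hypothetical item answered correctly by 45 of 100 test takers.
responses = [1] * 45 + [0] * 55
p = difficulty_index(responses)
print(f"P = {p:.2f}, category = {classify_difficulty(p)}")
```

Applying the two functions to each of the 50 items would give the per-category counts reported for the test.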

Validity Test of Test Item
The researchers used the point-biserial correlation (r_pbi) to verify the validity of the question items, with the criterion that an item is declared valid if r_pbi > r_table (0.48). The results of the item validity test are shown in Table 5. Based on Table 5, the point-biserial correlation of every item is greater than 0.197. These results indicate that all items meet the validity requirement.
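As a hedged illustration of the item validity check, the sketch below computes a point-biserial correlation using the textbook form r_pbi = ((M_p - M_t) / S_t) * sqrt(p / q), where M_p is the mean total score of those who answered the item correctly, M_t and S_t are the mean and standard deviation of all total scores, p is the proportion answering correctly, and q = 1 - p. The paper does not state which computational form was used, so the choice of formula, the use of the population standard deviation, the function names, and the sample data are all assumptions.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import List


def point_biserial(item: List[int], totals: List[float]) -> float:
    """Point-biserial correlation between one 0/1 item and the total scores."""
    if len(item) != len(totals):
        raise ValueError("item and totals must have the same length")
    p = mean(item)  # proportion of test takers answering correctly
    if p in (0, 1):
        raise ValueError("item has no variance (all correct or all incorrect)")
    s_t = pstdev(totals)  # ASSUMPTION: population standard deviation
    if s_t == 0:
        raise ValueError("total scores have no variance")
    m_p = mean(t for i, t in zip(item, totals) if i == 1)
    m_t = mean(totals)
    return (m_p - m_t) / s_t * sqrt(p / (1 - p))


def is_valid_item(r_pbi: float, r_table: float) -> bool:
    """An item is declared valid when r_pbi exceeds the critical value."""
    return r_pbi > r_table


# Hypothetical data: five test takers, one item, and their total scores.
item_scores = [1, 1, 1, 0, 0]
total_scores = [40, 38, 30, 22, 20]
r = point_biserial(item_scores, total_scores)
# 0.197 is one of the critical values quoted in the text.
print(f"r_pbi = {r:.3f}, valid = {is_valid_item(r, 0.197)}")
```

Note that the total score here includes the item being evaluated; a corrected item-total correlation would exclude it.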

Reliability Test of Test Item
The KR-20 statistic was employed in this study's reliability test. The criterion states that the test is reliable if the KR-20 value is greater than 0.70. The reliability test results are shown in Table 6. Based on Table 6, the KR-20 value is 0.873, which is greater than 0.70. This means that the test items meet the reliability requirement.
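For readers who wish to reproduce a reliability check of this kind, the following sketch implements the standard Kuder-Richardson Formula 20, KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / var_total), for a matrix of 0/1 item scores. The function name and the small data set are hypothetical, and the use of the population variance of total scores is an assumption, since the paper does not report that detail.

```python
from statistics import pvariance
from typing import List


def kr20(scores: List[List[int]]) -> float:
    """KR-20 reliability for dichotomous items.

    scores[s][j] is 1 if test taker s answered item j correctly, else 0.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2:
        raise ValueError("need at least two test takers and two items")
    totals = [sum(row) for row in scores]
    var_total = pvariance(totals)  # ASSUMPTION: population variance
    if var_total == 0:
        raise ValueError("total scores have no variance")
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n  # proportion correct on item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: five test takers, four items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
value = kr20(data)
print(f"KR-20 = {value:.3f}, reliable = {value > 0.70}")
```

The 0.70 threshold in the last line mirrors the criterion used in the text.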

The Effectiveness of ESS Test
Students were asked to respond to a questionnaire on the effectiveness of the ESS test in helping them learn English grammar. The analysis showed that the students' responses were positive; the complete results are recorded in Table 7. According to the results of the questionnaire, the test helped students understand the materials taught in the ESS course. In particular, 88 respondents stated that the test is beneficial in helping them grasp the ESS materials in online learning. They stated that the test is good because it is accompanied by an answer key that appears immediately after they complete the test. This is highly useful because students can immediately learn the causes of their errors. The questionnaire findings also revealed that the time allocation for the test is adequate, giving students ample time to consider their answers.

Discussion
This study develops an online assessment in the form of an English Sentence Structure (ESS) test for Indonesian EFL learners. The results showed that the multiple-choice ESS test delivered through the platform https://fib.ub.ac.id/ESS/ was reliable and valid. The consistency and difficulty of distractors play an important role in items that test for misconceptions and careful thinking. In developing the ESS test items, the researchers integrated the character-building value of honesty into all of the items, in the expectation that students would take the test honestly, which in turn may enhance the reliability of the test results. Honesty as a value-mediated domain is integrated into the test items so that students earn their scores honestly in the online assessment (Djiwandono, 2016; Leichsenring, 2020). Surprisingly, although the test was found to be valid and reliable, the scores in the try-out session were not what was expected, since some students received poor marks. This finding supports the view that English grammar is difficult to master, despite the fact that a number of teaching approaches to supporting grammar learning have been implemented (Ismail et al., 2010; Saengboon et al., 2017). However, there are several possible explanations for the students' low scores on the ESS test in this study.
First, students had limited preparation (Ellis & Ryan, 2003) before taking the test, as they knew that it was administered outside the effective semester period. Consequently, they might not have taken it seriously, knowing that it would not count toward their grades because the semester had just ended (Mukminin et al., 2012). A total of 100 students were involved in the try-out test. This number is fairly representative and gives a broad description of the phenomenon under study (Nishitani, 2007). However, a further field trial involving more test takers should be conducted in a situation in which the test takers are fully aware that they are being assessed within the semester program, for example in the final test. Having more test takers would increase the likelihood of obtaining more varied test results (Qian, 2009).
The second possibility concerns the teaching implementation of the ESS course. Students' low scores despite the availability of a valid and reliable test may also stem from problems during the teaching and learning process of ESS in the classroom. Students might have had difficulty absorbing the knowledge delivered by the teacher, resulting in a lack of understanding when they took the ESS test. Although prior studies have shown the effectiveness of specific teaching strategies for developing students' grammar knowledge (Albahuoth, 2020; Ismail, 2010; Safford, 2016; Valizadeh & Soltanpour, 2020), in the present study we suggest that situated and nuanced teaching approaches be enacted to help students develop their grammar repertoire.
Students' positive responses toward the ESS test are also a salient factor in test construction. Theoretically, a test that meets students' needs is more likely to be preferred by students in the teaching and learning process (Rezaee, Alavi, & Razzaghifard, 2020). Furthermore, the test platform is simple to use, and it includes a sound that indicates whether or not students are making mistakes. Finally, the test items are all designed to teach the principle of honesty, which means that the most important thing is not the correct answers students get but how honestly they take the test. During the pandemic, honesty is particularly important in online tests because students have no supervisor while they complete the test. The test results will be determined only by the level of honesty demonstrated by the participants.

Conclusion
The present study has sought to develop an online learning assessment in the form of an English Sentence Structure (ESS) test for Indonesian EFL students. The product was developed through a series of test development procedures designed by the researchers, consisting of sequential stages: identifying the goal based on the course description, formulating the instructional objectives/indicators/course learning outcomes, developing the test items, checking and rechecking the test items, conducting the try-out test, checking the results, and checking the difficulty level of the test items and the reliability level. Based on the analysis, the point-biserial correlation value of all items is greater than 0.48, indicating that all items meet the validity requirement. The analysis also revealed that the KR-20 value is 0.873, which is greater than 0.70, meaning that the test items meet the criterion for reliability. As a result, it can be argued that the ESS test is appropriate for use in the ESS class. In terms of difficulty, the majority of the items have a moderate level of difficulty: there are 10 difficult items, 21 items of moderate difficulty, and 19 easy items. This indicates that the test is suitable for use. Additionally, our study shows that the product does not need to be revised, as it has met the validity and reliability standards, and the test has also proven effective in helping students learn English Sentence Structure (ESS). Although the test is valid and reliable, the try-out still showed unsatisfactory student scores, which might be due to students' lack of preparation: because the test was held outside the semester period, they may not have done their best.
Therefore, future researchers should undertake field testing within the semester, for example by administering this online ESS test as a final test, to acquire a more complete and clear understanding of students' proficiency. This would address the limitation of the present study.