10 Special Project: Developing Computer Science Learning Material for English Language Learning Students
Madhav Ajayamohan
Learning Objectives
In this module, you will be tested on the skills you learned in the previous modules and on your CSC108 knowledge, and you will learn more about the challenges English Language Learning (ELL) students face when learning computer science. By the end of this module, you will be able to:
- Understand the struggles of ELL students learning computer science in universities
- Implement solutions to help ELL students better comprehend computer science concepts
- Brainstorm further solutions to support ELL students in computer science
- Use basic conditional statements
- Use File I/O methods in Python in conjunction with for loops
- Create and access information from dictionaries
- Manipulate strings and textual information
What Does Computer Science Look Like for English Language Learning Students?
Have you ever thought about what computer science looks like for people whose first language isn’t English? At first glance, it seems obvious that English Language Learning students will definitely struggle. After all, every mainstream programming language we use in class is written in English.
For ELL students, learning computer science in English is like trying to learn a language in a language that they don’t know very well. As a result, they get overloaded by trying to understand programming concepts and the language those concepts are explained in at the same time.
But this doesn’t mean ELL students are any less capable than students whose native language is English. Studies show that when taught in their native language, ELL students perform just as well as native English speakers.
If you are interested in this topic, look at the papers listed in the References and the literature reviews under Extra Reading at the end of this chapter; they can give you a lot more insight into the subject.
How Do We Help ELL Students Catch Up to Native English Speakers?
There isn’t an easy way to do so. While ELL students are disadvantaged by their unfamiliarity with English, it doesn’t change the fact that English is the main language of communication in the sciences. In Guo’s (2018) survey, a native English speaker summarized why:
“It would be a massive waste of resources if every non-native English speaking programmer would start to translate documentation, let alone keep it up to date. Since the dawn of the current power of open-source, English – whether someone likes it or not – has become our lingua franca to describe programming languages and tools. That fosters the ability of the programming community to communicate across borders. Likewise with science, we have to find a common ground and pool resources instead of disparaging them in linguistic balcanization (Guo, 2018, p. 8).”
So, how do we help ELL students? One proposed solution is to keep English as the language of instruction but teach in it more effectively: using simple English in lectures, explaining specialized vocabulary when it first appears, and speaking slowly during spoken explanations.
This final project focuses on improving lecture transcripts by giving in-depth explanations of specialized vocabulary when it first appears.
Special Project: Simplifying Lecture Transcripts for ELL Students
The goal of this assignment is to get you to consider ways to help ELL students, and to help you practice combining concepts you have learned throughout CSC108, such as:
- Manipulating Strings
- Reading and Writing to Files
- For Loops
- Utilizing Dictionaries
- Conditionals
Context:
In order to help ELL students, Professor Ajaya wants to make transcripts of his lectures available for students to review concepts after class. However, when he reviews the transcripts, he realizes that he used idioms and expressions that ELL students may not be able to understand.
In order to make the transcripts more accessible for ELL students, Professor Ajaya tasks you with writing a script that can create lecture transcripts with appropriate definitions.
Example
Consider that the original transcript consists of one line:
However, ELL students may not understand the idiom “finding a needle in a haystack”. So, you need to produce a new transcript like so:
In essence, you need to insert the definition of the term in the new file.
Tools Given to You
In order to complete this task, you have been given:
- A dictionary called IDIOM_TO_DEFINITION that maps vocabulary terms to their definition. From the above example, “control flow” would be a key, while its definition is the value
- A function called test_file_equality that tells you whether two files have the same content or not
- Five files that will help you get started with testing, along with solution files showing what the output is supposed to look like
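To make these tools concrete, here is a minimal sketch of what they might look like. The dictionary entries are invented samples, and `files_have_same_content` is a hypothetical stand-in for the provided `test_file_equality`; the real starter file may define both differently.

```python
# Hypothetical sample entries; the dictionary in the starter file will differ.
IDIOM_TO_DEFINITION = {
    "control flow": "the order in which a program runs its statements",
    "a piece of cake": "something that is very easy to do",
}


def files_have_same_content(first_path: str, second_path: str) -> bool:
    """Return True if the two text files contain exactly the same text.

    Hypothetical stand-in for the provided test_file_equality helper.
    """
    with open(first_path, encoding="utf-8") as first_file:
        first_text = first_file.read()
    with open(second_path, encoding="utf-8") as second_file:
        second_text = second_file.read()
    return first_text == second_text


# Looking up a definition: keys are terms, values are definitions.
print(IDIOM_TO_DEFINITION["control flow"])
```

A helper like this would typically be called with the file your code produced and the matching solution file.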
Your Task
For this project, you must:
- Complete the function simplify_line. Given a string, simplify_line checks whether the string contains any vocabulary we need to provide definitions for. If it does, the function returns a new string containing the original line followed by the definitions. To complete this task, consider the following:
  - All the expressions you need definitions for are given by the keys of the IDIOM_TO_DEFINITION dictionary.
  - You need to go through each word in the given line and detect whether the word is a key in IDIOM_TO_DEFINITION.
  - IDIOM_TO_DEFINITION also has keys that contain two words, so when going through the line you must check every pair of adjacent words as well.
  - You should be able to detect an idiom in the line even if it is in a different case from the key in the dictionary.
  - If you detect a vocabulary word, you need to insert the definition after the line. Specifically, there must be one empty line between the original line and the definition, and one empty line after the definition.
  - If you detect multiple vocabulary words in the same sentence, you must insert the definitions for each word in order of appearance.
    - For example, if “lists” comes before “dictionaries”, then the definition of lists must be added before the definition of dictionaries. Both definitions come after the original line.
    - There should be an empty line between the original line and the first definition, and an empty line after each definition.
- Complete the function simplify_transcript. Given the file name of a txt file (the lecture transcript), simplify_transcript writes a new txt file that represents the simplified transcript with the additional definitions.
  - Every line in the given txt file represents one sentence of the lecture transcript.
  - Make sure to use simplify_line to complete this task.
  - Make sure to test your function by using test_file_equality to compare the new file you produce with the solution files.
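The steps above can be sketched roughly as follows. This is one possible approach under stated assumptions, not the official solution: the dictionary entries are invented samples, `find_terms` is a hypothetical helper, punctuation is trimmed from each word before lookup, a term repeated within one line is defined only once, and the output file name is passed as a second parameter (the starter code may choose the output name differently).

```python
import string

# Hypothetical sample entries; the starter file's dictionary will differ.
IDIOM_TO_DEFINITION = {
    "lists": "ordered collections of values",
    "control flow": "the order in which a program runs its statements",
}


def find_terms(line: str, terms: dict[str, str]) -> list[str]:
    """Return the keys of terms that occur in line, in order of appearance."""
    # Assumption: compare in lowercase and ignore punctuation around words.
    words = [word.strip(string.punctuation).lower() for word in line.split()]
    lowered_keys = {key.lower(): key for key in terms}
    found = []
    for i in range(len(words)):
        single = words[i]
        pair = " ".join(words[i:i + 2])  # equals single at the last word
        for candidate in (pair, single):
            key = lowered_keys.get(candidate)
            if key is not None and key not in found:
                found.append(key)
    return found


def simplify_line(line: str) -> str:
    """Return line followed by the definitions of any terms it contains."""
    text = line.rstrip("\n")
    found = find_terms(text, IDIOM_TO_DEFINITION)
    if not found:
        return line  # assumption: lines without terms are left unchanged
    parts = [text, ""]  # the "" becomes the empty line after the original
    for key in found:
        parts.append(IDIOM_TO_DEFINITION[key])
        parts.append("")  # one empty line after each definition
    return "\n".join(parts) + "\n"


def simplify_transcript(in_path: str, out_path: str) -> None:
    """Write a copy of the transcript at in_path with definitions added."""
    with open(in_path, encoding="utf-8") as source:
        with open(out_path, "w", encoding="utf-8") as target:
            for line in source:
                target.write(simplify_line(line))
```

Because this sketch only checks single words and adjacent pairs, a longer idiom such as “a piece of cake” would need a wider window of words to be detected.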
Testing
Consider what types of additional tests you need to do in order to ensure your code works. In particular,
- Which of the functions above can have doctests? Do they need any doctests?
- For this function, is it better to use unittest or pytest for extensive testing?
If you want a refresher on testing, check out this chapter: https://ecampusontario.pressbooks.pub/cscriticalpedagogies/chapter/the-importance-of-utilizing-different-tests/
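As a reminder of what a doctest looks like, here is a small sketch. The function `contains_term` is a hypothetical helper invented for illustration and is not part of the starter code. Functions that take a string and return a value tend to suit doctests well, whereas functions that read and write files usually need more setup than a docstring example can comfortably hold.

```python
import doctest


def contains_term(line: str, term: str) -> bool:
    """Return True if term appears somewhere in line, ignoring case.

    >>> contains_term("That exam was A Piece of Cake!", "a piece of cake")
    True
    >>> contains_term("Loops repeat code.", "a piece of cake")
    False
    """
    return term.lower() in line.lower()


if __name__ == "__main__":
    # Runs every example in the docstrings of this module.
    doctest.testmod()
```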
Done with your assignment? Unsatisfied with the difficulty level? Then consider challenging yourself by enhancing your solution:
- The starter file gives you some idioms, but there are still many more that students can encounter. How can you enhance the current dictionary?
- Consider the keys of the IDIOM_TO_DEFINITION dictionary. Notice that in test5.txt, even though the text contains “bang your head against the wall”, we don’t add the definition, because the key is “banging your head against the wall.” Consider modifying the keys so that no matter which form of the idiom is used, you can still detect it and add the definition.
- Modify your solution so that after you define an idiom once, you never add the definition for that idiom again.
- Modify your solution so that after you provide the definition of an idiom K times, you never add the definition for that idiom again.
  - Let K = 2. If the idiom “a piece of cake” occurs five times in a transcript, you provide the definition for the first two occurrences, and then never again.
- Modify your solution so that after defining an idiom once, you don’t define it again until you have detected the idiom K more times.
  - Let K = 2. If the idiom “a piece of cake” occurs seven times in a transcript, you provide the definition the first time, then you don’t provide it the second and third time. However, you provide the definition at the fourth occurrence. You don’t provide the definition for the fifth and sixth occurrences, and you provide it once again for the seventh.
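One way to approach the last two challenges is to keep a dictionary that counts how many times each idiom has been seen so far. The sketch below is only an illustration of that idea; the function names and the `counts` dictionary are hypothetical and not part of the starter code.

```python
def define_first_k(term: str, counts: dict[str, int], k: int) -> bool:
    """Record one occurrence of term; return True for the first k only."""
    counts[term] = counts.get(term, 0) + 1
    return counts[term] <= k


def define_every_k_skips(term: str, counts: dict[str, int], k: int) -> bool:
    """Record one occurrence of term; define it, skip k, then define again."""
    counts[term] = counts.get(term, 0) + 1
    # Occurrences 1, k + 2, 2k + 3, ... get a definition.
    return (counts[term] - 1) % (k + 1) == 0


seen = {}
for _ in range(5):
    # With k = 2, only the first two occurrences are defined.
    print(define_first_k("a piece of cake", seen, 2))
```

With K = 2, `define_every_k_skips` returns True on the first, fourth, and seventh occurrence, matching the example above.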
References
- Grover, S., Raman, A., Banati, H., Goel, N., Babu, C., & Karkare, A. (2023). Does Bilingual Specification impact students’ comprehension of problems in Introductory Programming? ACM International Conference Proceeding Series, 66–71. https://doi.org/10.1145/3627217.3627237
- Guo, P. J. (2018). Non-Native English Speakers Learning Computer Programming: Barriers, Desires, and Design Opportunities. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3173574.3173970
- Rauchas, S., Rosman, B., Konidaris, G., & Sanders, I. (2006). Language performance at high school and success in first year computer science. Proceedings of the Thirty-Seventh SIGCSE Technical Symposium on Computer Science Education : SIGCSE 2006 : Houston, Tex., USA, March 1-5, 2006, 38(1), 398–402. https://doi.org/10.1145/1124706.1121467
- Pal, Y., & Iyer, S. (2015). Classroom Versus Screencast for Native Language Learners: Effect of Medium of Instruction on Knowledge of Programming. Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE, 2015-, 290–295. https://doi.org/10.1145/2729094.2742618
- Portnoff, S. R. (2018). The introductory computer programming course is first and foremost a language course. In ACM Inroads (Vol. 9, Number 2, pp. 34–52). ACM. https://doi.org/10.1145/3152433
- Siegmund, J., Kästner, C., Apel, S., Parnin, C., Bethmann, A., Leich, T., Saake, G., & Brechmann, A. (2014). Understanding source code with functional magnetic resonance imaging. Proceedings – International Conference on Software Engineering, 1, 378–389. https://doi.org/10.1145/2568225.2568252
Extra Reading
If you delve into the topic a bit more, you will find that there are actually quite a few programming languages written in languages other than English; a brief look at this Wikipedia page will show you many of them. While the number of such languages is impressive, most of them are not mainstream. Some of them, like Malluscript, are small-scale projects that aim to introduce programming to a more diverse audience.
If you want to learn more about programming instruction for ELL students, these two papers summarize current research on the topic:
- Becker, B.A. (2019). Parlez-vous Java? Bonjour La Monde != Hello World: Barriers to Programming Language Acquisition for Non-Native English Speakers. Annual Workshop of the Psychology of Programming Interest Group.
- Lei, Y., & Allen, M. (2022). English Language Learners in Computer Science Education: A Scoping Review. Proceedings of the 53rd ACM Technical Symposium on Computer Science Education – Volume 1, 1, 57–63. https://doi.org/10.1145/3478431.3499299