1. Introduction
The advent of state-of-the-art (SOTA) generative AI chatbots like ChatGPT presents both opportunities and challenges for education, particularly in language learning. This paper investigates how English as a Foreign Language (EFL) secondary school students, as novice users, engage in prompt engineering—the skill of crafting instructions for AI—to complete a writing task. The core problem is that effective collaboration with ChatGPT is not intuitive; it requires a learned skill that many students lack, leading to inefficient trial-and-error processes. This study aims to map the diverse pathways students take, analyzing the content, quality, and evolution of their prompts to inform pedagogical strategies for integrating AI literacy into the EFL writing classroom.
2. Methodology
This research employs a qualitative case study approach. Data was collected from iPad screen recordings of secondary school EFL students in Hong Kong using ChatGPT and similar SOTA chatbots for the first time to complete a standardized writing task. The analysis focused on a detailed examination of the prompts students generated, their sequences (pathways), and the corresponding AI outputs. The study identified four distinct archetypal pathways based on patterns of interaction, prompt sophistication, and strategic approach.
3. Case Studies: Four Prompt Engineering Pathways
The analysis revealed four primary interaction patterns, representing different levels of engagement and strategic thinking.
3.1. Pathway A: The Minimalist
Students in this pathway used very few, often vague, prompts (e.g., "Write an essay about pollution"). They exhibited low metacognitive engagement, accepting the AI's first output with minimal revision or specification. This pathway highlights a fundamental lack of understanding of the AI's capabilities and the need for precise instruction.
3.2. Pathway B: The Iterative Refiner
These students started with a basic prompt but engaged in a sequential refinement process. Based on the AI's initial output, they issued follow-up commands like "make it longer," "use simpler words," or "add an example." This pathway shows an emerging understanding of the interactive and iterative nature of human-AI collaboration.
3.3. Pathway C: The Structured Planner
A more advanced pathway where students attempted to structure the task for the AI from the outset. Prompts included elements like role-playing ("You are a writing tutor"), step-by-step instructions ("First, give me three ideas. Then, outline the first idea"), and explicit constraints ("Write 150 words using the past tense"). This approach demonstrates strategic planning and a clearer model of how to "program" the AI through language.
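To illustrate, the elements of a Pathway C prompt can be assembled like a small specification. The sketch below is a hypothetical composite (in Python) of the elements quoted above, not a verbatim student transcript:

```python
# Hypothetical composite of Pathway C prompt elements: role, steps, constraints.
structured_prompt = "\n".join([
    "You are a writing tutor.",                               # role assignment
    "First, give me three ideas for an essay on pollution.",  # step 1
    "Then, outline the first idea.",                          # step 2
    "Write 150 words using the past tense.",                  # explicit constraints
])
print(structured_prompt)
```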
3.4. Pathway D: The Exploratory Tester
These students used a high volume of diverse, often experimental prompts. They tested the AI's boundaries with creative, off-topic, or complex requests to understand its functionality before applying it to the core task. This pathway reflects an exploratory, tech-savvy mindset but does not always lead efficiently to the task goal.
4. Results & Analysis
4.1. Prompt Quality & Quantity Patterns
A clear correlation was observed between prompt sophistication and final output quality. Pathway C (Structured Planner) consistently yielded the most coherent, task-appropriate, and linguistically rich texts. Pathway A (Minimalist) outputs were generic and often off-target. The quantity of prompts alone (high in Pathway D) did not guarantee quality; strategic quality (Pathway C) was the key differentiator.
Prompt Interaction Summary
- Pathway A (Minimalist): Avg. 2-3 prompts; Low specificity.
- Pathway B (Iterative Refiner): Avg. 5-8 prompts; Reactive refinement.
- Pathway C (Structured Planner): Avg. 4-6 prompts; High pre-planning.
- Pathway D (Exploratory Tester): Avg. 10+ prompts; High variety, mixed relevance.
4.2. Impact on Writing Output
The final writing products varied significantly. Structured prompts led to outputs that better addressed task requirements, used more appropriate vocabulary, and demonstrated clearer organization. Minimalist prompts resulted in texts that, while grammatically correct, lacked depth and personalization, resembling generic web content.
5. Discussion: Implications for AI Literacy Education
The study underscores that using ChatGPT effectively is a learned skill, not an innate ability. The prevalence of minimalist and inefficient iterative pathways among novices signals a critical gap in current education. The authors argue for explicit prompt engineering education to be integrated into EFL curricula. This would move students beyond trial-and-error, equipping them with frameworks to formulate clear instructions, assign roles, specify formats, and iteratively refine outputs—transforming the AI from a black-box oracle into a collaborative tool.
Key Insights
- Prompt engineering is a new form of digital literacy essential for the AI age.
- Student approaches to AI are heterogeneous, requiring differentiated instruction.
- Quality of instruction (prompt) directly dictates quality of AI-assisted output.
- Without guidance, students risk developing passive or inefficient interaction habits with AI.
6. Technical Framework & Analysis
From a technical perspective, prompt engineering interacts with the underlying language model's output probability distribution. A well-crafted prompt $p$ guides the model $M$ to sample from a more constrained and desirable region of its output distribution $D$ for a given context $C$. The process can be abstractly represented as maximizing the conditional probability of a desired output sequence $O$:
$O^* = \arg\max_{O} \Pr(O \mid C, p; M)$
Whereas a vague prompt increases entropy in $D$, leading to generic outputs, a specific prompt with constraints (role, format, style) reduces entropy, steering $M$ towards a more targeted $O^*$. The students' pathways effectively represent different strategies for manipulating this conditional distribution through natural language instructions.
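The entropy claim can be made concrete with a toy calculation. The minimal sketch below uses an invented five-outcome output distribution (the probability values are hypothetical, chosen only to show the direction of the effect) and computes Shannon entropy before and after a constraint concentrates probability mass:

```python
import math

def shannon_entropy(dist):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Hypothetical output distributions over five candidate continuations.
vague_dist = [0.20, 0.20, 0.20, 0.20, 0.20]     # "Write about recycling."
specific_dist = [0.70, 0.15, 0.10, 0.04, 0.01]  # role + audience + length constraints

print(f"Vague prompt entropy:    {shannon_entropy(vague_dist):.2f} bits")    # ~2.32
print(f"Specific prompt entropy: {shannon_entropy(specific_dist):.2f} bits")  # ~1.36
```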
Analysis Framework Example
Scenario: A student wants ChatGPT to help write a persuasive paragraph about recycling.
- Weak Prompt (High Entropy): "Write about recycling."
Analysis: The model has minimal constraints, likely generating a broad, encyclopedia-style overview.
- Strong Prompt (Low Entropy): "Act as an environmental advocate. Write a persuasive 80-word paragraph aimed at teenagers, convincing them to recycle plastic bottles. Use a direct and urgent tone, and include one statistic."
Analysis: This prompt specifies role (advocate), audience (teenagers), goal (persuade), content focus (plastic bottles), length (80 words), tone (direct, urgent), and a required element (one statistic). It dramatically narrows the model's output distribution.
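Readers who want to reproduce this contrast can send both prompts to a chat model. The sketch below assumes the OpenAI Python client (v1 interface) with an OPENAI_API_KEY in the environment; the model name is illustrative, not the one used in the study:

```python
from openai import OpenAI  # assumes the v1 OpenAI Python client: pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

weak_prompt = "Write about recycling."
strong_prompt = (
    "Act as an environmental advocate. Write a persuasive 80-word paragraph "
    "aimed at teenagers, convincing them to recycle plastic bottles. "
    "Use a direct and urgent tone, and include one statistic."
)

for label, prompt in [("Weak", weak_prompt), ("Strong", strong_prompt)]:
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative; any chat-capable model works
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} prompt output ---")
    print(response.choices[0].message.content)
```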
7. Future Applications & Research Directions
The findings open several avenues for future work:
- Adaptive Prompting Tutors: Development of AI-powered tutors that analyze a student's prompt and provide real-time feedback on how to improve it (e.g., "Try specifying your audience"); a minimal sketch of this idea appears after this list.
- Longitudinal Studies: Tracking how students' prompt engineering skills evolve over time with and without formal instruction.
- Cross-Cultural & Linguistic Comparisons: Investigating if prompt engineering strategies differ across languages and cultural educational contexts.
- Integration with Writing Pedagogy: Research on how prompt engineering frameworks can be woven into existing writing process models (pre-writing, drafting, revising).
- Ethical & Critical Dimensions: Expanding AI literacy beyond efficiency to include critical evaluation of AI outputs, bias detection, and ethical use.
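As a starting point for the adaptive prompting tutor direction above, here is a minimal rule-based sketch. The heuristics, patterns, and feedback strings are hypothetical and far simpler than a production tutor would require:

```python
import re

# Hypothetical heuristics: each rule maps a missing prompt element to feedback.
RULES = [
    (r"\b(you are|act as)\b",            "Try assigning the AI a role, e.g. 'Act as a writing tutor.'"),
    (r"\b(for|aimed at|audience)\b",     "Try specifying your audience, e.g. 'for teenagers'."),
    (r"\b(\d+)\s*(words?|sentences?)\b", "Try setting a length, e.g. 'in about 100 words'."),
    (r"\b(tone|formal|informal|persuasive)\b", "Try naming a tone, e.g. 'use an urgent tone'."),
]

def analyze_prompt(prompt: str) -> list[str]:
    """Return feedback for each structured-prompt element the prompt lacks."""
    return [tip for pattern, tip in RULES
            if not re.search(pattern, prompt, re.IGNORECASE)]

# A Minimalist-style prompt triggers all four suggestions.
for tip in analyze_prompt("Write an essay about pollution"):
    print("-", tip)
```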
8. References
- Woo, D. J., Guo, K., & Susanto, H. (2023). Cases of EFL Secondary Students’ Prompt Engineering Pathways to Complete a Writing Task with ChatGPT. Manuscript in preparation.
- Caldarini, G., Jaf, S., & McGarry, K. (2022). A Literature Survey of Recent Advances in Chatbots. Information, 13(1), 41.
- Long, D., & Magerko, B. (2020). What is AI Literacy? Competencies and Design Considerations. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–16.
- OpenAI. (2023). GPT-4 Technical Report. arXiv preprint arXiv:2303.08774.
- Zhao, W. X., et al. (2023). A Survey of Large Language Models. arXiv preprint arXiv:2303.18223.
- The Stanford Center for AI Safety. (n.d.). AI Literacy. Retrieved from https://aisafety.stanford.edu/ai-literacy
Analyst's Perspective: Deconstructing the Prompt Engineering Imperative
Core Insight: This study isn't just about students and ChatGPT; it's a microcosm of the fundamental human-AI interaction challenge in the post-ChatGPT era. The core insight is that "prompting" is the new programming. The four pathways (Minimalist, Iterative Refiner, Structured Planner, Exploratory Tester) aren't merely learning styles; they are prototypes of user archetypes that will define productivity and creativity gaps in the AI-augmented workforce. The paper correctly identifies that without structured education, most users will default to the inefficient Minimalist or trial-and-error Iterative pathways, leaving the vast potential of tools like GPT-4, as detailed in its technical report, untapped.
Logical Flow & Strengths: The paper's strength lies in its grounded, empirical approach. By using screen recordings, it captures the raw, unfiltered struggle of the novice. This moves the discourse beyond theoretical frameworks of AI literacy (like those from Long & Magerko) into observable practice. The identification of the Structured Planner as the high-performing pathway is crucial. It validates the industry hypothesis that effective prompting resembles a specification document—clear, constrained, and contextualized. This aligns with research on how large language models (LLMs) function as "stochastic parrots" guided by conditional probability distributions; a precise prompt mathematically narrows the output space, as discussed in comprehensive surveys like that by Zhao et al.
Flaws & Blind Spots: The study's primary flaw is its limited scope: a single task with first-time users. It doesn't show whether the Exploratory Tester, arguably demonstrating the highest intrinsic curiosity and system exploration, might develop into the most proficient user over time. Furthermore, it sidesteps the ethical and critical-literacy dimension. A student might be a brilliant Structured Planner, producing a flawless, persuasive essay with ChatGPT, while remaining completely uncritical of the biases, factual inaccuracies, or lack of original thought in the output. As institutions like the Stanford Center for AI Safety emphasize, true AI literacy must encompass evaluation, not just generation.
Actionable Insights: For educators and policymakers, the takeaway is non-negotiable: prompt engineering must become a core, assessed component of digital literacy curricula, starting now. The study provides a blueprint: move students from being passive consumers of AI output (Minimalist) to active, strategic directors (Structured Planner). Lesson plans should explicitly teach prompt frameworks covering role, audience, format, tone, and examples (RAFTE). For tech developers, the insight is to build "prompt scaffolding" directly into educational interfaces: interactive templates, suggestion engines, and metacognitive prompts that ask users, "Have you considered specifying...?" The future belongs not to those who can use AI, but to those who can command it with precision and criticality.
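To illustrate one possible form of such prompt scaffolding, the sketch below assembles a prompt from RAFTE fields. The function and its parameters are hypothetical illustrations, not an existing tool:

```python
def build_rafte_prompt(role: str, audience: str, format_: str,
                       tone: str, example: str, task: str) -> str:
    """Assemble a structured prompt from RAFTE fields (role, audience,
    format, tone, examples) plus the core task."""
    return (
        f"Act as {role}. {task} "
        f"Write for {audience}, in the form of {format_}. "
        f"Use a {tone} tone. For example: {example}"
    )

print(build_rafte_prompt(
    role="an environmental advocate",
    audience="teenagers",
    format_="a single persuasive paragraph of about 80 words",
    tone="direct and urgent",
    example="mention one statistic about plastic waste",
    task="Convince readers to recycle plastic bottles.",
))
```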