Assessing Working Memory
- eemmily7
- Dec 3, 2025
Updated: Dec 12, 2025

Working memory (WM) is frequently assessed by both speech-language pathologists (SLPs) and psychologists with a variety of testing protocols including standardized assessments, rating scales, observation techniques, and dynamic assessments.
Because WM is interdependent with other executive functions, no assessment is a pure indicator of WM alone (Diamond, 2020). SLPs therefore need to consider the student's vision, hearing, speed of processing, attention, fatigue, stress, understanding of what is being asked, and other factors that may affect performance during assessments.
Standardized Assessments
What tasks measure working memory?
Visuospatial: These types of tasks measure visual or spatial WM. Performance does not differ significantly between simple and complex tasks, or when tasks require recall above an individual's measured digit span, most likely because this information is harder to 'chunk' than the information in phonological tasks (Archibald, 2018).
Location span: After presentation, the child selects the locations where the objects appeared, which measures the ability to remember the positions of items in space.
Visual span: The child copies the pattern presented previously, which measures the ability to recall visual patterns or shapes.
Phonological: Verbal tasks are more numerous and variable. These tasks often use digits because digits are lexical labels for a concept (verbal) and accessing digits is more automatic than accessing words, which places lower demands on WM (Archibald, 2018). Word list retrieval can also be used to measure phonological WM. However, performance on word lists could reflect a lack of word knowledge or a failure to capitalize on chunking strategies, so SLPs often use nonsense word lists. Even nonsense word lists can have sublexical and phonotactic overlap with existing knowledge, so results should be compared with word knowledge, word retrieval, and word list recall tasks (Archibald, 2018).
Forward digit span: The child repeats a randomized digit list that gradually increases in length, which measures short-term memory because it only requires holding numbers in mind.
Backward digit span: The child repeats a randomized list of digits in the reverse of the order in which they were presented, which measures WM because it includes manipulation.
Counting span: The child counts or performs simple operations and then recalls a sequence of numbers, letters, or other items.
Running span: The child hears a long list of items with an undetermined end point and, when cued, is asked to recall a certain number of items from the end of the list. Performance on this task predicts higher-order cognition in children. Because participants cannot know when the recall event will occur, the task removes binding opportunities (e.g., rehearsal) that could make the test score inaccurate (Archibald, 2018).
Nonword repetition: The child listens carefully to nonwords and then accurately reproduces them, which measures the capacity to temporarily store and reproduce unfamiliar sound sequences.
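The difference between the forward, backward, and running span tasks described above comes down to which response counts as correct for the same presented list. The following is a minimal Python sketch of that scoring logic only. It is not taken from any published test: the function names, the use of digits 1-9, and the rule against back-to-back repeats are all assumptions made for illustration.

```python
import random

def make_digit_list(length, rng):
    """Build a random digit list (1-9) with no digit repeated back to back.
    The digit range and the no-repeat rule are illustrative assumptions."""
    digits = []
    while len(digits) < length:
        digit = rng.randint(1, 9)
        if not digits or digit != digits[-1]:
            digits.append(digit)
    return digits

def expected_response(presented, task, n_recall=None):
    """Return the response that would be scored as correct for one trial."""
    if task == "forward":    # storage only: same order as presented
        return list(presented)
    if task == "backward":   # storage plus manipulation: reversed order
        return list(reversed(presented))
    if task == "running":    # only the last n_recall items, in presented order
        if n_recall is None or n_recall < 1:
            raise ValueError("running span needs n_recall >= 1")
        return list(presented[-n_recall:])
    raise ValueError(f"unknown task: {task}")

def is_correct(presented, response, task, n_recall=None):
    """Score one trial as fully correct or not (hypothetical all-or-nothing rule)."""
    return list(response) == expected_response(presented, task, n_recall)

presented = [3, 8, 1, 6]
print(expected_response(presented, "forward"))
print(expected_response(presented, "backward"))
print(expected_response(presented, "running", n_recall=2))
```

For the list 3, 8, 1, 6, the sketch expects 3, 8, 1, 6 in the forward task, 6, 1, 8, 3 in the backward task, and 1, 6 in a running span trial that cues the last two items. Published tests have their own administration and scoring rules, which should always take precedence over this illustration.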
Embedded WM Tasks in Language Assessments
Competing Language Processing Task: The child hears short sentences and judges if they are true/false, while also remembering and repeating the last word of each sentence after several sentences. This is considered to be a good estimate of memory capacity and is used within research (Boudreau & Costanza-Smith, 2011).
Sentence Repetition: The child listens to sentences of increasing length and complexity and then repeats the sentence exactly as they heard it. This requires holding syntactic and semantic information while producing it back accurately.
Following Directions: The child hears multi-step directions and must carry out the steps in order.
Story Retell: The child listens to a short narrative and holds the information in mind while organizing key ideas to produce a coherent retell. This involves integrating vocabulary, grammar, and narrative structure with sequencing and retrieving details.
Formulated Sentences: The child hears or sees a target word and must generate a grammatically correct sentence with that word. This involves holding the lexical target while planning sentence structure, retrieving vocabulary, and monitoring output.
Narrative Comprehension: The child must hold details in mind, link them to background knowledge in long-term memory, track characters and events, and infer meaning. Many comprehension questions require retrieving earlier information, integrating concepts across sentences, and inhibiting interference.
Comprehensive Assessments
Children's Memory Scale: A comprehensive assessment of learning and memory for ages 5-16 that assesses verbal and visual WM and attention (Cohen, 2010). After administering the core subtests, the clinician can derive eight index scores: Attention/Concentration, Verbal Immediate, Verbal Delayed, Delayed Recognition, Visual Immediate, Visual Delayed, Learning, and General Memory. The test was standardized on a US sample of 1000 children and has high inter-rater reliability and adequate test-retest coefficients.
Test of Integrated Language & Literacy Skills (TILLS): A comprehensive, norm-referenced test for ages 6-18 used to (1) identify language/literacy disorders, (2) document patterns of strengths and weaknesses, and (3) track changes in language and literacy skills over time (Nelson et al., 2014). The TILLS is composed of 15 subtests, with sensitivity ranging from 81-97% and specificity ranging from 81-100%. Although all subtests tap WM to some degree, the most direct measures are nonword repetition, following directions, and backward digit span. Nonword repetition can be usefully compared with nonword spelling and nonword reading, and backward digit span can be usefully compared with forward digit span (i.e., working vs. short-term memory).
Comprehensive Assessment Battery for Children-Working Memory: A computer-based battery used to assess WM with over 13 tasks that tap the central executive, visuospatial, phonological, and binding systems (Cabbage et al., 2017). The central executive tasks are unique and include N-back auditory and visual tasks in which the child needs to determine whether the tone or game piece is the same as the last one presented. The binding subtests are also of interest and pair pieces of visual information, auditory nonwords with shapes, and non-speech sounds with nonwords. As of December 2025, this tool is for research use only and is not available to the public.
Clinical Evaluation of Language Fundamentals-5 (CELF-5): A comprehensive battery that assesses morphology, semantics, syntax, and pragmatics for students ages 5-21 (Secord et al., 2013). It includes 16 standalone tests; those most relevant to WM are nonword repetition, following directions, recalling sentences, word structure, and formulated sentences (Pham, 2021). The most common cutoff, 1.5 standard deviations below the mean, has a sensitivity of 74% and a specificity of 93% (Secord et al., 2013).
Token Test for Children, 2nd Edition: The Shortened Token Test is a standardized test used to assess auditory language comprehension in children, which may help SLPs differentiate WM and language skills (Pham, 2021). In this test, children carry out commands of increasing length and complexity. In a modified version of the test, the simplest commands measured basic attention (e.g., single-step commands: "touch the red square"), the lengthier commands measured WM (e.g., multi-step commands: "touch the red square and then the yellow circle"), and the complex and lengthy commands measured linguistic ability (e.g., complex directions: "touch the big yellow square and then pick up the small blue circle") (Pham, 2021).
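The sensitivity and specificity figures reported above for the TILLS and CELF-5 can be easier to interpret with the arithmetic written out. The following is a minimal Python sketch, not part of any test manual. The counts of 100 children per group are hypothetical and were chosen only so that the results match the reported 74% and 93%. The standard-score scale with a mean of 100 and a standard deviation of 15 is likewise an assumption used to illustrate what a 1.5 standard deviation cutoff looks like.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of children WITH a disorder whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of children WITHOUT a disorder whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)

def cutoff_score(mean, sd, sds_below_mean):
    """Score that lies a given number of standard deviations below the mean."""
    return mean - sds_below_mean * sd

# Hypothetical counts: 100 children with a disorder, 74 of them flagged.
print(sensitivity(true_positives=74, false_negatives=26))

# Hypothetical counts: 100 children without a disorder, 93 of them cleared.
print(specificity(true_negatives=93, false_positives=7))

# Assumed scale (mean 100, SD 15): a 1.5 SD cutoff falls at 77.5.
print(cutoff_score(mean=100, sd=15, sds_below_mean=1.5))
```

Read this way, a sensitivity of 74% means that roughly one in four children with a disorder would score above the cutoff and be missed, while a specificity of 93% means that about 7 in 100 children without a disorder would be flagged.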
Parent/Educator Rating Scales
Parent and educator rating scales have the potential to reflect real-life functioning; however, they can be vulnerable to a range of biases that should be considered (Diamond, 2020). Specifically, parent or educator scales may represent the evaluator's reflections on their observations and may not be objective (Ward & Jacobsen, 2014).

Observational Rating Scale from CELF-5 (ORS): Documents a student's ability to manage classroom behaviours in order to meet the school curriculum and follow educator instructions, via educator, parent, or student report (Secord et al., 2013). The ORS can help obtain a realistic view of a student's everyday performance, analyze communication difficulties, identify strengths and interests, and establish a plan for further assessment and intervention.
Behavior Rating Inventory of Executive Function (BRIEF): Parent and teacher questionnaires for ages 5-18 that assess executive function and self-regulation (Gioia & Isquith, 2013). It includes 8 clinical scales: inhibition, shifting, emotional control, initiation, WM, planning, organization of materials, and monitoring. The BRIEF includes clinical comparison groups, which are primarily children with ADHD but also include other clinical populations. Clinicians may therefore use this tool to create profiles and patterns, but should use caution when interpreting results for DLD populations because normative comparisons are less robust.
Observation Techniques
STOP Tool (Ward & Jacobsen, 2014): Executive function is a collection of multiple cognitive skills that function in coordinated ways. The ability to stop and direct oneself is strongly associated with situational awareness skills (knowing what's going on), which require extracting information from the environment, integrating this information with internal knowledge, directing further exploration, and anticipating future events in order to shift flexibly. The STOP Tool is an observational tool that measures student performance across situations to obtain a pattern of strengths and weaknesses that guides intervention and progress monitoring. The tool requires minimal formal training, minimizes scoring errors, and has comprehensive items that can be adapted to ability level.
| | Space | Time | Objects | People |
| Extracts | Understands what's going on. | Knows the time and expected time for a task. | Gathers all expected materials for task. | Recognizes own and others' roles for given situation. |
| Purpose | Understands the function of the space for the situation. | Aware of time available. | Materials are organized within personal space for functional use. | Recognizes and expresses key purpose of communication exchanges. |
| Predicts | Navigates the space efficiently. | Uses if-then reasoning to envision future, has sequence of time markers, anticipates what's next. | Can recognize similarities and differences between materials. | Makes inferences about communication based on situation. |
| Flexibility | Transitions between spaces efficiently. | Can shift pace. | Sees necessity of objects to meet future goal, can inhibit materials not related to goal. | Regulates actions based on awareness of others (verbal & nonverbal). |
This system also includes rating scales that rank the stability, complexity, variability, and familiarity of the situation; the arousal, concentration, distractibility, and mental capacity of the child; and other influences of the environment and child on executive functions.
Dynamic Assessment
Dynamic assessment provides mediation around the task to elicit improved performance, in order to examine responsiveness to instruction and assess the child's learning potential. Dynamic assessment strategies for WM may include repeating parts of a recall list, explicitly teaching chunking strategies, adjusting the number and nature of items, including or excluding distractions, providing visuals, and adjusting the time allowed to respond (Archibald, 2018). Overall, dynamic assessment identifies children who benefit from prompts or strategies, which informs the scaffolds that may be used in intervention planning.
References
Archibald, L. M. D. (2018). The reciprocal influences of working memory and linguistic knowledge on language performance: Considerations for the assessment of children with developmental language disorder. Language, Speech & Hearing Services in Schools, 49(3), 424–433. https://doi.org/10.1044/2018_lshss-17-0094
Boudreau, D., & Costanza-Smith, A. (2011). Assessment and Treatment of Working Memory Deficits in School-Age Children: The Role of the Speech-Language Pathologist. Language, Speech & Hearing Services in Schools, 42(2), 152–166. https://web-s-ebscohost-com.proxy1.lib.uwo.ca/ehost/pdfviewer/pdfviewer?vid=0&sid=6042988f-2152-4c44-9798-285dbbe3f07e%40redis
Cabbage, K., Brinkley, S., Gray, S., Alt, M., Cowan, N., Green, S., Kuo, T., & Hogan, T. P. (2017). Assessing Working Memory in Children: The Comprehensive Assessment Battery for Children – Working Memory (CABC-WM). Journal of Visualized Experiments, 124. https://doi.org/10.3791/55121
Cohen, M. J. (2010). Children’s Memory Scale. In Encyclopedia of Clinical Neuropsychology (pp. 556–559). https://doi.org/10.1007/978-0-387-79948-3_1532
Diamond, A. (2020). Executive functions. In Handbook of clinical neurology (pp. 225–240). https://doi.org/10.1016/b978-0-444-64150-2.00020-4
Gioia, G., & Isquith, P. (2013). (BRIEF) Behavior Rating Inventory of Executive Function. https://www.wpspublish.com/brief-behavior-rating-inventory-of-executive-function
Nelson, N., Plante, E., Helm-Estabrooks, N., & Hotz, G. (2014). Test of Integrated Language & Literacy Skills (TILLS).
Pham, T. (2021). Evaluating the Modified Shortened Token Test as a working memory and language assessment tool. Language & Working Memory Lab. [Video]. YouTube. https://www.youtube.com/watch?v=KWfJ6cbX7HA
Secord, W., Semel, E., & Wiig, E. (2013). The Clinical Evaluation of Language Fundamentals, Fifth Edition (CELF-5). Canadian Journal of School Psychology.
Ward, S., & Jacobsen, K. (2014). Executive Function Situational Awareness Observation tool. Perspectives on School-Based Issues, 15(4), 164–173. https://doi.org/10.1044/sbi15.4.164


