
Fingerspelling.xyz
Technology Connections Project for HCDE 501: Theoretical Foundations of HCDE
Pranali, flo (tflo.info)
Overview
Purpose
The goal of the project was to analyze a technology using three theoretical perspectives in Human-Computer Interaction. Fingerspelling.xyz is a tool to teach the ABCs of American Sign Language. It was created as a collaboration between the American Society for Deaf Children and the agency Hello Monday. There are four levels of difficulty. The student watches the 3D hand and mimics the gesture to sign a letter. As they mimic the gesture, real-time feedback is provided through an API created using machine learning and computer vision software.
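The site's scoring method is not documented here, so the following is only a sketch of how a match percentage of this kind could be computed. It assumes, hypothetically, that a hand tracker returns a fixed list of 2D landmark points for the learner's hand and that a reference pose exists for each letter. The function `match_percentage`, the tolerance value, and the toy coordinates are illustrative assumptions, not the Fingerspelling.xyz API.

```python
import math

def match_percentage(learner, target, tolerance=0.1):
    """Percentage of landmarks within `tolerance` of the target pose.

    learner, target: equal-length lists of (x, y) points, assumed to be
    already normalized to the same scale and origin (an assumption).
    """
    if len(learner) != len(target) or not target:
        raise ValueError("poses must be non-empty and the same length")
    # Count landmarks whose Euclidean distance to the target is small enough.
    close = sum(
        1 for p, q in zip(learner, target)
        if math.dist(p, q) <= tolerance
    )
    return 100.0 * close / len(target)

# Toy poses with four landmarks; real trackers report many more points.
target_pose = [(0.0, 0.0), (0.2, 0.5), (0.4, 0.6), (0.6, 0.5)]
learner_pose = [(0.0, 0.05), (0.2, 0.5), (0.4, 0.9), (0.65, 0.5)]
print(match_percentage(learner_pose, target_pose))
```

A per-landmark tolerance is only one possible design. It makes the percentage easy to interpret, since each landmark either matches or does not, but a production system might compare joint angles instead.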
Methodology
Three major HCI perspectives were used for the analysis:
1. Activity Theory
2. Socio-Behavioral Psychology
3. Socio-Material Theory
By studying each theoretical perspective in detail, we identified how various aspects of each theory apply to the technology.
Analysis
1. Activity Theory
Activity Theory is grounded in psychology and originated with Lev Vygotsky; Leontiev and Engeström later contributed to it. For this analysis, we examined a user's activity with Fingerspelling.xyz in terms of the Activity, Actions, and Operations involved, and used the theory to explore the micro-interactions of learning a letter. This helped gauge how effectively the technology's design accomplishes the broader object: learning the ABCs of American Sign Language.
a. Analyzing the micro-interactions of the letter “Q”. This is the first and easiest level. Using Activity Theory, we can assess the micro-actions and operations involved, and the gradual shift from action to operation.
The user holds her hand up, a familiar body movement: Operation.
She analyzes the teacher’s hand, its position, where its fingers are, the angle, and so on: Action.
She performs the gesture roughly, perhaps in the way that feels most familiar to her: Operation.
She sees that the lines on the screen aren’t turning purple, or looks at the percentage in the middle; either indicates she doesn’t have the correct hand gesture: Operation.
She shifts her hands and fingers to try to match the lines: Action.
She looks back at the teacher’s hand and tries to understand its gesture again: Action.
This cycle repeats until she gives up on the activity.
The number of explicit actions required of the user stands in stark contrast to simpler letters.
b. Analyzing the micro-interactions of spelling the word “Able”. This is the first and easiest level. Using Activity Theory, we can assess the micro-actions and operations involved, and the gradual shift from action to operation.
Here, the user completes the letters A, B, L, and E with ease and speed. Many of the actions listed earlier are operationalized.
Activity: Spell “Able”
Action: Read instructions
Operation: Hold hand
Action: Look at the 3D hand
Action: Move finger
Action: Look at the "mirror" and lines
Action: Shift hand/finger
Operation: Keep shifting while looking at the lines
Operation: When a finger is in the proper position, hold fingers in place
Operation: Wait for feedback (% completion or moving on to the next letter)
Here, Activity Theory can reveal which cues or tools are missing to help students understand the gesture they are mimicking. Going further, we could run usability tests that record how long it takes to sign a letter and whether the student is signing more quickly over time, an indication that their actions are turning into operations.
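As a rough illustration of that usability measure, the sketch below fits a least-squares line to the time a student takes on successive attempts at one letter. A clearly negative slope suggests the student is signing faster. The function name `signing_time_trend`, the sample timings, and the choice of a simple linear fit are assumptions for illustration; they are not part of the study or of Fingerspelling.xyz.

```python
from statistics import mean

def signing_time_trend(durations_ms):
    """Return the least-squares slope (ms per attempt) of signing times.

    durations_ms: time taken on each successive attempt at one letter.
    A negative slope means the attempts are getting faster.
    """
    n = len(durations_ms)
    if n < 2:
        raise ValueError("need at least two attempts to estimate a trend")
    attempts = range(n)
    x_bar = mean(attempts)
    y_bar = mean(durations_ms)
    # Simple linear regression slope: covariance(x, y) / variance(x).
    numerator = sum(
        (x - x_bar) * (y - y_bar) for x, y in zip(attempts, durations_ms)
    )
    denominator = sum((x - x_bar) ** 2 for x in attempts)
    return numerator / denominator

# Hypothetical timings (milliseconds) for five attempts at the same letter.
sample = [4200, 3900, 3100, 2600, 2200]
slope = signing_time_trend(sample)
print(f"trend: {slope:.0f} ms per attempt")
```

A slope near zero would suggest the student is not yet speeding up. With only a handful of attempts the estimate is noisy, so it is better read as a prompt for closer observation than as proof that actions have become operations.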



2. Socio-Behavioral Psychology
This perspective helps analyze the psychological frameworks that make the technology effective in enabling ASL learning.
a. Computers As Social Actors (CASA Model)
The technology strongly evokes the semblance of a human teacher.
- The user tends to treat the fingerspelling AI tutor as a human tutor.
- The user tends to apply the rules of human-human interaction automatically.
- Anthropomorphism: the website uses short prompts, like “Let’s go”, that mirror how a tutor or helper would interact with a person. The technology is humanized in that sense.
b. Captology
The technology also appears to persuade users through its gamification mechanism.
After each level, it praises the user by displaying “Well Done”. It computes a score and encourages users to try again to improve it (expertise cues, encouragement).
c. Agency Model of Customization
The tailored feedback offered by the technology, the rich modality of the user’s hand signing, and the playfulness of the website together elicit the “self-as-source” factor.
The user feels that they are in control of the interaction.
d. Cognitive Load Theory and Limited Capacity Model
Humans have a limited amount of cognitive resources. This can prevent them from carefully processing the information the technology provides while also attending to the other interactivity it affords.
The user’s attention is divided between exerting control over the interactive technology (by matching their sign) and simultaneously processing the incoming information (feedback and match percentage). While signing each word, the user has to:
- Look at the hand gesture on the side opposite their dominant hand
- Look at how they are signing it on the dominant-hand side
- Check whether what they have signed is a match
- Assess and adjust according to the feedback until they see a 100% match
e. Cognitive Processing
The technology-induced multitasking between perceiving and encoding the sign shown and enacting it can reduce cognitive performance as well as learning.
f. Modality-Agency-Interactivity-Navigability (MAIN) Heuristics
Affordances on the website trigger the following heuristics:
- Social Presence Heuristic: In the interface, the human hand, together with the human-like AI tutor, evokes the social presence heuristic.
- Realism Heuristic: Seeing the sign enacted, rather than merely seeing a static image of it, evokes the realism heuristic.
- Novelty Heuristic: The compelling aesthetics leave the user little time to realize that they are not memorizing the words. This heuristic also promotes richer engagement with the technology.
- Machine Heuristic: There is tension in how the hand moves and how it looks. The purple lines look very technical and could potentially undermine trust.
g. Flow Theory
The immersive nature of this technology, coupled with the attention the user must pay to the sign shown and to the prompt feedback provided, enables the user to experience “flow”: a “psychological state where the user is so attentive to the experience that nothing else seems to matter” (Csikszentmihalyi, 1975).
3. Socio-Material Theory
a. Setup: Blurs the distinction between subject and object, moving us away from Cartesian duality:
Materiality is instantiated in existence.
If there isn’t a human for the ML to observe, nothing is enacted. The human cannot do this alone; they need the technology.
The learner is “melding” with the tutor (ML-generated) “hand” (the purple lines).
b. Part A: Influences on human
Enacting the sign and receiving feedback creates a dynamic “equality”.
The tutor, as the teacher, holds an inherent position of power in the interaction: the learner has to change their behavior to match the materiality/tutor.
c. Part B (and vice versa): Influences on materiality
Conversely, when the human responds to the ML and the gesture is correct, learning is enacted in the human, and the resulting points/KPIs/metrics can in turn inform or change the design.
d. Embodied knowledge
The technology acts as an extension of our bodies, but that extension can break when we struggle: novices will experience cuts, or we run into a use case that forces detached analysis.
In that case the technology is present-at-hand: you could use it, but you are having trouble using it. This is the distinction between present-at-hand and ready-to-hand. The tool is not intuitive yet, and fingerspelling will elicit switches between these states.
However, the learning doesn’t continue past this one-off experience, so users are actually perpetual novices.
e. Extended embodiment
As mentioned, there is an extension with the laptop itself. You stop seeing a distinction between yourself and your computer. You see you. You are the hand you’re seeing on the screen.
However, much of ASL also involves the arms, signing from the waist up, and of course another person to sign with. That is how we learn language: it is an “outcome of cultural processes and disciplined engagement” (Ramiller, p. 24).
It is not just the tools but also the environment: here the environment limits the application of language, which points to improvements in spatial considerations.
f. Human cognition is driven by culture: you cannot have a technique without a culture motivating it. This experience is very isolated compared to other cultural pushes.
g. Therefore, this experience speaks to the broader phenomenon of ASL and its disability community. How does this experience affect audiences’ views of disabilities, the deaf and hard of hearing community, and language? The tool, when it arrives in our hands, changes our minds; it changes our very concept of ourselves and makes us more aware.
The internal biases of the individuals who trained the ML model might also seep in, and they might not be experts in the first place. This is worth considering.
Improving The Design
For Better Learning
Instead of merely serving as a gamified exercise for a few words, fingerspelling.xyz has the potential to be a platform for learning ASL efficiently. As covered in the Socio-Behavioral Psychology perspective, the interaction of perceiving the sign, enacting it, and then moving on to other signs happens too quickly for the user to comprehend the word they just signed. To make the most of the platform's potential, here are some suggestions we brainstormed:
- Suggestion one: Slow the speed for signing each letter.
  - The ML tutor signs the letters quite fast, so users can miss the opportunity to sign a letter, and in turn a word, mindfully. Slowing the tutor down can help users learn and retain better.
- Suggestion two: Instead of starting with words, it might be more fruitful to start with individual letters (like A, B, C, and D).
  - As with any other language, it could help to introduce users to the basic letters in ASL first and then move on to words.
- Suggestion three: Have more detailed practice levels, along with small quizzes for better learning and retention.
  - Just signing a word and then rapidly moving on to other ones may not allow users to grasp what they signed. To make their learning worthwhile, users can be presented with quizzes, short or comprehensive depending on their interests, to solidify their learning.
- Suggestion four: Have the guide sign a word and have the user identify it (reciprocation practice).
  - To crystallize users' learning, the tutor can offer a reverse version of the quiz in which it signs a word and the users have to identify it correctly.