Who Decides What Scared Is?
A couple of weeks ago I took part in a hackathon that focused on hacking for social impact. It brought social problems to the forefront with an opportunity for using technology to solve them. Lately I’ve been trying to challenge myself to rethink computing solutions in terms of AI, especially in terms of outcomes, biases, and security.
I have always been interested in pursuing computing for social good, so a platform like this was super exciting. It was also an opportunity to build solutions in non-typical problem spaces, which forced me to think differently about how I approach and solve a problem.
The Problem
I was faced with a problem that was intrinsically unique. We wanted to utilize large language models and other language AI tools for their benefits, but we also needed guardrails. Guardrails around personal information, around the content that we capture and the content that we can store, especially relative to legal limits like subpoenas. All of this while still letting the end user be more efficient and productive.
In most engineering work, you’re building with safety in mind but with well-understood tradeoffs, where sometimes the worst case is a bug you fix later. Here, the product was still needed and the constraints were very clear: getting the balance wrong had really scary consequences for the people it served.
Part of our problem was solving language translation in real time during a sensitive conversation that required no distractions, the kind of conversation where you don’t want to break eye contact once trust is built. We also had a multicultural dynamic where the norms in terms of behavior and body language could be different, and the person using the tool might not know that. So how do we make sure those nuances are preserved and not judged by a tool that doesn’t have the knowledge to make the right call?
That question forced me to think of creative ways to use language models to transcribe conversations to flat text while capturing the sentiments of the conversation very specifically, but not making judgments without a human in the loop.
For example, the model could register that a speaker paused, or that their voice trembled, but it could never say they seemed scared, because who decides what scared means in text, especially across cultural norms? A long pause might mean distress, or it might mean someone translating a thought from their first language before responding. Whispering might signal fear, or it might just be how someone speaks when processing something difficult in a language that isn’t their own. A shift in vocal pace could be anxiety, or it could be comfort, someone slipping back into the rhythm of their mother tongue. If the AI is trained on English speakers and English-speaking norms, it has no reliable way to interpret these cues from someone outside of that context.
This turned out to be one of the most important design decisions in the whole system, because interpretation is where bias lives. The moment you let the model cross that line from describing to interpreting, you’ve embedded an assumption about what those signals mean. And in a sensitive context, that assumption carries weight. I went with a design that transcribes a conversation in real time into flat text, capturing the critical pieces as bracketed markers, things like “[speaker paused for 4 seconds]” or “[voice tremor]” or “[pace of speech increased]”. Factual observations, not conclusions. The human in the loop makes the call on what those moments conveyed when they review the data.
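To make the describe-don't-interpret idea concrete, here is a minimal Python sketch. It is not the code from the hackathon. The names (Observation, Segment, render_flat_transcript) and the 3-second pause threshold are assumptions made up for illustration, and the sketch assumes some upstream step has already produced timestamped segments. The point it shows is that the set of markers is closed and purely descriptive, so a label like "seems scared" has nowhere to go.

```python
from dataclasses import dataclass
from enum import Enum


class Observation(Enum):
    """Closed set of descriptive markers (hypothetical names).

    There is deliberately no member for an interpretation such as 'scared'.
    """
    VOICE_TREMOR = "voice tremor"
    PACE_INCREASED = "pace of speech increased"


@dataclass(frozen=True)
class Segment:
    speaker: str
    start: float  # seconds from the start of the conversation
    end: float
    text: str
    observations: tuple = ()


# Assumed cutoff: silences shorter than this are not recorded at all.
PAUSE_THRESHOLD_SECONDS = 3.0


def render_flat_transcript(segments):
    """Render segments as flat text with bracketed, factual markers."""
    lines = []
    previous_end = None
    for seg in segments:
        # A pause is measured from timestamps, never inferred from content.
        if previous_end is not None:
            gap = seg.start - previous_end
            if gap >= PAUSE_THRESHOLD_SECONDS:
                lines.append(f"[speaker paused for {round(gap)} seconds]")
        for obs in seg.observations:
            # Anything outside the closed set is rejected, not rendered.
            if not isinstance(obs, Observation):
                raise ValueError(f"not a descriptive observation: {obs!r}")
            lines.append(f"[{obs.value}]")
        lines.append(f"{seg.speaker}: {seg.text}")
        previous_end = seg.end
    return "\n".join(lines)
```

Given one segment ending at 2.0 seconds and the next starting at 6.0 seconds with a VOICE_TREMOR observation, the output would contain “[speaker paused for 4 seconds]” and “[voice tremor]” between the two spoken lines. What those markers meant is left to the human reviewing the transcript.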
Whose Norms?
The reason this mattered so much is that we were building for a subculture where conventional Western semantics might not apply directly. And that forced a deeper question: what kind of data are the frontier large language models trained on? Whose norms are digitally documented enough to be captured as training data?
The internet isn’t a mirror of the world. It’s a mirror of the parts of the world that are online, well-documented, and producing text at scale. Entire cultures, communication patterns, and ways of expressing distress are invisible to the model because they simply aren’t in the data. When you build for a community whose norms aren’t well-represented, the model’s “common sense” is someone else’s common sense. What a conversation sounds like, what distress sounds like, what family looks like, all of it shaped by data that may have nothing to do with the people you’re building for. The tools are built for and by the people who are already digital, already documented, already producing the text that trains the next generation of models. Everyone else inherits a system that was never designed with them in mind.
I was really challenged to think through all the places where we can introduce bias: in the transcription, in the extraction, in the interpretation, and in the assumptions baked into the schema itself. Those are all places where I need to be super cautious as a builder.
The Left Behind
I really appreciated the opportunity to work on something like this, and I continue to seek out places where I am challenged to build products in the AI world. Every time I step into a new problem space, I come out thinking differently about the ones I was already in.
This one forced me to think about the gaps in the current AI ecosystem and the people that get left behind at every stage of a technical revolution, even as we enter this new wave. It’s not just that these communities are the last to get tools built for them. It’s what happens when they do.
When AI tools reach communities whose norms aren’t reflected in the training data, they arrive carrying assumptions. A model trained predominantly on one culture’s data doesn’t just fail to understand another culture. It overwrites it. Adoption in these communities sets a default for what normal looks like, what a proper response sounds like, what the “right” way to express something is, except it isn’t their right way.
Communication patterns shaped by generations of cultural context, reduced to whatever the model was trained to recognize. Ways of expressing pain, trust, hesitation, all filtered through a lens that was never built for them. The richness and specificity of how different communities interact doesn’t get captured. It gets flattened. And the more these tools become the standard, the more that flattening accelerates. Their norms, their ways of doing things, get quietly washed away by tools that treat the incomplete training data as universal truth.
As builders, we have to think about this. It’s not enough to ask “does this tool work for everyone?” We have to ask “whose version of the world did we encode into this tool, and what happens to the communities that version doesn’t include?”
I keep coming back to this: how do we set ourselves up to bridge that gap? How do we make sure that this wave of technology reaches the people it could help the most, without erasing the things that make their communities what they are?
I don’t have a clean answer, but every day I am challenging myself to be in spaces where I am forced to keep these questions in mind.