Furqan Ali

I build infrastructure that makes systems learnable and auditable. Not just stable. Stable is the baseline. What I care about is whether a system teaches you something when it breaks. Can you trace what happened? Can you understand why? Can you prevent it from happening again? That’s what I mean by learnable and auditable. That’s the standard I hold myself to.

Before I write a single line of code, I ask myself: what problem am I actually solving? What does good look like here? That question has guided me through research at Stanford, where I helped cut the performance gap to GPT-4.1 from 80% to 10%. It guided me through engineering at Capital One, where I built data pipelines that improved onboarding conversion by 20%. And it guided me through full-stack development at Everyone Can Code, where I shipped NLP applications that increased engagement by 25%. The work changes, but the question stays the same.

My approach to AI ethics:

I don’t think about ethics as a set of abstract principles that I write down and forget about. I think about it as operational hygiene. It’s the everyday discipline of building systems that are transparent, traceable, and accountable. If I can’t look back at a decision and understand why it was made, I can’t trust the system. And if I can’t trust it, I shouldn’t ship it.

That’s why I build gates and audit trails into everything I ship. Not because someone told me to, but because I’ve learned the hard way that gaps are invisible until you build the infrastructure to see them.

Here’s a real example:

I was running an evaluation pipeline for AI feedback and noticed something that didn’t sit right. Comments from non-native English writers were being labeled as unclear and quietly dropped from the urgent queue. The model wasn’t malicious. It was just doing what I asked it to do. But I had given it an easy out without thinking about who that would affect.

When I dug into the numbers, I found that we were missing about 40% of urgent issues from people who were already struggling to be heard. That wasn’t a model problem. That was a design problem. So I built a gate. A simple override that flagged unclear comments for human review if they contained certain keywords. No new model. No complex fix. Just a deliberate decision about where I could trust the model and where I couldn’t.
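
As a rough illustration only, a gate of this kind can be very small. The sketch below is not the production code: the keyword list, the label names ("urgent", "normal", "unclear"), the queue names, and the `route` function are all hypothetical, chosen just to show the shape of the override.

```python
from dataclasses import dataclass

# Hypothetical keyword list; the actual keywords are not stated in this write-up.
REVIEW_KEYWORDS = ("urgent", "broken", "cannot", "error", "help")


@dataclass
class Comment:
    text: str
    model_label: str  # assumed labels: "urgent", "normal", or "unclear"


def route(comment: Comment) -> str:
    """Return the (hypothetical) queue name a comment should be sent to."""
    if comment.model_label == "unclear":
        lowered = comment.text.lower()
        # The gate: do not trust "unclear" when the text hints at urgency.
        if any(keyword in lowered for keyword in REVIEW_KEYWORDS):
            return "human_review"
        return "low_priority"
    # For every other label, the model's decision is used as-is.
    return comment.model_label
```

The point of the design is that the model's "unclear" label is never the final word on its own: anything that might be urgent goes to a person instead of silently leaving the queue.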

That one change improved recall from 60% to 85%. More importantly, it taught me something. Ethics in AI isn’t about having the right principles. It’s about building systems that catch your blind spots before they become someone else’s problem. That’s what operational hygiene looks like in practice.

What I’m working on now:

I’m currently contributing to the WordPress ecosystem through the Make program, focusing on testing frameworks and performance measurement tools. I believe open infrastructure is a public good, and I want to help make it more reliable and transparent for everyone who depends on it.

When I’m not building, I’m writing about what I learn. Documenting failures. Sharing patterns. Iterating on the small improvements that add up over time. That’s how I get better. That’s how the work gets better.
