Agent Engineer
I build AI agents.
Then I find out if they actually work.
Four projects, one shared engine. I grade them by exact matching and span overlap, not by asking another model if the answer looks right. Every number here is real, including the ones that did not go my way.
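
For readers who want to see what grading by exact matching and span overlap can look like, here is a minimal sketch. It assumes token-level F1 as the overlap measure and a simple lowercase, whitespace-split normalization. The function names are illustrative, not the shared engine's actual code.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (an assumed normalization)."""
    return text.lower().split()


def exact_match(prediction: str, reference: str) -> bool:
    """True only when the normalized token sequences are identical."""
    return normalize(prediction) == normalize(reference)


def span_overlap_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between prediction and reference (assumed overlap metric)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a perfect match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    shared = sum((Counter(pred) & Counter(ref)).values())
    if shared == 0:
        return 0.0
    precision = shared / len(pred)
    recall = shared / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Both checks are deterministic: the same prediction and reference always yield the same score, which is the point of grading this way rather than asking another model.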