Hello, I'm

Troy Arthur

NLP/AI Researcher & Software Engineer

Top-ranked globally at SemEval 2026 · 3+ years distributed systems · MS Computational Linguistics @ CU Boulder

01. About

I'm a software engineer with 3+ years of industry experience in distributed systems and data infrastructure, currently pursuing my MS in Computational Linguistics at the University of Colorado, Boulder — where I also completed my BA in Psychology with minors in CS and Political Science.

My research focus is NLP and affective computing. I competed in SemEval 2026 against 25+ international teams, including nationally funded research labs, and placed #1 globally on the Affective Valence prediction track, outperforming the baseline by 24% with models trained entirely on a single local machine. A first-author paper on the work has been accepted (forthcoming in the ACL Anthology).

I'm resourceful: I find ways to make things work with what's available, and I take satisfaction in getting strong results from constrained setups. I'm looking for research-focused ML infrastructure roles at the intersection of systems engineering and applied AI.

Technologies

  • Python
  • PyTorch
  • Hugging Face
  • Spark / Scala
  • C++
  • TypeScript
  • SQL
  • HDFS
  • Google Protobuf
  • NumPy / Sklearn
  • ChromaDB
  • spaCy
  • Linux Shell
  • Git

02. Experience

Software Engineer

Wiland, Inc.

Intern → Associate → Software Engineer

Spark · Scala · SQL · C++ · TypeScript · Python · HDFS · Protobuf · Linux Shell · Git
  • Accelerated data pipelines through large-scale stack migration and distributed systems optimization.
  • Optimized legacy C++ software, achieving significant improvements in system throughput and performance.
  • Guided system architecture — produced ERDs and architecture diagrams to document and inform infrastructure design decisions.
  • Security focus — hardened data infrastructure against unauthorized access and system vulnerabilities.

03. Education

MS in Computational Linguistics

University of Colorado, Boulder
  • GPA: 4.0 · CLASIC Scholarship (Sole recipient)
  • Coursework: Natural Language Processing, Deep Learning, Advanced Algorithms, Morphology & Syntax

BA in Psychology, Minors in CS & Political Science

University of Colorado, Boulder
  • CS GPA: 4.0 · Latin Honors (thesis defense) · Undergraduate Research Grant
  • Roles: Undergraduate Research Assistant (Stereotyping & Prejudice Lab); Course Assistant (Discrete Structures)

04. Projects

Affective State Prediction

Novel ensemble and hierarchical transformer model that predicts user emotional states from text. Competed in SemEval 2026 against 25+ international teams and placed #1 globally on the Affective Valence prediction track, outperforming the baseline by 24%, with all models trained locally on a single machine. First-author paper accepted (forthcoming in the ACL Anthology).

Python · PyTorch · Hugging Face · NumPy · Sklearn

CoRefRAG

Coreference-motivated RAG system introducing three novel discourse-aware document chunking strategies grounded in entity accessibility theory. Coreference chains inform chunk boundaries to reduce referential ambiguity in retrieved context. Evaluated over 412 Wikipedia articles from SQuAD 2.0 with custom chunking metrics and a full downstream QA pipeline.

Python · Hugging Face · ChromaDB · spaCy · FastCoref · Claude API · BERTScore

Research Data Aggregator

High-performance C++ data processing pipeline for a CU Psychology research team. Replaced brute-force Python parsing with a single-pass sliding-window algorithm, cutting processing time from an estimated 6 months to under 5 minutes. Custom BST for geographic CSV indexing.

C++ · R

05. Contact

I'm actively looking for research-focused ML infrastructure roles. Whether you have a question, an opportunity, or just want to connect — my inbox is open.

Say Hello