Projects
A collection of my projects and experiments. Many of these started as ideas.
H2 Cognivolve [2025]
Cognivolve applies a four-stage “easy-to-hard” curriculum that trains a 124M-parameter GPT-2 from surface lexical matching up to multi-step symbolic reasoning in a single uninterrupted run. The syllabus unlocks an order of magnitude more gradient-salient reasoning heads, reallocates them into deeper transformer layers, and halves the optimisation steps required to reach moderate accuracy, all without extra parameters or compute. Attention-map and PCA analyses reveal a richer blend of local and long-range context, while progressive-specialisation tracking shows that early-emergent circuits persist and compound across stages. Although final-answer accuracy still trails a conventional baseline, the gains in interpretability, sample efficiency, and modular depth show that structured curricular progression can substitute for scale when cultivating reasoning in small language models.
Final project for Spring 2025’s CAS CS 505 Introduction to Natural Language Processing.
Code:
https://github.com/modulariumresearch/cognivolve
Paper:
https://arxiv.org/abs/2505.11643
Poster:
https://repo.fufoundation.co/research/cognivolve/cognivolve-poster.pdf
Advisor:
Andrew Wood
Acknowledgement:
Andrew Wood,
Nazia Tasnim,
Nilay Jain,
Jun Wang, and
Chakkai Yip
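To make the "single uninterrupted run" idea concrete, here is a minimal sketch of a stage schedule that maps a global training step to one of four curriculum stages. It is an illustration only, not Cognivolve's actual scheduler: the equal stage lengths, the function names, and the two middle stage names are assumptions (the description above names only the first and last stages).

```python
from bisect import bisect_right

# Hypothetical stage labels: only the first and last stages are described in
# the project summary; "stage_2" and "stage_3" are placeholders.
STAGES = ["lexical_matching", "stage_2", "stage_3", "symbolic_reasoning"]


def stage_boundaries(total_steps, fractions=(0.25, 0.25, 0.25, 0.25)):
    """Return the global step at which each stage ends.

    `fractions` is an assumed split of the training budget across stages.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("stage fractions must sum to 1")
    bounds, acc = [], 0.0
    for frac in fractions:
        acc += frac
        bounds.append(round(acc * total_steps))
    bounds[-1] = total_steps  # guard against rounding drift
    return bounds


def stage_for_step(step, bounds):
    """Map a global training step to its curriculum stage index."""
    idx = bisect_right(bounds, step)
    return min(idx, len(bounds) - 1)  # clamp steps at or past the end


bounds = stage_boundaries(1000)
current = STAGES[stage_for_step(600, bounds)]
```

In a training loop, the stage index would pick which dataset to sample from at each step, so the optimiser state and model weights carry over between stages without a restart.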
H2 RoR [2025 - Present]
Advisor:
Najoung Kim
Collaborator:
Seungmin Cho
H2 Chess with Human Reasoning [2025 - Present]
A chess engine that plays by simulating human reasoning.
Code:
https://github.com/modulariumresearch/chess-reasoning
Acknowledgement:
Jun Wang
H2 Sudoku Solver with Human Reasoning [2024 - Present]
A sudoku solver that solves puzzles by simulating human reasoning.
Code: https://github.com/modulariumresearch/sudoku-reasoning
H2 Open TacticAI [2024 - Present]
An open-source implementation of Google DeepMind’s paper “TacticAI: an AI assistant for football tactics.”
H2 Decentralized Neural Traffic Orchestration [2024]
We present DNC, a cooperative multi-agent reinforcement learning framework for traffic signal optimization. It introduces two key innovations: an Integrated Deep Q-Network (DQN) that combines hierarchical learning, shared experience, and state-sharing mechanisms, and a novel Communication-Aware State Fusion (CASF) method that uses attention mechanisms to dynamically weight state information based on neighborhood topology.
Final project for Fall 2024’s CDS DS 340 Introduction to Machine Learning and AI.
Code:
https://github.com/modulariumresearch/dnc
Advisor:
Kevin Gold
Collaborator:
George Jiang
Acknowledgement:
Jun Wang
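The idea behind attention-weighted state fusion can be sketched in a few lines. The example below is a simplified illustration, not the CASF implementation from the repository: it assumes scaled dot-product scores between an intersection's own state and its neighbours' states, restricted by an adjacency list. The names `fuse_states`, `states`, and `adjacency` are hypothetical.

```python
import math


def fuse_states(states, adjacency, agent):
    """Return (fused_state, weights) for one agent.

    states:    dict mapping agent id -> list of floats (local observation)
    adjacency: dict mapping agent id -> list of neighbouring agent ids
    """
    own = states[agent]
    # Topology mask: attend only to the agent itself and its direct neighbours.
    members = [agent] + [j for j in adjacency[agent] if j != agent]

    # Scaled dot-product score between the agent's state and each member's.
    scale = math.sqrt(len(own))
    scores = [sum(a * b for a, b in zip(own, states[j])) / scale for j in members]

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted average of the member states, one feature at a time.
    fused = [
        sum(w * states[j][k] for w, j in zip(weights, members))
        for k in range(len(own))
    ]
    return fused, dict(zip(members, weights))


states = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
adjacency = {0: [1, 2], 1: [0], 2: [0]}
fused, weights = fuse_states(states, adjacency, agent=0)
```

Here agent 0 gives more weight to neighbour 1, whose state resembles its own, than to neighbour 2. A learned version would replace the raw dot product with trained query and key projections.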
H2 Sedona [2024]
A converter that turns Jupyter notebooks into TXT or Markdown.
Website: https://sedona.fufoundation.co
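As a rough illustration of what such a conversion involves, the sketch below turns a notebook's JSON into Markdown using only the standard library. It is not Sedona's code. It assumes the nbformat 4 layout, in which a notebook has a top-level `cells` list and each cell carries a `cell_type` and a `source` that is either a string or a list of strings. The function name and the choice to drop cell outputs are assumptions.

```python
import json

FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting fences


def notebook_to_markdown(nb_json, code_lang="python"):
    """Convert notebook JSON text to Markdown, ignoring cell outputs."""
    notebook = json.loads(nb_json)
    parts = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        kind = cell.get("cell_type")
        if kind == "markdown":
            parts.append(source)
        elif kind == "code":
            parts.append(f"{FENCE}{code_lang}\n{source}\n{FENCE}")
        # Other cell types (e.g. raw) are skipped in this sketch.
    return "\n\n".join(parts) + "\n"


example = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n", "text"]},
        {"cell_type": "code", "source": "print(1)"},
    ],
})
markdown = notebook_to_markdown(example)
```

A plain-text variant could reuse the same loop and simply omit the fences around code cells.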
H2 Democratica
Final project for Summer 2024’s CAS PO 399 / 599 Data Science for Politics.
Code:
https://github.com/modulariumresearch/democratica
Advisor:
Ahyoung Cho
H2 KITE: Kernel-based Inference of Trait Epistasis [2024]
A Python package for detecting epistatic interactions in quantitative trait loci (QTL) data using kernel-based methods.
Final project for Spring 2024’s CDS DS 596 Foundations of Biological Data Science.
Code:
https://github.com/modulariumresearch/kite
Advisor:
Brian Cleary
H2 CoPDA: A Large-Scale Analytics System for Computational Policy Document Analysis of Chinese Government Policy Documents [2024]
A large-scale analytics system for computational analysis of Chinese government policy documents.
Code:
https://github.com/modulariumresearch/copda
Advisor:
Bo Feng
H2 BU Course Search Command Line Interface [2023]
A command line interface for searching for courses at Boston University.
H2 COVID-19 Policy Analysis with Microsoft Azure [2023]
A data analysis project using Microsoft Azure to analyze COVID-19 data.
Final project for Fall 2023’s CDS DS 310 Data Mechanics.
Code:
https://github.com/craigxiangfu/ds310-covid-policy-analysis
Members:
George Jiang,
Hannah Choe,
Jaden Hsiao,
Riya Parikh,
Tracy Cui,
Xiang Fu
Advisor:
Chris Seferlis
H2 CRNSA: California Road Network Schaubild Analytica [2023]
A network analysis project focused on the road network in California. We explored the road network’s structural properties, connectivity, and traffic patterns.
Final project for Spring 2023’s CDS DS 210 Programming for Data Science.
Code:
https://github.com/craigxiangfu/crn-schaubild-analytica
Advisor:
Leonidas Kontothanassis
H2 CBRP [2022]
Final project for Fall 2022’s CDS DS 110 Introduction to Data Science with Python.
Code:
https://github.com/craigxiangfu/ds110-cbrp
Collaborator:
Kayla Wu
Advisor:
Kevin Gold
H2 GomokuPro [2021]
A novel algorithm that combines a convolutional neural network with Monte Carlo tree search to address the long computation times of traditional board-game algorithms. By transforming a multidimensional fractional model into an adaptive model, we built upon a previously developed single-strategy machine learning framework capable of making judgments and predictions using human-like decision algorithms.
Paper:
https://ieeexplore.ieee.org/document/9778476
Code:
https://github.com/modulariumresearch/gomokupro
Advisor:
Vipul Goyal
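One common way to pair a neural network with Monte Carlo tree search is a PUCT-style selection rule, where the network's move priors steer which child node the search visits next. The sketch below shows that general technique only. Whether GomokuPro uses this exact rule is not stated here, and the function name, the argument layout, and the `c_puct` value are assumptions.

```python
import math


def puct_select(priors, visit_counts, value_sums, c_puct=1.5):
    """Return the index of the child maximising Q + U.

    priors:       network-assigned probability for each move
    visit_counts: number of times each child has been visited
    value_sums:   total backed-up value for each child
    """
    # The +1 keeps the exploration term non-zero before any child is visited.
    sqrt_total = math.sqrt(sum(visit_counts) + 1)
    best_index, best_score = None, -math.inf
    for i, (prior, visits, value_sum) in enumerate(
        zip(priors, visit_counts, value_sums)
    ):
        q = value_sum / visits if visits > 0 else 0.0  # mean value so far
        u = c_puct * prior * sqrt_total / (1 + visits)  # prior-guided bonus
        if q + u > best_score:
            best_index, best_score = i, q + u
    return best_index


# With no visits yet, the move with the highest prior is explored first.
first = puct_select([0.7, 0.2, 0.1], [0, 0, 0], [0.0, 0.0, 0.0])
# A heavily visited child loses out to an unvisited sibling with equal prior.
second = puct_select([0.5, 0.5], [10, 0], [2.0, 0.0])
```

Because the bonus term shrinks as a child accumulates visits, the search spends fewer simulations on moves the network considers unlikely, which is one way such hybrids cut computation time.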