Publications
Peer-reviewed research and contributions in machine learning
Computational Attestations of Polynomial Integrity Towards Verifiable Back-Propagation
Future Technologies Conference (FTC), 2025
Cryptology ePrint Archive, 2025
Recent advancements in machine learning accuracy and utility have been driven by the effective combination of sophisticated models with high-performance computational scaling. As the development of large-scale models shifts away from commodity hardware to outsourced computation, it becomes paramount to ensure that the training process is executed with integrity and transparency. This encompasses verifying that adequate computational resources were expended and that the resulting model is accurate, rather than the product of skipped steps or resource-saving shortcuts by the external provider. Building on our previous work, which demonstrated the computational feasibility of arguing the correctness of differentially-private linear regression in this system, we extend those results to achieve fully provable back-propagation, a cornerstone operation in modern machine learning training. Our system achieves complete zero-knowledge, revealing nothing about the input data during training, and ensures quantum security by avoiding weak cryptographic primitives. Efficiency is substantially increased through the use of a fixed-point decimal representation, reducing the computational overhead typically associated with floating-point arithmetic. Notably, our solution is doubly efficient, achieving a logarithmic-time verifier and a linear-time prover. Implemented entirely in Rust without reliance on external machine learning libraries, and executed within a cryptographically secure virtual machine, this work represents a significant step toward verifiable, secure, and efficient outsourced machine learning computation.
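The fixed-point representation mentioned in the abstract can be illustrated with a short sketch. The paper's implementation is in Rust inside a cryptographically secure virtual machine; the Python below is only a minimal, hypothetical illustration of how reals can be encoded as scaled integers so that training arithmetic avoids floating point. The scale constant and helper names are assumptions, not taken from the paper.

```python
# Illustrative sketch of fixed-point encoding as commonly used in
# verifiable-ML systems; the scale constant and helper names here are
# hypothetical, not the paper's actual Rust implementation.

SCALE = 1 << 16  # 16 fractional bits (assumed precision)

def encode(x: float) -> int:
    """Map a real number to a fixed-point integer."""
    return round(x * SCALE)

def decode(q: int) -> float:
    """Map a fixed-point integer back to a real number."""
    return q / SCALE

def fxp_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values, rescaling to keep 16 fractional bits."""
    return (a * b) // SCALE

# Example: 1.5 * 2.25 = 3.375 using only integer arithmetic.
print(decode(fxp_mul(encode(1.5), encode(2.25))))  # 3.375
```

Multiplication doubles the number of fractional bits, so each product is rescaled by SCALE; keeping all values as bounded integers is what makes this style of arithmetic cheap to express inside a proof system compared to floating point.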
Large Language Model-Powered Conversational Agent Delivering Problem-Solving Therapy (PST) for Family Caregivers: Enhancing Empathy and Therapeutic Alliance Using In-Context Learning
American Medical Informatics Association (AMIA), 2025
arXiv, 2025
Family caregivers often face substantial mental health challenges due to their multifaceted roles and limited resources. This study explored the potential of a large language model (LLM)-powered conversational agent to deliver evidence-based mental health support for caregivers, specifically Problem-Solving Therapy (PST) integrated with Motivational Interviewing (MI) and Behavioral Chain Analysis (BCA). A within-subject experiment was conducted with 28 caregivers interacting with four LLM configurations to evaluate empathy and therapeutic alliance. The best-performing models incorporated Few-Shot and Retrieval-Augmented Generation (RAG) prompting techniques alongside clinician-curated examples. These models showed improved contextual understanding and personalized support, as reflected in qualitative responses and quantitative ratings of perceived empathy and therapeutic alliance. Participants valued the models' ability to validate emotions, explore unexpressed feelings, and provide actionable strategies. However, balancing thorough assessment with efficient advice delivery remains a challenge. This work highlights the potential of LLMs to deliver empathetic, tailored support for family caregivers.
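As a rough illustration of how Few-Shot exemplars and RAG can be combined at prompt-assembly time, consider the sketch below. The toy retriever, prompt format, and function names are illustrative assumptions, not the study's actual pipeline.

```python
# Hypothetical sketch of combining Few-Shot examples with Retrieval-Augmented
# Generation (RAG) when assembling a prompt; all names and formats here are
# illustrative, not the study's actual implementation.

def retrieve_context(user_message: str, knowledge_base: list[str], k: int = 2) -> list[str]:
    """Toy lexical retriever: rank passages by word overlap with the message."""
    words = set(user_message.lower().split())
    scored = sorted(knowledge_base, key=lambda p: -len(words & set(p.lower().split())))
    return scored[:k]

def build_prompt(user_message: str,
                 few_shot: list[tuple[str, str]],
                 knowledge_base: list[str]) -> str:
    """Interleave clinician-curated exemplars and retrieved PST/MI material."""
    parts = ["You are a supportive agent delivering Problem-Solving Therapy (PST)."]
    parts += [f"Example caregiver: {q}\nExample response: {a}" for q, a in few_shot]
    parts += ["Relevant guidance:"] + retrieve_context(user_message, knowledge_base)
    parts.append(f"Caregiver: {user_message}\nResponse:")
    return "\n\n".join(parts)
```

In a configuration like this, the few-shot exemplars anchor the model's tone and structure while the retrieved passages supply session-specific therapeutic content, which is broadly the division of labor the study describes.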
Computational Attestations of Polynomial Integrity Towards Verifiable Machine-Learning
Future Technologies Conference (FTC), 2024
arXiv, 2025
Cryptology ePrint Archive, 2024
Machine-learning systems continue to advance at a rapid pace, demonstrating remarkable utility across many fields and disciplines. As these systems grow in size and complexity, a nascent industry is emerging that aims to bring machine-learning-as-a-service (MLaaS) to market. Outsourcing the training and operation of these systems to powerful hardware carries numerous advantages, but challenges arise when privacy and the correctness of the work carried out must be ensured. Recent advancements in zero-knowledge cryptography have led to a means of generating arguments of integrity for any computation, which in turn can be efficiently verified by any party, in any place, at any time. In this work we prove the correct training of a differentially-private (DP) linear regression over a dataset of 50,000 samples on a single machine in less than 6 minutes, verifying the entire computation in 0.17 seconds. To our knowledge, this represents the fastest known instance in the literature of provable-DP over a dataset of this size. We believe this result constitutes a key stepping-stone toward end-to-end private MLaaS.
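For readers unfamiliar with the workload being proven, the following is a minimal sketch of differentially-private linear regression via output perturbation with the Laplace mechanism. The sensitivity bound and epsilon shown are illustrative assumptions, not necessarily the mechanism the paper attests to in zero knowledge.

```python
# A minimal sketch of DP linear regression via output perturbation;
# the sensitivity bound and epsilon below are illustrative assumptions,
# not the mechanism the paper actually proves.

import numpy as np

def dp_linear_regression(X: np.ndarray, y: np.ndarray,
                         epsilon: float, sensitivity: float) -> np.ndarray:
    """Fit ordinary least squares, then add Laplace noise scaled to sensitivity/epsilon."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon, size=theta.shape)
    return theta + noise

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 5))  # dataset size matching the abstract
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.1, size=50_000)
print(dp_linear_regression(X, y, epsilon=1.0, sensitivity=0.01))
```

The point of the paper is that a training run like this can be accompanied by a succinct argument of integrity, so the 6-minute computation above is checkable by anyone in a fraction of a second.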
Toward Large Language Models as a Therapeutic Tool: Comparing Prompting Techniques to Improve GPT-Delivered Problem-Solving Therapy
American Medical Informatics Association (AMIA), 2024
arXiv, 2024
While Large Language Models (LLMs) are being quickly adapted to many domains, including healthcare, their strengths and pitfalls remain under-explored. In this study, we examine the effects of prompt engineering on guiding LLMs in delivering parts of a Problem-Solving Therapy (PST) session via text, particularly during the symptom identification and assessment phase for personalized goal setting. We present evaluations of the models' performance by automatic metrics and by experienced medical professionals. We demonstrate that the models' capability to deliver protocolized therapy can be improved with the proper use of prompt engineering methods, albeit with limitations. To our knowledge, this study is among the first to assess the effects of various prompting techniques on enhancing a generalist model's ability to deliver psychotherapy, focusing on overall quality, consistency, and empathy. Exploring LLMs' potential for delivering psychotherapy holds promise amid the current shortage of mental health professionals and significant unmet need, enhancing the potential utility of AI-based and AI-enhanced care services.
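The kind of prompting-condition comparison described above can be sketched as follows; the templates and the stubbed generate callable are hypothetical stand-ins for the study's actual GPT configurations.

```python
# Illustrative sketch of comparing prompting conditions for the PST
# symptom-assessment phase; templates and the model call are hypothetical
# stand-ins, not the study's actual configurations.

ZERO_SHOT = (
    "You are delivering the symptom-assessment phase of Problem-Solving "
    "Therapy (PST). Help the user identify and rank their stressors.\n\nUser: {msg}"
)

FEW_SHOT = (
    "You are delivering the symptom-assessment phase of PST.\n\n"
    "User: I can't sleep and I snap at my kids.\n"
    "Therapist: It sounds like exhaustion may be feeding the irritability. "
    "Which of the two feels most urgent to work on first?\n\n"
    "User: {msg}"
)

def compare_prompts(msg: str, generate) -> dict[str, str]:
    """Run the same user message through each prompting condition."""
    return {name: generate(tpl.format(msg=msg))
            for name, tpl in {"zero_shot": ZERO_SHOT, "few_shot": FEW_SHOT}.items()}

# `generate` would wrap an LLM API call; a stub keeps the sketch self-contained.
print(compare_prompts("I feel overwhelmed caring for my mother.",
                      lambda p: f"<model output for: {p[:40]}...>"))
```

Holding the user message fixed across conditions is what allows the resulting transcripts to be rated head-to-head for quality, consistency, and empathy, as in the evaluation described above.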