Welcome
Welcome to my personal website! Here you'll find information about me, my research, and my cats.

Patrick Pynadath

PhD Student @ Purdue University
Based in West Lafayette, IN

About Me

Hello! I'm Patrick Pynadath, a PhD student passionate about probabilistic machine learning and discrete generative modeling. I am pursuing my PhD at Purdue University under the guidance of Prof. Ruqi Zhang.

In the past, I have worked on fundamental statistical methods, applications of sampling techniques to LLMs, and LLM safety. Currently, I am very excited about discrete diffusion.

Feel free to reach out if you'd like to connect or discuss potential collaborations!

My Cats

Missy
Boba

Meet my two research assistants. Missy specializes in Supervised Keyboard Tuning (SKT), demonstrated whenever I am coding, while Boba is our resident expert in discrete nap optimization. They are a crucial component of every research project.

Research Projects

Preprint

🍭 CANDI: Hybrid Discrete-Continuous Diffusion Models

Authors: Patrick Pynadath, Jiaxin Shi, Ruqi Zhang

We figure out why continuous diffusion has struggled on discrete data and introduce CANDI, a principled solution.

Overview

Continuous diffusion does extremely well on images but struggles on discrete data. We introduce token identifiability as a lens for studying Gaussian noise on discrete data, and discover a temporal dissonance between discrete identity corruption and continuous rank degradation: both are vital for continuous diffusion, but they become misaligned as the number of categories grows. CANDI disentangles the two forms of corruption and coordinates them with an explicit masking schedule, bringing the benefits of continuous diffusion to discrete spaces.
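For intuition, here is a minimal sketch of the hybrid corruption idea in PyTorch. The linear schedules, the vocabulary size, and the way masking erases a token's identity are illustrative assumptions on my part, not the paper's implementation:

    import torch
    import torch.nn.functional as F

    def hybrid_corrupt(tokens, vocab_size, t, alpha_bar, mask_rate):
        """Corrupt tokens two ways: Gaussian noise on the one-hot vectors
        (continuous rank degradation) plus explicit masking of token
        identities (discrete identity corruption) on its own schedule."""
        x0 = F.one_hot(tokens, vocab_size).float()
        noise = torch.randn_like(x0)
        a = alpha_bar(t)
        x_t = a ** 0.5 * x0 + (1.0 - a) ** 0.5 * noise   # continuous corruption
        mask = torch.rand(tokens.shape) < mask_rate(t)   # explicit masking schedule
        x_t[mask] = 0.0                                  # erase identity where masked
        return x_t, mask

    # Illustrative linear schedules; the point of decoupling is that the two
    # corruptions can now be coordinated instead of entangled in one Gaussian.
    tokens = torch.randint(0, 50257, (4, 16))
    x_t, mask = hybrid_corrupt(tokens, 50257, t=0.5,
                               alpha_bar=lambda t: 1.0 - t,
                               mask_rate=lambda t: t)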

TL;DR

  • Token identifiability explains how Gaussian noise corrupts discrete data.
  • We find temporal dissonance between identity corruption and rank degradation that hurts continuous diffusion.
  • CANDI decouples and coordinates the corruptions to improve discrete diffusion.
NeurIPS 2025

🔓 VERA: Variational Inference Framework for Jailbreaking Large Language Models

Authors: Anamika Lochab, Lu Yan, Patrick Pynadath, Xiangyu Zhang, Ruqi Zhang

We introduce a scalable and effective variational-inference framework for jailbreaking/red-teaming LLMs.

Overview

Black-box safety testing is crucial because many powerful LLMs are API-only. Existing genetic-algorithm jailbreakers require curated seed prompts and must be re-run from scratch for every test case. VERA instead treats jailbreak generation as probabilistic inference: we train a small attacker model to learn the distribution of adversarial prompts. Once trained, VERA produces diverse, natural jailbreaks instantly, without additional optimization, and succeeds across target models.
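As a toy illustration of the variational view, the sketch below learns a categorical "attacker" distribution from black-box scores using a REINFORCE gradient with an entropy bonus. The judge function and its reward are invented stand-ins for scoring the target model's responses, and VERA's actual attacker is a trained LLM rather than a single categorical:

    import torch

    V = 100                                           # toy space of candidate prompts
    logits = torch.zeros(V, requires_grad=True)       # attacker distribution q_theta
    opt = torch.optim.Adam([logits], lr=0.1)

    def judge(idx):
        """Hypothetical black-box score of the target model's response."""
        return idx.float() / V                        # pretend higher ids jailbreak better

    for step in range(200):
        q = torch.distributions.Categorical(logits=logits)
        idx = q.sample((64,))                         # sample candidate prompts
        reward = judge(idx)                           # scores carry no gradient
        # REINFORCE estimate of grad E_q[reward], plus an entropy bonus so q
        # keeps covering diverse successful prompts instead of collapsing
        loss = -(q.log_prob(idx) * (reward - reward.mean())).mean() - 0.01 * q.entropy()
        opt.zero_grad()
        loss.backward()
        opt.step()

Once such a distribution is trained, new candidate attacks are just samples from it, which is what makes the amortized approach fast at test time.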

TL;DR

  • Black-box jailbreaking reveals realistic vulnerabilities.
  • Genetic algorithms are brittle and costly to rerun.
  • VERA learns a distribution of jailbreaks for fast, diverse attacks.
ICLR 2025

🎛️ Controlled LLM Decoding via Discrete Auto-regressive Biasing

Authors: Patrick Pynadath, Ruqi Zhang

We use gradient-based discrete sampling to enable plug-and-play control over LLM generation.

Overview

Controlling LLM outputs requires balancing fluency with constraint satisfaction, a trade-off that energy-based decoding in continuous space struggles with. We introduce Discrete Auto-regressive Biasing (DAB), a decoding algorithm that stays entirely in the discrete token domain, using gradient-based discrete MCMC within a Langevin-within-Gibbs framework. DAB defines a joint distribution over the generated text and an auxiliary bias sequence, delivering better constraint satisfaction and fluency at lower computational cost.
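The sketch below shows the kind of gradient-informed proposal DAB builds on: a first-order Taylor approximation of the energy scores every candidate token swap, and a categorical proposal samples new tokens. The toy energy is mine, the Metropolis-Hastings correction is omitted for brevity, and the auxiliary bias sequence from the full Langevin-within-Gibbs scheme is not shown:

    import torch
    import torch.nn.functional as F

    def gradient_proposal(seq, energy, vocab_size, temp=0.5):
        """Propose new tokens at every position using d(energy)/d(one-hot)."""
        x = F.one_hot(seq, vocab_size).float().requires_grad_(True)
        grad = torch.autograd.grad(energy(x), x)[0]
        cur = (grad * x).sum(-1, keepdim=True)        # score of the current token
        # First-order estimate of the energy change for each candidate swap;
        # lower predicted energy => higher proposal probability
        logits = -(grad - cur) / temp
        return torch.distributions.Categorical(logits=logits).sample()

    # Toy energy that rewards token 7 everywhere (differentiable in the one-hot)
    energy = lambda x: -x[..., 7].sum()
    seq = torch.randint(0, 20, (16,))
    for _ in range(10):
        seq = gradient_proposal(seq, energy, vocab_size=20)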

TL;DR

  • Continuous-space control struggles with discrete text constraints.
  • DAB keeps sampling discrete while leveraging gradients.
  • Achieves better constraint satisfaction and fluency at lower cost.
NeurIPS 2024

🚲 Gradient-based Discrete Sampling with Automatic Cyclical Scheduling

Authors: Patrick Pynadath, Riddhiman Bhattacharya, Arun Hariharan, Ruqi Zhang

We enable gradient-based discrete sampling methods to handle multimodal distributions by introducing automatically tuned cyclical schedules.

Overview

Discrete distributions in deep models are highly multimodal. Gradient-based samplers get trapped in local modes. We propose automatic cyclical scheduling that alternates large exploratory steps with small exploitative steps, combines balanced proposals, and auto-tunes hyperparameters across datasets. This yields efficient multimodal sampling with convergence guarantees and strong empirical performance.
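A minimal sketch of the cyclical part of the idea: a cosine schedule that restarts every cycle gives large exploratory steps early in each cycle and small exploitative steps late. The constants here are placeholders; a key contribution of the paper is tuning them automatically:

    import math

    def cyclical_step_size(step, cycle_len, max_step, min_step):
        """Cosine-decayed step size that restarts at the top of every cycle."""
        phase = (step % cycle_len) / cycle_len        # position within the cycle
        return min_step + 0.5 * (max_step - min_step) * (1.0 + math.cos(math.pi * phase))

    # Example: 1000 sampler iterations in cycles of 100 steps; the sampler
    # jumps between modes after each restart, then settles into the nearest one.
    sizes = [cyclical_step_size(s, 100, max_step=2.0, min_step=0.05) for s in range(1000)]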

TL;DR

  • Multimodal discrete targets trap vanilla gradient samplers.
  • Cyclical schedules alternate exploration and exploitation with balanced proposals.
  • Automatic tuning adapts across datasets while retaining theoretical guarantees.

🎓 Education

My academic background

PhD in Computer Science

Purdue University • Ongoing

MS in Computer Science

Northwestern University • December 2022

BA in Mathematics

Northwestern University • June 2022

📬 Get In Touch

I'm always interested in connecting with like-minded people and exploring new opportunities.

Whether you're looking to collaborate on a project, have a question about my work, or just want to say hello, I'd love to hear from you!