TL;DR
We introduce DAB (Discrete Auto-regressive Biasing), a novel technique that outperforms existing methods for controlled text generation. DAB achieves state-of-the-art results in sentiment steering, toxicity reduction, and keyword inclusion while maintaining superior text fluency and grammatical correctness.

Controlled LLM Decoding via Discrete Auto-Regressive Biasing

Patrick Pynadath Ruqi Zhang
Purdue University, Department of Computer Science

Abstract

Controlled text generation allows for enforcing user-defined constraints on large language model outputs—critical as LLMs become increasingly prevalent. While energy-based decoding methods combine multiple constraints through weighted averages, they often struggle to balance fluency with constraint satisfaction.

We identify that this suboptimal balance stems from sampling in continuous space rather than the natural discrete space of text tokens. Our solution, Discrete Auto-regressive Biasing (DAB), leverages gradients while operating entirely in the discrete text domain.

DAB introduces a novel formulation by defining a joint distribution over the generated sequence and an auxiliary bias sequence. To efficiently sample from this distribution, we propose a Langevin-within-Gibbs sampling algorithm using gradient-based discrete MCMC.

Our method significantly improves constraint satisfaction while maintaining superior fluency—all with reduced computational costs. Experiments demonstrate DAB's advantages on sentiment control, language detoxification, and keyword-guided generation tasks.

Figure 1: DAB Overview

High-level diagram for the proposed DAB algorithm. Given prompts and an external constraint, DAB iteratively improves constraint satisfaction while preserving fluency by leveraging discrete sampling to work in the natural discrete domain of text.

Problem Statement

Current energy-based controlled decoding methods face a fundamental challenge: they operate in continuous token probability space, while natural language is inherently discrete.

This mismatch leads to suboptimal balance between constraint satisfaction and fluency, requiring extensive hyperparameter tuning of energy function coefficients with limited success.

⚠️ Key Challenge: How can we perform effective controlled text generation while embracing, rather than avoiding, the discrete nature of text tokens?

Framework

We propose a novel formulation for controlled text generation that operates natively in discrete token space, avoiding the continuous approximations that limit previous approaches.

Joint Distribution: Our method defines a joint distribution over both the generated sequence Y and an auxiliary bias sequence B, conditioned on the prompt X.

This approach models the ideal balance between two objectives:

  • Constraint Satisfaction: Measured by control metric performance
  • Language Fluency: Maintained by preserving the LLM's original distribution

By formulating the problem in discrete space, DAB better captures the true distribution of well-formed text while satisfying constraints.
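To make the formulation concrete, below is a minimal code sketch of one way such an (unnormalized) joint log-density over Y and B could be assembled, assuming it decomposes into an LM fluency term on Y, a constraint term on B, and an embedding-distance coupling between the two sequences (consistent with the distance penalty described in Figure 2). The function and argument names are illustrative assumptions, not the paper's notation; see the paper for the exact formulation.

import torch
import torch.nn.functional as F

def joint_log_density(lm_logits, Y, B, constraint_score, emb, w=1.0):
    """Illustrative sketch (not the paper's exact formulation) of an
    unnormalized joint log-density over a response Y and a bias sequence B.

    lm_logits:        (n, |V|) next-token logits for each position of Y
    Y, B:             (n,) token ids of the response and bias sequences
    constraint_score: scalar tensor, e.g. log p(constraint | B, X)
    emb:              (|V|, d) token embedding table
    w:                coupling weight (assumed hyperparameter)
    """
    # Fluency: log-probability of the response under the frozen base LM.
    log_probs = F.log_softmax(lm_logits, dim=-1)              # (n, |V|)
    fluency = log_probs.gather(-1, Y.unsqueeze(-1)).sum()     # log p_LM(Y | X)
    # Coupling: penalize embedding distance between Y and B so the two
    # sequences stay close to each other.
    coupling = -w * (emb[Y] - emb[B]).pow(2).sum()
    # Constraint: how well the bias sequence satisfies the external constraint.
    return fluency + constraint_score + coupling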

Algorithm

To sample efficiently from this joint distribution, we use a Langevin-within-Gibbs sampling algorithm built on gradient-based discrete MCMC. A gradient-based discrete sampling step first produces a bias sequence that satisfies the external constraint but need not be fluent. Given this constraint-satisfying sequence, we then produce a fluent sequence by auto-regressively generating a response. This alternation between sampling the bias sequence and sampling the response can be viewed as Gibbs sampling: we sample the bias sequence conditioned on the response, and then the response sequence conditioned on the bias.

Algorithm 1: Discrete Auto-regressive Biasing
Require: Constraint function f, PLM, prompt X, number of steps s, sequence length n, embedding table M
1: B̃ ← 0⃗, fmin ← ∞, Ybest ← {} // Initialize constraint violation as maximal and best generation as empty
2: for step in 1 to s do
3: for position i in range(n) do
4: ỹi ← log PLM(· | y<i, X) // Initial auto-regressive distribution over V
5: Calculate normalizing factor ri if s > 1, else ri ← 1
6: yi ← argmaxj∈V (ỹi,j - wi · ri · b̃i,j) // Sample from P(Y | X, B)
7: end for
8: B ← Y // Initialize B as Y
9: Evaluate f(B | X), update fmin, Ybest
10: B' ~ qτ(· | B) as in equation (9) // Approximately sample from P(B | X, Y)
11: Compute B̃ as in equation (10)
12: end for
13: return Ybest
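As a rough illustration of lines 3-7 of Algorithm 1, the sketch below folds a precomputed per-position bias penalty into greedy auto-regressive decoding. It assumes a HuggingFace-style model whose forward pass returns .logits, and treats the bias matrix, per-position weights w, and normalizers r as given; all interfaces here are illustrative assumptions rather than the released implementation.

import torch

@torch.no_grad()
def biased_decode(model, prompt_ids, bias, w, r, n):
    """Sketch of Alg. 1, lines 3-7: greedy auto-regressive decoding where each
    position's log-probabilities are offset by a bias penalty.

    prompt_ids: (m,) token ids of the prompt X
    bias:       (n, |V|) penalty matrix b̃ (assumed precomputed, cf. line 11)
    w, r:       (n,) per-position weights and normalizing factors
    """
    ids = prompt_ids.clone()                             # prompt + tokens generated so far
    generated = []
    for i in range(n):
        logits = model(ids.unsqueeze(0)).logits[0, -1]   # next-token logits over V
        log_probs = torch.log_softmax(logits, dim=-1)    # ỹ_i
        scores = log_probs - w[i] * r[i] * bias[i]       # apply the bias penalty (line 6)
        y_i = scores.argmax()                            # pick the highest-scoring token
        generated.append(y_i.item())
        ids = torch.cat([ids, y_i.view(1)])
    return torch.tensor(generated)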
Figure 2: DAB Algorithm

A more detailed diagram of a single DAB iteration. Given the previous bias sequence and the LLM's response sequence, we first compute a proposal distribution that increases constraint satisfaction, leveraging gradient information from the constraint function. We then map the proposed tokens to a penalty vector, using the embedding table to compute a distance penalty, and incorporate these bias vectors into auto-regressive generation, effectively steering the LM toward constraint-satisfying generations while preserving the fluency of the original LM distribution.
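To make lines 10-11 of Algorithm 1 more concrete, the sketch below shows a generic gradient-based discrete proposal for the bias sequence (a first-order Taylor approximation of the constraint function scores candidate token swaps at every position), followed by the embedding-distance mapping from bias tokens to a penalty vector. The exact proposal qτ and penalty are given by equations (9) and (10) in the paper; the code here is an illustrative approximation with assumed interfaces (in particular, constraint_fn is assumed to map soft token embeddings to a scalar score).

import torch

def propose_bias(B, emb, constraint_fn, tau=1.0):
    """Generic gradient-based discrete proposal (cf. Alg. 1, line 10).
    A first-order Taylor expansion of the constraint function around the
    current one-hot sequence scores every candidate token at every position;
    new tokens are then sampled from a softmax over these scores.  This is
    the standard gradient-based discrete MCMC recipe, not the paper's exact
    proposal q_tau."""
    n, V = B.shape[0], emb.shape[0]
    one_hot = torch.nn.functional.one_hot(B, V).float().requires_grad_(True)
    score = constraint_fn(one_hot @ emb)            # constraint on soft embeddings (scalar)
    (grad,) = torch.autograd.grad(score, one_hot)   # d score / d one-hot, shape (n, |V|)
    # Approximate change in score from swapping position i to token j.
    delta = grad - (grad * one_hot.detach()).sum(-1, keepdim=True)
    proposal = torch.distributions.Categorical(logits=delta / tau)
    return proposal.sample()                        # proposed bias sequence B'

def penalty_from_bias(B_new, emb):
    """Map proposed bias tokens to a per-position penalty over the vocabulary
    via embedding distance (cf. Alg. 1, line 11): vocabulary entries far from
    the bias token in embedding space receive larger penalties."""
    return torch.cdist(emb[B_new], emb).pow(2)      # (n, |V|) squared distances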

Empirical Results

We compared DAB with several strong baselines.

  • MuCoLA (Multi-Constraint for Language Models with Langevin Dynamics): Combines multiple constraints through a weighted average and performs non-auto-regressive sampling in the embedding space.
  • COLD (Constrained Decoding with Langevin Dynamics): Uses continuous Langevin Dynamics in the logit space.
  • BOLT (Biases over Logits): Uses continuous sampling to obtain bias vectors that are applied to the logits during auto-regressive generation.
  • LM-Steer (Language Model Steering): Learns a steering matrix that is then applied to each generated embedding during auto-regressive decoding.
We evaluate on three tasks: Sentiment Control, Toxicity Reduction, and Keyword Inclusion.

In the Sentiment Control experiment, we evaluated each method's ability to steer language model generations toward positive or negative sentiment. Models were prompted with neutral text and instructed to continue with either positive or negative sentiment. Performance was measured using both our internal sentiment classifier and an external benchmark classifier to ensure robust evaluation. We also measured fluency metrics to ensure controlled generation maintained language quality.

Control Metrics: internal and external sentiment classifier scores.
Fluency Metrics: perplexity (lower is better) and CoLA acceptability.

Results Analysis

As shown in the chart, DAB strikes a better balance between control and fluency than the other methods: it achieves the best control performance on both the internal and external classifier scores while maintaining competitive fluency, trailing only BOLT on CoLA acceptability.

Qualitative Examples

Method | Prompt | Positive Sentiment Control
DAB (Ours) | The horse | The horse is also a very good and reliable companion. It has been used to carry the family's two-
MuCoLA | The horse | The horse is not only a beautiful and well-crafted piece of art, but it is also a great way
COLD | The horse | The horse head was still in the water, but the horse still had a good head. The horse
BOLT | The horse | The horseback riding course is a great way to get acquainted with the trails and the terrain. The course is
LM-Steer | The horse | The horseman delivers a stunningly beautiful, wonderfully lyrical, beautifully tender, powerfully moving, beautifully lyrical

Conclusion

We introduced Discrete Auto-regressive Biasing (DAB), a novel approach to controlled text generation that operates directly in discrete token space. Our method defines a joint distribution over the generated sequence and an auxiliary bias sequence, effectively addressing the limitations of continuous space methods.

Through extensive experiments on sentiment control, toxicity reduction, and keyword-guided generation, we demonstrated that DAB consistently outperforms existing methods. Our approach achieves superior constraint satisfaction while maintaining or improving text fluency, all with reduced computational requirements compared to baseline methods.

Future Work

Future research directions include extending DAB to handle multiple simultaneous constraints, developing more efficient sampling strategies for the bias sequence, and exploring applications in personalized content generation and domain-specific text adaptation.

BibTeX

@article{pynadath2025controlled,
  title={Controlled LLM Decoding via Discrete Auto-regressive Biasing},
  author={Pynadath, Patrick and Zhang, Ruqi},
  journal={ICLR},
  year={2025}
}