Prashant Mehta (University of Illinois, Urbana-Champaign)
Date
Friday, March 13, 2026, 2:30 pm - 3:20 pm
Location
Jeffery Hall, Room 234
Department Colloquium
Title: What can we learn from signals and systems in a transformer? Insights for probabilistic modeling and inference architecture
Abstract:
"Transformer" is the name of the core algorithm inside a large language model (LLM). In the so-called decoder-only transformer, a finite sequence of symbols (tokens) is mapped to the conditional probability distribution of the next token.
In this talk, I situate the transformer within the broader history of prediction theory: in the early 1940s, Wiener introduced a linear predictor, in which the conditional expectation of future data is computed by linearly combining the past data. I argue that a decoder-only transformer generalizes this idea and that a transformer is best understood as a causal nonlinear predictor. The technical results for causal nonlinear prediction are presented for the special case in which the data are discrete-valued and generated by an underlying hidden Markov model (HMM).
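The contrast drawn above can be illustrated with a minimal sketch (my own illustration, not code from the talk): a Wiener-style linear predictor fits coefficients by least squares so that the next sample is a linear combination of past samples, whereas a decoder-only transformer maps the past token sequence to a softmax probability distribution over the next discrete token. The AR(2) data and the `softmax` helper are hypothetical stand-ins for illustration.

```python
import numpy as np

# Wiener-style linear predictor (illustrative sketch):
# estimate x[t] as a linear combination of the previous p samples,
# with coefficients fit by least squares on observed data.
def fit_linear_predictor(x, p):
    """Fit a so that x[t] ~ sum_k a[k] * x[t-1-k] for k = 0..p-1."""
    X = np.column_stack([x[p - 1 - k:len(x) - 1 - k] for k in range(p)])
    y = x[p:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

# A decoder-only transformer plays the analogous role for discrete tokens:
# it maps the past sequence to a probability distribution over the next
# token; here only the final softmax step is mimicked over raw scores.
def softmax(scores):
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

# Example: a noiseless AR(2) sequence is recovered exactly with p = 2 lags.
rng = np.random.default_rng(0)
x = np.zeros(50)
x[:2] = rng.standard_normal(2)
for t in range(2, 50):
    x[t] = 0.6 * x[t - 1] + 0.3 * x[t - 2]

a = fit_linear_predictor(x, 2)
print(np.round(a, 3))  # coefficients close to [0.6, 0.3]
```

The point of the analogy: both predictors are causal (only past data enters the prediction), but the transformer replaces the fixed linear combination with a learned nonlinear map whose output is a full conditional distribution rather than a single expected value.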
The aim of this ongoing research is to bridge classical nonlinear filtering theory with modern inference architectures inspired by transformers. This is joint work with Heng-Sheng Chang and Jin Won Kim, and the talk is based on the paper: