Foundation Distillation

Knowledge distillation, Wasserstein transfer, and efficient student models from large teachers.

The goal is to compress the knowledge of large foundation models into smaller, faster students without sacrificing performance on downstream tasks.

Overview

This direction studies how to distill large language, vision-language, and generative models into efficient students. Research spans Wasserstein knowledge distillation, hierarchical relational distillation, and data-free black-box transfer.

Key objectives

  • Develop principled Wasserstein and relational distillation methods
  • Enable data-free and black-box knowledge transfer
  • Distill vision-language and multimodal embedding models
  • Preserve teacher capability in compact student architectures

Key topics

  • Knowledge distillation for LLMs and VLMs
  • Wasserstein and optimal-transport distillation
  • Hierarchical relational distillation
  • Data-free and black-box distillation

Papers in this direction

  • 2026

    Diverse Image Priors for Black-box Data-free Knowledge Distillation

    Vo, TN, Nguyen, D, Le, T, Do, K, Gupta, S

    International Conference on Computer Communication and the Internet (ICCCI)

  • 2026

    MCW-KD: Multi-Cost Wasserstein Knowledge Distillation for Large Language Models

    Vuong, HT, Le, T, Tran, Q, Van, LN, Le, T

    Proceedings of the AAAI Conference on Artificial Intelligence

  • 2026

    HieRD: Hierarchical Relational Distillation for Vision-Language Embedding Models

    Le, V, Dang, N Hong, Vu, T, Van, LN, Nguyen, DA, Le, T

    International Conference on Machine Learning (ICML)