Abstract
Fine-grained visual categorization (FGVC) is challenging because different species look visually similar. Previous studies implicitly assume that the training and test data share the same underlying distribution, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data. However, we show empirically that these assumptions do not always hold on benchmark datasets. To address this, we combine the merits of invariant risk minimization (IRM) and the information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC, so that the overall model discovers the most succinct and consistent fine-grained features. We apply the matrix-based Rényi's α-order entropy to simplify and stabilize the training of IB; we also design a “soft” environment partition scheme to make IRM applicable to the FGVC task. To the best of our knowledge, we are the first to address FGVC from a generalization perspective and to develop a new information-theoretic solution accordingly. Extensive experiments demonstrate the consistent performance gain offered by our IMS. Code is available at: https://github.com/SYe-hub/IMS.
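As a rough illustration of the matrix-based Rényi's α-order entropy mentioned in the abstract, the sketch below computes it from the eigenvalues of a trace-normalized kernel Gram matrix, S_α(A) = log2(Σ_i λ_i(A)^α) / (1 − α), and derives a mutual-information estimate from it. This is not the authors' implementation (see the repository linked above for that). The Gaussian kernel, the bandwidth `sigma`, the default `alpha=2.0`, and all function names are assumptions made for illustration only.

```python
import numpy as np


def gram_matrix(x, sigma=1.0):
    """Gaussian-kernel Gram matrix of the rows of x, normalized to unit trace.

    The kernel choice and sigma are illustrative assumptions.
    """
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    k = np.exp(-sq_dists / (2.0 * sigma ** 2))
    return k / np.trace(k)


def renyi_entropy(a, alpha=2.0):
    """Matrix-based Renyi alpha-order entropy (in bits) of a unit-trace PSD matrix."""
    eigvals = np.linalg.eigvalsh(a)
    eigvals = np.clip(eigvals, 0.0, None)  # remove negative round-off
    return np.log2(np.sum(eigvals ** alpha)) / (1.0 - alpha)


def joint_entropy(a, b, alpha=2.0):
    """Joint entropy from the trace-normalized Hadamard product of two Gram matrices."""
    h = a * b
    return renyi_entropy(h / np.trace(h), alpha)


def mutual_information(a, b, alpha=2.0):
    """I(A; B) = S(A) + S(B) - S(A, B), all estimated from Gram matrices."""
    return renyi_entropy(a, alpha) + renyi_entropy(b, alpha) - joint_entropy(a, b, alpha)
```

Because the estimate depends only on eigenvalues of Gram matrices built from a mini-batch, it needs no density estimation or variational bound, which is one plausible reading of why the abstract describes it as simplifying IB training.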
Original language | English |
---|---|
Article number | 103837 |
Pages (from-to) | 1-11 |
Number of pages | 11 |
Journal | Computer Vision and Image Understanding |
Volume | 237 |
Early online date | 26 Sept 2023 |
DOIs | |
Publication status | Published - Dec 2023 |
Bibliographical note
Funding Information:
This work was supported in part by the National Key R&D Program of China 2022YFC3301000, in part by the Fundamental Research Funds for the Central Universities, HUST: 2023JYCXJJ031.
Publisher Copyright:
© 2023 Elsevier Inc.
Funding
This work was supported in part by the National Key R&D Program of China 2022YFC3301000, in part by the Fundamental Research Funds for the Central Universities, HUST: 2023JYCXJJ031.
Funders | Funder number |
---|---|
National Key Research and Development Program of China | 2022YFC3301000 |
Huazhong University of Science and Technology | 2023JYCXJJ031 |
Fundamental Research Funds for the Central Universities | |
Keywords
- Fine-grained visual categorization
- Information bottleneck
- Invariant risk minimization