Multimedia Information Processing Lab, Institute of Computer Science and Technology, Peking University
Source code

  • MAI:  https://github.com/PKU-ICST-MIPL/MAI_ICLR2025
    MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval(ICLR 2025).

  • Finedefics:  https://github.com/PKU-ICST-MIPL/Finedefics_ICLR2025
    Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models(ICLR 2025).

  • FineSports:  https://github.com/PKU-ICST-MIPL/FineSports_CVPR2024
    FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding(CVPR 2024).

  • SIA-OVD:  https://github.com/PKU-ICST-MIPL/SIA-OVD_ACMMM2024
    SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection(ACM MM 2024).

  • FineFMPL:  https://github.com/PKU-ICST-MIPL/FineFMPL_IJCAI2024
    FineFMPL: Fine-grained Feature Mining Prompt Learning for Few-Shot Class Incremental Learning(IJCAI 2024).

  • Firzen:  https://github.com/PKU-ICST-MIPL/Firzen_ICDE2024
    Firzen: Firing Strict Cold-Start Items with Frozen Heterogeneous and Homogeneous Graphs for Recommendation(ICDE 2024).

  • FinePOSE:  https://github.com/PKU-ICST-MIPL/FinePOSE_CVPR2024
    FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models(CVPR 2024).

  • C2R:  https://github.com/PKU-ICST-MIPL/C2R_CVPR2024
    Learning Continual Compatible Representation for Re-indexing Free Lifelong Person Re-identification(CVPR 2024).

  • DMA:  https://github.com/PKU-ICST-MIPL/DMA_TIFS2023
    DMA: Dual Modality-Aware Alignment for Visible-Infrared Person Re-Identification(TIFS 2024).

  • Real20M:  https://github.com/PKU-ICST-MIPL/Real20M_ACMMM2023
    Real20M: A Large-scale E-commerce Dataset for Cross-domain Retrieval(ACM MM 2023).

  • HCL:  https://github.com/PKU-ICST-MIPL/HCL_TMM2023
    HCL: Hierarchical Consistency Learning for Webly Supervised Fine-Grained Recognition(TMM 2023).

  • LFR-GAN:  https://github.com/PKU-ICST-MIPL/LFR-GAN_TOMM2023
    LFR-GAN: Local Feature Refinement based Generative Adversarial Network for Text-to-Image Generation(TOMM 2023).

  • PosterLayout:  https://github.com/PKU-ICST-MIPL/PosterLayout-CVPR2023
    PosterLayout: A New Benchmark and Approach for Content-Aware Visual-Textual Presentation Layout(CVPR 2023).

  • DCR-ReID:  https://github.com/PKU-ICST-MIPL/DCR-ReID_TCSVT2023
    DCR-ReID: Deep Component Reconstruction for Cloth-Changing Person Re-Identification(TCSVT 2023).

  • MKVSE:  https://github.com/PKU-ICST-MIPL/MKVSE-TOMM2023
    MKVSE: Multimodal Knowledge Enhanced Visual-Semantic Embedding for Image-Text Retrieval(TOMM 2023).

  • SIM-Trans:  https://github.com/PKU-ICST-MIPL/SIM-TRANS_ACMMM2022
    SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization(ACM MM 2022).

  • MARS:  https://github.com/PKU-ICST-MIPL/MARS_TCSVT2021
    MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieval(TCSVT 2021).

  • UVCL:  https://github.com/PKU-ICST-MIPL/UVCL_TCYB2020
    Unsupervised Visual-textual Correlation Learning with Fine-grained Semantic Alignment(TCYB 2020).

  • WSDL:  https://github.com/PKU-ICST-MIPL/WSDL_TCSVT2019
    Fast Fine-grained Image Classification via Weakly Supervised Discriminative Localization(TCSVT 2019).

  • DASG:  https://github.com/PKU-ICST-MIPL/DASG_TCSVT2019
    Unsupervised Cross-media Retrieval Using Domain Adaptation with Scene Graph(TCSVT 2019).

  • DRLIH:  https://github.com/PKU-ICST-MIPL/DRLIH_TMM2020
    Deep Reinforcement Learning for Image Hashing(TMM 2020).

  • MCSCH:  https://github.com/PKU-ICST-MIPL/MCSCH_TOMM2019
    Sequential Cross-Modal Hashing Learning via Multi-scale Correlation Mining(TOMM 2019).

  • RCBT:  https://github.com/PKU-ICST-MIPL/RCBT_TCSVT2020
    Reinforced Cross-Media Correlation Learning by Context-Aware Bidirectional Translation(TCSVT 2020).

  • OSTG:  https://github.com/PKU-ICST-MIPL/OSTG_TIP2020
    Video Captioning with Object-Aware Spatio-Temporal Correlation and Aggregation(TIP 2020).

  • OA-BTG:  https://github.com/PKU-ICST-MIPL/OABTG_CVPR2019
    Object-aware Aggregation with Bidirectional Temporal Graph for Video Captioning(CVPR 2019).

  • AGHA:  https://github.com/PKU-ICST-MIPL/AGHA_MMM2019
    Hierarchical Vision-Language Alignment for Video Captioning(MMM 2019).

  • VHSM:  https://github.com/PKU-ICST-MIPL/VHSM_TCYB2020
    Visual-textual Hybrid Sequence Matching for Joint Reasoning(TCYB 2020).

  • MAVA:  https://github.com/PKU-ICST-MIPL/MAVA_TIP2020
    MAVA: Multi-level Adaptive Visual-textual Alignment by Cross-media Bi-attention Mechanism(TIP 2020).

  • HIL:  https://github.com/PKU-ICST-MIPL/HIL_TOMM2020
    HIL: Recognizing Cross-media Entailment with Heterogeneous Interactive Learning(TOMM 2020).

  • CDCR:  https://github.com/PKU-ICST-MIPL/CDCR_TCSVT2019
    CDCR: Quintuple-media Joint Correlation Learning with Deep Compression and Regularization(TCSVT 2019).

  • DFCL:  https://github.com/PKU-ICST-MIPL/DFCL_JOS2019
    DFCL: Cross-media Deep Fine-grained Correlation Learning(Journal of Software 2019).

  • Bridge-GAN:  https://github.com/PKU-ICST-MIPL/Bridge-GAN_TCSVT2019
    Bridge-GAN: Interpretable Representation Learning for Text-to-image Synthesis(TCSVT 2019).

  • CKD:  https://github.com/PKU-ICST-MIPL/CKD_TMM2019
    CKD: Cross-task Knowledge Distillation for Text-to-image Synthesis(TMM 2019).

  • CKRM:  https://github.com/PKU-ICST-MIPL/CKRM_TCSVT2020
    CKRM: Multi-level Knowledge Injecting for Visual Commonsense Reasoning(TCSVT 2020).

  • FGCrossNet:  https://github.com/PKU-ICST-MIPL/FGCrossNet_ACMMM2019
    FGCrossNet: A New Benchmark and Approach for Fine-grained Cross-media Retrieval(ACM MM 2019).

  • DADN:  https://github.com/PKU-ICST-MIPL/DADN_TCSVT2019
    DADN: Zero-shot Cross-media Embedding Learning with Dual Adversarial Distribution Network(TCSVT 2019).

  • MGAH:  https://github.com/PKU-ICST-MIPL/MGAH_TMM2019
    MGAH: Multi-pathway Generative Adversarial Hashing for Unsupervised Cross-modal Retrieval(TMM 2019).

  • TPCKT:  https://github.com/PKU-ICST-MIPL/TPCKT_TMM2019
    TPCKT: Two-level Progressive Cross-media Knowledge Transfer(TMM 2019).

  • CM-GANs:  https://github.com/PKU-ICST-MIPL/CM-GANS_TOMM2019
    CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning(TOMM 2019).

  • SSDH:  https://github.com/PKU-ICST-MIPL/SSDH_TCSVT2019
    SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval(TCSVT 2019).

  • DCKT:  https://github.com/PKU-ICST-MIPL/DCKT_CVPR2018
    Deep Cross-media Knowledge Transfer(CVPR 2018).

  • MHTN:  https://github.com/PKU-ICST-MIPL/MHTN_TCYB2018
    MHTN: Modal-adversarial Hybrid Transfer Network for Cross-modal Retrieval(TCYB 2018).

  • SCH-GAN:  https://github.com/PKU-ICST-MIPL/SCHGAN_TCYB2018
    SCH-GAN: Semi-supervised Cross-modal Hashing by Generative Adversarial Network(TCYB 2018).

  • MCSM:  https://github.com/PKU-ICST-MIPL/MCSM_TIP2018
    Modality-specific Cross-modal Similarity Measurement with Recurrent Attention Network(TIP 2018).

  • TCLSTA:  https://github.com/PKU-ICST-MIPL/TCLSTA_TCSVT2018
    Two-stream Collaborative Learning with Spatial-Temporal Attention for Video Classification(TCSVT 2018).

  • OPAM:  https://github.com/PKU-ICST-MIPL/OPAM_TIP2018
    Object-Part Attention Model for Fine-grained Image Classification(TIP 2018).

  • CCL:  https://github.com/PKU-ICST-MIPL/CCL_TMM2018
    CCL: Cross-modal Correlation Learning with Multi-grained Fusion by Hierarchical Network(TMM 2018).

  • QaDWH:  https://github.com/PKU-ICST-MIPL/QaDWH_TMM2018
    Query-adaptive Image Retrieval by Deep-Weighted Hashing(TMM 2018).

  • UGACH:  https://github.com/PKU-ICST-MIPL/UGACH_AAAI2018
    Unsupervised Generative Adversarial Cross-modal Hashing(AAAI 2018).

  • Saliency-guided-Faster-R-CNN:  https://github.com/PKU-ICST-MIPL/Saliency-guided-Faster-R-CNN_ACMMM2017
    Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN(ACM MM 2017).

  • CHTN:  https://github.com/PKU-ICST-MIPL/CHTN_IJCAI2017
    Cross-modal Common Representation Learning by Hybrid Transfer Network(IJCAI 2017).

  • DPEP:  https://github.com/PKU-ICST-MIPL/DPEP
    Cross-media Retrieval by Exploiting Fine-Grained Correlation at Entity Level(Neurocomputing 2017).

  • CMDN:  https://github.com/PKU-ICST-MIPL/CMDN_IJCAI2016
    Cross-media Shared Representation by Hierarchical Learning with Multiple Deep Networks(IJCAI 2016).

  • S2UPG:  https://github.com/PKU-ICST-MIPL/S2UPG_TCSVT2016
    Semi-Supervised Cross-Media Feature Learning with Unified Patch Graph Regularization(TCSVT 2016).

  • JRL:  https://github.com/PKU-ICST-MIPL/JRL_TCSVT2014
    Learning Cross-Media Joint Representation with Sparse and Semisupervised Regularization(TCSVT 2014).

  • JGRHML:  https://github.com/PKU-ICST-MIPL/JGRHML_AAAI2013
    Heterogeneous Metric Learning with Joint Graph Regularization for Cross-Media Retrieval(AAAI 2013).

  • CMCP:  https://github.com/PKU-ICST-MIPL/CMCP_ICASSP2012
    Cross-Modality Correlation Propagation for Cross-Media Retrieval(ICASSP 2012).

  • HSNN:  https://github.com/PKU-ICST-MIPL/HSNN_MMM2012
    Effective Heterogeneous Similarity Measure with Nearest Neighbors for Cross-Media Retrieval(MMM 2012).

Multimedia Information Processing Lab, Wangxuan Institute of Computer Technology, Peking University