Overview
EgoLife is an egocentric AI project that captures the multimodal daily activities of individuals over a full week. Using Meta Aria glasses, synchronized third-person cameras, and mmWave sensors, it provides a rich dataset for long-term video understanding.
Key Features
- EgoGPT: An omni-modal vision-language model that performs continuous video captioning, extracting key events, actions, and context from first-person video and audio streams.
- EgoRAG: A retrieval-augmented generation module that enables long-term reasoning and memory reconstruction, retrieving relevant past events and synthesizing contextualized answers to user queries (see the sketch after this list).
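
The two components can be read as a captioning-then-retrieval pipeline: the captioner continuously writes timestamped descriptions into a memory store, and the retrieval module pulls the relevant entries back out at query time to ground its answer. The minimal Python sketch below illustrates that flow only; names such as `MemoryBank` and `caption_clip` are hypothetical placeholders rather than EgoLife APIs, and the keyword-overlap scoring stands in for a real embedding-based retriever and language model.

```python
# Illustrative sketch only: `caption_clip` and `MemoryBank` are hypothetical
# and not part of the EgoLife codebase. The flow mirrors the description above:
# continuous captioning feeds a memory store that is queried later.
from dataclasses import dataclass


@dataclass
class CaptionEntry:
    timestamp: float  # seconds since the start of recording
    caption: str      # text description of the clip


class MemoryBank:
    """Stores timestamped captions and retrieves the most relevant ones."""

    def __init__(self) -> None:
        self.entries: list[CaptionEntry] = []

    def add(self, entry: CaptionEntry) -> None:
        self.entries.append(entry)

    def retrieve(self, query: str, k: int = 3) -> list[CaptionEntry]:
        # Toy relevance score: number of shared lowercase words.
        # A real system would use dense embeddings instead.
        q_words = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(q_words & set(e.caption.lower().split())),
            reverse=True,
        )
        return scored[:k]


def caption_clip(timestamp: float, clip_id: str) -> CaptionEntry:
    # Placeholder for an omni-modal captioner over video and audio streams.
    return CaptionEntry(timestamp, f"[caption for {clip_id}]")


if __name__ == "__main__":
    memory = MemoryBank()
    # Continuous captioning loop (here: three fake 30-second clips).
    for i in range(3):
        memory.add(caption_clip(timestamp=30.0 * i, clip_id=f"clip_{i}"))

    # Query time: retrieve relevant past events; a real system would pass
    # the retrieved context to a language model to synthesize the answer.
    hits = memory.retrieve("what did I do this morning?")
    context = "\n".join(f"[{h.timestamp:.0f}s] {h.caption}" for h in hits)
    print("Retrieved context:\n" + context)
```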
Benefits
This project advances real-world egocentric AI applications such as memory support, habit tracking, event recall, and task management, making it a valuable resource for research in AI and daily life assistance.
Highlights
- Recent releases include a HuggingFace demo, project videos, and a new dataset, EgoIT-99K.
- Ongoing academic engagement, including papers and conference contributions such as CVPR 2025.