List of papers written by the Focoos AI research team!

FocoosAI/papers

Papers List

This repository lists papers authored by Focoos AI.

2024

Title | Venue | Code
📜 PEM: Prototype-based Efficient MaskFormer for Image Segmentation
Niccolò Cavagnero, Gabriele Rosi, Claudia Cuttano, Francesca Pistilli, Marco Ciccone, Giuseppe Averta, Fabio Cermelli

Prototype-based Efficient MaskFormer (PEM) is a transformer-based architecture for image segmentation that improves efficiency without sacrificing performance. It uses prototype-based cross-attention and a multi-scale feature pyramid network to reduce computation. PEM outperforms task-specific models while being more computationally efficient.
Venue: CVPR 2024 | Code: 🌐 Project Page, GitHub

📜 The Revenge of BiSeNet: Efficient Multi-Task Image Segmentation
Gabriele Rosi, Claudia Cuttano, Niccolò Cavagnero, Giuseppe Averta, Fabio Cermelli

BiSeNetFormer is a multi-task image segmentation architecture designed for efficiency and accuracy, supporting semantic and panoptic segmentation. It combines two-stream architectures with a transformer-based segmentation head, achieving high inference speeds and competitive accuracy on datasets like Cityscapes and ADE20K.
Venue: CVPR 2024 (Workshop) | Code: -

📜 What does CLIP know about peeling a banana?
Claudia Cuttano, Gabriele Rosi, Gabriele Trivigno, Giuseppe Averta

AffordanceCLIP leverages pre-trained Vision-Language models like CLIP to improve affordance segmentation for robots, bypassing the need for costly annotations or predefined actions. It achieves competitive zero-shot performance, works with any action prompt, and requires minimal additional training, enabling scalable, flexible models.
Venue: CVPR 2024 (Workshop) | Code: -

📜 SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation
Claudia Cuttano, Gabriele Trivigno, Gabriele Rosi, Carlo Masone, Giuseppe Averta

SAMWISE is a Referring Video Object Segmentation (RVOS) method that overcomes limitations of previous models by enabling streaming processing while retaining context. Built on the Segment-Anything 2 (SAM2) model, it integrates natural language understanding and temporal modeling, achieving state-of-the-art performance with minimal overhead.
Venue: 📝 Under submission | Code: GitHub

Feel free to explore the papers and reach out for collaborations or inquiries!
