OWL-ViT is a zero-shot text-conditioned object detection model | Tensorflow(@CVision)

OWL-ViT is a zero-shot, text-conditioned object detection model: you query an image with free-text descriptions, and it localizes matching objects, including categories it was never explicitly trained to detect. It generalizes well and is on par with some state-of-the-art object detection models.
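
To make the text-query idea concrete, here is a minimal sketch using the Hugging Face transformers OWL-ViT classes. Assumptions that are not in the post: the checkpoint name google/owlvit-base-patch32, the "a photo of a ..." prompt template, the 0.1 score threshold, and the helper names build_queries, to_records and detect, which are purely illustrative. The post-processing method name has changed across transformers releases, so check the docs for your installed version.

```python
from typing import Dict, List, Sequence


def build_queries(labels: Sequence[str]) -> List[str]:
    """Turn plain labels into text prompts (the template is an assumption)."""
    return [f"a photo of a {label}" for label in labels]


def to_records(labels: Sequence[str], scores: Sequence[float],
               label_ids: Sequence[int],
               boxes: Sequence[Sequence[float]]) -> List[Dict]:
    """Pair each detection with the label whose query it matched."""
    return [
        {
            "label": labels[int(i)],
            "score": round(float(s), 3),
            "box": [round(float(v), 1) for v in b],  # [x_min, y_min, x_max, y_max]
        }
        for s, i, b in zip(scores, label_ids, boxes)
    ]


def detect(image, labels: Sequence[str], threshold: float = 0.1,
           checkpoint: str = "google/owlvit-base-patch32") -> List[Dict]:
    """Run OWL-ViT on one PIL image with a list of free-text labels.

    Heavy imports are kept local so the helpers above work without torch.
    """
    import torch
    from transformers import OwlViTForObjectDetection, OwlViTProcessor

    processor = OwlViTProcessor.from_pretrained(checkpoint)
    model = OwlViTForObjectDetection.from_pretrained(checkpoint)

    # One list of text queries per image in the batch.
    inputs = processor(text=[build_queries(labels)], images=image,
                       return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); post-processing expects (height, width).
    target_sizes = torch.tensor([image.size[::-1]])
    # Assumption: this method is available in the installed transformers version.
    result = processor.post_process_object_detection(
        outputs=outputs, target_sizes=target_sizes, threshold=threshold
    )[0]

    return to_records(labels, result["scores"].tolist(),
                      result["labels"].tolist(), result["boxes"].tolist())
```

Usage would look roughly like detect(Image.open("photo.jpg").convert("RGB"), ["cat", "remote control"]), with Image imported from PIL and the file name a placeholder. Each returned record carries the matched label, a confidence score, and a pixel-space bounding box. Lowering the threshold returns more, less certain boxes.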

Docs: https://huggingface.co/docs/transformers/main/en/model_doc/owlvit
Demo: https://huggingface.co/spaces/adirik/OWL-ViT
Tutorial: https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/zero