
Getting image embedding vectors from a ViT model with Transformers

2022-06-22 14:05:00 loong_ XL

Reference: https://huggingface.co/facebook/dino-vitb16

last_hidden_states holds 197 tokens. The 224×224 image is split into 16×16-pixel patches, which gives 14 × 14 = 196 patches; the transformer also prepends one CLS token, for 197 in total.
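The token count can be checked with a little arithmetic. This is a minimal sketch, not part of the original post: the helper name `num_vit_tokens` and its parameters are made up for illustration, and it assumes a square image whose side is a multiple of the patch size.

```python
def num_vit_tokens(image_size: int = 224, patch_size: int = 16, extra_tokens: int = 1) -> int:
    """Sequence length a ViT sees for a square image (hypothetical helper)."""
    # Number of non-overlapping patches along one side of the image
    patches_per_side = image_size // patch_size
    # All patches, plus the extra tokens added at the front (here: one CLS token)
    return patches_per_side ** 2 + extra_tokens


print(num_vit_tokens())  # 14 * 14 + 1 = 197
```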

from transformers import ViTFeatureExtractor, ViTModel
from PIL import Image
import requests

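# Download a sample image from the COCO validation set
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Load the preprocessor and the DINO ViT-B/16 backbone
# (newer transformers releases provide ViTImageProcessor as the replacement for ViTFeatureExtractor)
feature_extractor = ViTFeatureExtractor.from_pretrained('facebook/dino-vitb16')
model = ViTModel.from_pretrained('facebook/dino-vitb16')
# Resize and normalize the image, returning PyTorch tensors
inputs = feature_extractor(images=image, return_tensors="pt")
outputs = model(**inputs)
# Shape (1, 197, 768): batch, tokens (CLS + 196 patches), hidden size
last_hidden_states = outputs.last_hidden_state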

To get the CLS token embedding vector, take the first token of last_hidden_states:
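The original screenshot is not available, so the following is only a sketch of one way to do this. It assumes the CLS token sits at index 0 of the sequence dimension, as described above. The function name `cls_embedding` is hypothetical, and a random tensor stands in for `outputs.last_hidden_state` so the snippet runs without downloading the model.

```python
import torch


def cls_embedding(last_hidden_state: torch.Tensor) -> torch.Tensor:
    """Return the CLS vector for each image in the batch (hypothetical helper).

    Expects a tensor of shape (batch, tokens, hidden) with the CLS token first.
    """
    # Index 0 along the token dimension is the CLS token
    return last_hidden_state[:, 0, :]


# Stand-in for outputs.last_hidden_state from the ViT-B/16 model:
# 1 image, 197 tokens (CLS + 196 patches), 768-dimensional hidden states
dummy_hidden_states = torch.randn(1, 197, 768)

cls_vec = cls_embedding(dummy_hidden_states)
print(cls_vec.shape)  # torch.Size([1, 768])
```

With the real model, the same call would be `cls_embedding(outputs.last_hidden_state)`, which yields one 768-dimensional vector per image.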
