For example, in the above diagram, to return the feature map from the first stage of the Swin backbone, you can set `out_indices=(1,)`:

```py
from transformers import AutoImageProcessor, AutoBackbone
import torch
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("microsoft/swin-tiny-patch4-window7-224")
model = AutoBackbone.from_pretrained("microsoft/swin-tiny-patch4-window7-224", out_indices=(1,))

inputs = processor(image, return_tensors="pt")
outputs = model(**inputs)
feature_maps = outputs.feature_maps
```

Now you can access the `feature_maps` object from the first stage of the backbone:

```py
>>> list(feature_maps[0].shape)
[1, 96, 56, 56]
```

The shape corresponds to `(batch_size, num_channels, height, width)`.

## AutoFeatureExtractor

For audio tasks, a feature extractor processes the audio signal into the correct input format.
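As a minimal sketch of this step (assuming the `facebook/wav2vec2-base-960h` checkpoint and a dummy 16 kHz waveform; any audio checkpoint with a feature extractor config works the same way), loading and applying a feature extractor could look like this:

```py
from transformers import AutoFeatureExtractor
import numpy as np

# Example checkpoint (assumption for illustration); AutoFeatureExtractor picks the
# matching feature extractor class from the checkpoint's preprocessing config
feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

# Dummy one-second mono waveform at 16 kHz standing in for real audio
waveform = np.zeros(16000, dtype=np.float32)

# Convert the raw signal into model-ready tensors
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")
print(inputs.input_values.shape)  # torch.Size([1, 16000])
```

In practice you would load the waveform from an audio file (for example with a library such as `librosa` or the `datasets` Audio feature) instead of using a dummy array.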