Image-Text-to-Text
Transformers
Safetensors
PyTorch
mllama
facebook
meta
llama
llama-3
conversational
text-generation-inference
Instructions to use meta-llama/Llama-3.2-11B-Vision-Instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use meta-llama/Llama-3.2-11B-Vision-Instruct with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="meta-llama/Llama-3.2-11B-Vision-Instruct")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
model = AutoModelForImageTextToText.from_pretrained("meta-llama/Llama-3.2-11B-Vision-Instruct")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use meta-llama/Llama-3.2-11B-Vision-Instruct with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "meta-llama/Llama-3.2-11B-Vision-Instruct"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker
docker model run hf.co/meta-llama/Llama-3.2-11B-Vision-Instruct
- SGLang
How to use meta-llama/Llama-3.2-11B-Vision-Instruct with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "meta-llama/Llama-3.2-11B-Vision-Instruct" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "meta-llama/Llama-3.2-11B-Vision-Instruct" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "meta-llama/Llama-3.2-11B-Vision-Instruct",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
- Docker Model Runner
How to use meta-llama/Llama-3.2-11B-Vision-Instruct with Docker Model Runner:
docker model run hf.co/meta-llama/Llama-3.2-11B-Vision-Instruct
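The curl calls shown for vLLM and SGLang send the same JSON body to an OpenAI-compatible `/v1/chat/completions` endpoint, so a small Python client can serve both. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat` are hypothetical, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`), which is not confirmed by the snippets above.

```python
import json
import urllib.request

MODEL_ID = "meta-llama/Llama-3.2-11B-Vision-Instruct"


def build_chat_payload(prompt, image_url, model=MODEL_ID):
    """Build the same request body the curl examples send (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url, prompt, image_url, timeout=120):
    """POST to <base_url>/v1/chat/completions and return the reply text."""
    body = json.dumps(build_chat_payload(prompt, image_url)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    # Assumption: the server returns the standard OpenAI response layout.
    return reply["choices"][0]["message"]["content"]


# Example calls (require a running server, so they are left commented out):
# chat("http://localhost:8000", "Describe this image in one sentence.", image_url)   # vLLM port from above
# chat("http://localhost:30000", "Describe this image in one sentence.", image_url)  # SGLang port from above
```

Only the base URL differs between the two servers: port 8000 in the vLLM example and port 30000 in the SGLang examples.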
Performance degrades with Transformers library update
#130 opened 25 days ago by bharanibala
Pending access request for Llama 3.2 Vision on Hugging Face
#129 opened 2 months ago by mlewannabe
🚩 Report: Copyright infringement
#128 opened 2 months ago by azabdev
Access rejected
#127 opened 3 months ago by SEUNGHUN12
Access rejected for vision models
#126 opened 4 months ago by nohavond
Access rejected
#125 opened 6 months ago by RemiNollet
Need to reset access request for Llama 3.2 Vision (wrong date of birth caused rejection)
#124 opened 8 months ago by words122
How to create a GGUF file for LLaMA 3.2 Vision model (MllamaForConditionalGeneration)?
#123 opened 9 months ago by Sravanthi2018
Request: DOI
#122 opened 9 months ago by HarshDoshi21
Sending same image concurrently yields different results
#121 opened 9 months ago by asaf-levy015
Unable to access model meta-llama/Llama-3.2-11B-Vision-Instruct. Please visit https://api.together.ai/models to view the list of supported models.
3
#120 opened 10 months ago by Kakashi-hatake
Access rejected
#119 opened 10 months ago by sofluff
Access to Llama 3.2 Vision models.
#118 opened 10 months ago by eshanc
Your request to access this repo has been rejected by the repo's authors.
#117 opened 11 months ago by AC4af
Unable to Access or Cancel Access Request for Meta’s Gated Repositories
#115 opened 12 months ago by dreamgonfly
Reporting model behaviour
#114 opened about 1 year ago by hqvo
local-llama
#113 opened about 1 year ago by ashwin15
llama-3.2-11B-vision_naveen
#112 opened about 1 year ago by ashwin15
Fine tuning of meta-llama/Llama-3.2-11B-Vision-Instruct
#111 opened about 1 year ago by Pragathii
bad performance
#110 opened about 1 year ago by sipie800
Request rejected
5
#109 opened about 1 year ago by lihcxr
Request to Access Rejected
1
#108 opened about 1 year ago by deleted
Training dataset for llama vision instruct models
#107 opened about 1 year ago by polyn
HELP!!!! Repository access request rejected!!!
#106 opened about 1 year ago by farhan9801
FP8 quantization
#105 opened about 1 year ago by uyiosa
jupy
#104 opened about 1 year ago by MetaMateo82
meta-llama/Llama-3.2-11B-Vision is not the path to a directory containing a file named model-00003-of-00005.safetensors
#103 opened over 1 year ago by CKK0331
My access request was rejected without any explanation
3
#102 opened over 1 year ago by tarundave
Llama 3.2 vision-instruct access denied?
➕🤗 4
#101 opened over 1 year ago by massmass
The Default Snippet Error
#99 opened over 1 year ago by Infip
llama
#98 opened over 1 year ago by manthanpatel49
Access to llama-3.2-11B model request got reject
4
#95 opened over 1 year ago by TC14
The MMStar's test outcomes are falling short: 0.47
#93 opened over 1 year ago by DUNDUN2
Support for Images files present locally
3
#92 opened over 1 year ago by bigopot420
Can I add multiple images within a single prompt?
1
#91 opened over 1 year ago by Roner1
UnboundLocalError: cannot access local variable 'rows' where it is not associated with a value
#90 opened over 1 year ago by JoAmps42i
Can I apply 4-Bit Quantization on Llama-3.2-11B-Vision-Instruct using TGI?
#89 opened over 1 year ago by jaimin-at-work
The number of image token (0) should be the same as in the number of provided images (1)
2
#88 opened over 1 year ago by JR001
Your request to access this repo has been rejected by the repo's authors.
2
#87 opened over 1 year ago by mmkamani7
Using llama 3.2 Vision model deployed on VertexAI modelgarden for an text+image input.
➕ 3
#86 opened over 1 year ago by PrajwalM
How to use llama-3.2-11b-vision in vllm?
5
#85 opened over 1 year ago by WaltonFuture
Access to model meta-llama/Llama-3.2-11B-Vision-Instruct is restricted and you are not in the authorized list
#82 opened over 1 year ago by bearcool
France => Access request rejected
4
#81 opened over 1 year ago by FalconNet
Is LM Studio support the ‘mllama’ architecture?
6
#80 opened over 1 year ago by wilsonchou1996
Run Llama on videos
➕ 2
#79 opened over 1 year ago by LihiG
You need to agree to share your contact information to access this model
#78 opened over 1 year ago by mikerquan
Special tokens for Visual Grounding?
#77 opened over 1 year ago by echo-yi
Llama3.2-11B-Vision-Instruct does not remember the context when using with agents and sessions
#75 opened over 1 year ago by Irfanraza
Create llama 3.2 11b
#74 opened over 1 year ago by WongLen
SyntaxError: ':' expected after dictionary key
#73 opened over 1 year ago by jrp2024