Instructions to use Phind/Phind-CodeLlama-34B-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Phind/Phind-CodeLlama-34B-v2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Phind/Phind-CodeLlama-34B-v2")

# Load model directly (Phind-CodeLlama is a text-only causal language model)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Phind/Phind-CodeLlama-34B-v2")
model = AutoModelForCausalLM.from_pretrained("Phind/Phind-CodeLlama-34B-v2")

Local Apps

vLLM

How to use Phind/Phind-CodeLlama-34B-v2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Phind/Phind-CodeLlama-34B-v2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Phind/Phind-CodeLlama-34B-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/Phind/Phind-CodeLlama-34B-v2

SGLang

How to use Phind/Phind-CodeLlama-34B-v2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Phind/Phind-CodeLlama-34B-v2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Phind/Phind-CodeLlama-34B-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Phind/Phind-CodeLlama-34B-v2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Phind/Phind-CodeLlama-34B-v2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Phind/Phind-CodeLlama-34B-v2 with Docker Model Runner:
```
docker model run hf.co/Phind/Phind-CodeLlama-34B-v2
```

Updated prompt template

#31

by karacayir - opened Dec 4, 2023

base: refs/heads/main ← from: refs/pr/31

Updated prompt template e5dc71e1

karacayir

Dec 4, 2023

When I use this model, its generations begin with the text 'Response'. This suggests the model was trained with a ### Assistant Response tag marking the start of the completion. I have applied the proposed change to the prompt template, and it produces the intended results.
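For anyone still on the old template, the symptom described above can be worked around after generation. The following is only a stopgap sketch, not part of this PR: `strip_leading_response` is a hypothetical helper name, and the exact form of the stray label (with or without a colon) is an assumption.

```python
import re

# Assumption: the stray label is the word "Response", optionally followed
# by a colon, at the very start of the generated text.
_LEADING_RESPONSE = re.compile(r"^\s*Response\b\s*:?\s*")


def strip_leading_response(text: str) -> str:
    """Remove one leading 'Response' label from generated text, if present."""
    return _LEADING_RESPONSE.sub("", text, count=1)


if __name__ == "__main__":
    print(strip_leading_response("Response:\nprint('hi')"))
```

Note that this would also strip a legitimate answer that happens to start with the standalone word "Response", so fixing the template is the cleaner route.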

Meenakshi04

Jan 3, 2024

what is the breed of dog

jukofyork

Feb 15, 2024

edited Feb 15, 2024

I actually got this model to give up his training prompt. I won't detail how, but it's the same method I used for miqu (see: https://huggingface.co/miqudev/miqu-1-70b/discussions/25). Basically, using a completely blank template, I first asked what he saw that had to do with "###", then asked whether he saw anything before the first "###", and so on...

It turns out the reason he keeps saying "Response" is that he was trained with this:

{System Prompt}

### Instruction:
{Prompt}

### Response:
{Response}

Or, for Ollama, this template:

TEMPLATE """{{ if and .First .System }}{{ .System }}

{{ end }}### Instruction:
{{ .Prompt }}

### Response:
{{ .Response }}"""

He doesn't seem to improve as much as miqu did from using the correct prompt, but it can't hurt to use the correct format.
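For use outside Ollama, the format quoted above can be sketched in plain Python. This is a minimal illustration, assuming a single-turn prompt; `build_prompt` and `build_completion_payload` are hypothetical helper names, and the `stop` sequence is my own assumption rather than something recovered from the model.

```python
import json


def build_prompt(instruction: str, system: str = "") -> str:
    """Format a single-turn prompt in the '### Instruction / ### Response' layout."""
    parts = []
    if system.strip():
        # The system prompt, when present, comes first, followed by a blank line.
        parts.append(system.strip() + "\n\n")
    parts.append("### Instruction:\n" + instruction.strip() + "\n\n")
    # End on the response header so the model writes the answer next.
    parts.append("### Response:\n")
    return "".join(parts)


def build_completion_payload(prompt: str, max_tokens: int = 512) -> str:
    """Build a JSON body for an OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": "Phind/Phind-CodeLlama-34B-v2",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.5,
        # Assumption: stop before the model starts inventing a new turn.
        "stop": ["### Instruction:"],
    }
    return json.dumps(payload)


if __name__ == "__main__":
    prompt = build_prompt("Write hello world in Python.", "You are a coding assistant.")
    print(prompt)
    print(build_completion_payload(prompt))
```

The resulting JSON string can be sent as the request body to a locally running OpenAI-compatible completions server in place of a raw, untemplated prompt.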

Ready to merge

This branch is ready to get merged automatically.