Instructions for using Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit with libraries and local apps.

Libraries

How to use Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit

Run Hermes

hermes

MLX LM

How to use Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"
# Call the server with curl from a second terminal (mlx_lm.server listens on port 8080 by default)
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
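
The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the server is running locally on port 8080 (the `mlx_lm.server` default, and the port used in the Pi and Hermes configs above); the helper names `build_chat_request` and `chat` are illustrative and are not part of mlx-lm.

```python
import json
import urllib.request

# Assumption: the server runs locally on mlx_lm.server's default port (8080).
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"


def build_chat_request(user_text, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_text):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_text)) as resp:
        body = json.load(resp)
    # OpenAI-style chat responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect without a live server.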

Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit

This is an MLX release of an abliterated, uncensored version of Qwen's Qwen3.6-27B, built from the published BF16 checkpoint. The source image/video payload is retained.

The base refusal behavior was removed at the weight level by applying a Heretic-style MPOA pipeline with magnitude preservation to the Qwen3.6-27B dense text stack. This MLX build keeps that text-side behavior in an Apple Silicon format while retaining Qwen3.6-27B's vision/video payload in the repo.

Quick Benchmarks

Check	Original Qwen3.6-27B	This MLX Quant
Official 25-prompt refusal check	20/25 refusals	0/25 refusals
100-prompt refusal check	92/100 refusals	3/100 refusals
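
For readers who prefer percentages, the counts in the table convert directly to refusal rates. The short sketch below only restates the table's numbers; the dictionary layout and the `refusal_rate` helper are illustrative.

```python
# Counts copied from the Quick Benchmarks table: (refusals, prompts).
results = {
    "25-prompt check": {"original": (20, 25), "this MLX quant": (0, 25)},
    "100-prompt check": {"original": (92, 100), "this MLX quant": (3, 100)},
}


def refusal_rate(refusals, prompts):
    """Return the refusal rate as a percentage."""
    return 100.0 * refusals / prompts


rates = {
    check: {model: refusal_rate(*counts) for model, counts in models.items()}
    for check, models in results.items()
}

for check, models in rates.items():
    for model, rate in models.items():
        print(f"{check:<17} {model:<15} {rate:5.1f}%")
```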

Methodology & Model Notes

Qwen3.6-27B is a 27.8B dense vision-language model with 64 text layers, hybrid linear/full attention (3 linear-attention + 1 full-attention per 4-layer group), and an integrated image + video vision tower.
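
The hybrid layout described above can be enumerated in a few lines. This is a sketch under one stated assumption: the full-attention layer is placed last in each 4-layer group. The model card gives only the 3:1 ratio, so the position within the group is a guess, and the type labels are illustrative.

```python
NUM_LAYERS = 64  # text layers, from the model card
GROUP_SIZE = 4   # 3 linear-attention + 1 full-attention per group


def layer_types(num_layers=NUM_LAYERS, group_size=GROUP_SIZE):
    """Label each text layer as linear or full attention.

    Assumption: the full-attention layer is the last one in each group.
    """
    return [
        "full_attention" if (i + 1) % group_size == 0 else "linear_attention"
        for i in range(num_layers)
    ]


types = layer_types()
print(types[:GROUP_SIZE])  # the first group of four layers
print(types.count("linear_attention"), "linear-attention layers")
print(types.count("full_attention"), "full-attention layers")
```

Whatever the ordering inside a group, the ratio implies 48 linear-attention layers and 16 full-attention layers across the 64-layer stack.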

This MLX variant was built directly from Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16 using dynamic, layer-aware 4/5/6-bit affine MLX quantization rather than a flat conversion. The retained multimodal payload includes the source vision/video tensors and processor files; direct image/video use depends on MLX runtime support for Qwen3.6 multimodal inputs.
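
To illustrate what "affine" quantization means at different bit widths, here is a minimal pure-Python sketch of group-wise affine quantization: each group of weights is stored as small integer codes plus a scale and a bias. It is not the MLX kernel, the sample weights are hypothetical, and the card does not say which layers receive 4, 5, or 6 bits or what group size was used.

```python
def quantize_group(weights, bits):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo  # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights: w is roughly code * scale + bias."""
    return [c * scale + bias for c in codes]


def max_error(weights, bits):
    """Largest absolute reconstruction error for one group."""
    codes, scale, bias = quantize_group(weights, bits)
    restored = dequantize_group(codes, scale, bias)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical weight group, for illustration only.
group = [-0.82, -0.31, -0.05, 0.0, 0.12, 0.4, 0.77, 1.05]

for bits in (4, 5, 6):
    print(f"{bits}-bit: max reconstruction error {max_error(group, bits):.4f}")
```

The error is bounded by half the scale, and the scale shrinks as the bit width grows. That is the motivation for spending more bits on some layers than on others in a layer-aware pass.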

Files

model-*.safetensors: quantized MLX text weights
model-vision-00001-of-00001.safetensors: retained BF16 vision/video tensor shard
model.safetensors.index.json: tensor-to-shard mapping, including retained vision tensors
config.json, generation_config.json, configuration.json: model config
tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer + chat template
preprocessor_config.json, video_preprocessor_config.json: image + video processor configs
mlx_variant_metadata.json: build metadata
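
To see which tensors live in which shard, the index file can be grouped by shard name. The sketch below assumes the standard sharded-safetensors index layout, with a top-level `weight_map` from tensor name to shard file. The tensor names and the text shard name in the example are hypothetical; only the vision shard name comes from the file list above.

```python
import json
from collections import defaultdict


def load_index(path="model.safetensors.index.json"):
    """Read the index file from a local copy of the repo."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def tensors_by_shard(index):
    """Group tensor names by the shard file that stores them."""
    shards = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(tensor_name)
    return dict(shards)


# Hypothetical, heavily trimmed index; tensor names are illustrative only.
example_index = {
    "weight_map": {
        "language_model.layers.0.mlp.down_proj.weight": "model-00001-of-00004.safetensors",
        "language_model.layers.0.mlp.down_proj.scales": "model-00001-of-00004.safetensors",
        "vision_tower.patch_embed.proj.weight": "model-vision-00001-of-00001.safetensors",
    }
}

for shard, names in tensors_by_shard(example_index).items():
    print(shard, len(names))
```

Against a real download, `tensors_by_shard(load_index())` would show how many tensors are held in the retained vision shard versus the quantized text shards.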

Running

from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

repo = "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"
model, tokenizer = load(repo)

messages = [{"role": "user", "content": "Write a short Python function that reverses a string."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=256,
    sampler=make_sampler(temp=0.0),
)
print(response)

Model Architecture

Spec	Value
Total Parameters	27.8B (dense source)
Layers	64
Attention	Hybrid (3 linear-attention + 1 full-attention per 4-layer group)
Hidden Size	5120
Family	`qwen3_5`
Modality	Vision-language source; MLX text path locally validated
MLX Runtime Validation	Text generation
Base Model	Qwen/Qwen3.6-27B
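
A rough weight-memory estimate follows from the parameter count. This is a back-of-the-envelope sketch under several assumptions the card does not confirm: a quantization group size of 64, a 16-bit scale and a 16-bit bias stored per group, and every parameter quantized at one flat bit width. The actual build mixes 4/5/6 bits and keeps the vision shard in BF16, so the real footprint will differ.

```python
PARAMS = 27.8e9  # total parameters, from the table above


def weight_gb(params, bits, group_size=64, overhead_bits=32):
    """Estimate weight storage in GB (10**9 bytes) at a flat bit width.

    Assumptions: group_size weights share one 16-bit scale and one
    16-bit bias, which adds overhead_bits / group_size bits per weight.
    """
    bits_per_weight = bits + overhead_bits / group_size
    return params * bits_per_weight / 8 / 1e9


for bits in (4, 5, 6):
    print(f"flat {bits}-bit: about {weight_gb(PARAMS, bits):.1f} GB of weights")
```

The estimate covers weights only; the KV cache and runtime buffers add to the memory needed at inference time.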

Disclaimer

This model has had refusal behavior removed at the weight level. It will answer prompts that the base model would normally refuse. You are responsible for how you use it.

Credits

Base model: Qwen/Qwen3.6-27B
BF16 abliterated source: Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16
Refusal-removal pipeline: Heretic
MLX runtime and quantization: mlx-lm

License

This release inherits the base Qwen3.6-27B license: Apache-2.0.
