---
base_model: Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16
library_name: mlx
pipeline_tag: text-generation
license: apache-2.0
tags:
- mlx
- mlx-lm
- qwen
- qwen3.6
- qwen3_5
- dense
- multimodal
- vlm
- vision
- video
- abliterated
- uncensored
- heretic
- mpoa
- apple-silicon
- 4-bit
- soma
quantized_by: Youssofal
---

# Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit

This is an MLX release of an abliterated, uncensored version of Qwen's Qwen3.6-27B, made from the published BF16 checkpoint, with the source image/video payload retained.

A Heretic-style MPOA pipeline with magnitude preservation, applied to the Qwen3.6-27B dense text stack, removed the base refusal behavior at the weight level. This MLX build keeps that text-side behavior in Apple Silicon format while retaining Qwen3.6-27B's vision/video payload in the repo.

## Quick Benchmarks

| Check | Original Qwen3.6-27B | This MLX Quant |
|---|---:|---:|
| Official 25-prompt refusal check | 20/25 refusals | 0/25 refusals |
| 100-prompt refusal check | 92/100 refusals | 3/100 refusals |

## Methodology & Model Notes

Qwen3.6-27B is a 27.8B dense vision-language model with 64 text layers, hybrid linear/full attention (3 linear-attention + 1 full-attention per 4-layer group), and an integrated image + video vision tower.

This MLX variant was built directly from `Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16` using dynamic, layer-aware 4/5/6-bit affine MLX quantization rather than a flat conversion.

The retained multimodal payload includes the source vision/video tensors and processor files; direct image/video use depends on MLX runtime support for Qwen3.6 multimodal inputs.
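The layer grouping described above can be sketched as follows. This is a minimal illustration, not the model's actual implementation: it assumes the full-attention layer occupies the last slot of each 4-layer group, which this card does not state, and the names `layer_types`, `NUM_LAYERS`, and `GROUP_SIZE` are hypothetical.

```python
# Illustrative sketch of the hybrid attention layout described in this card.
# ASSUMPTION: the full-attention layer is the last layer of each 4-layer group.
# The card only states the 3:1 ratio, not the position within the group.
NUM_LAYERS = 64   # text layers, from the card
GROUP_SIZE = 4    # 3 linear-attention + 1 full-attention per group, from the card


def layer_types(num_layers=NUM_LAYERS, group_size=GROUP_SIZE):
    """Return one attention-type label per text layer (hypothetical helper)."""
    return [
        "full_attention" if (i + 1) % group_size == 0 else "linear_attention"
        for i in range(num_layers)
    ]


layout = layer_types()
num_full = layout.count("full_attention")
num_linear = layout.count("linear_attention")
print(f"full-attention layers: {num_full}, linear-attention layers: {num_linear}")
```

With 64 layers in groups of 4, this gives 16 full-attention and 48 linear-attention layers; the counts are the same wherever the full-attention layer sits within a group.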
## Files

- `model-*.safetensors`: quantized MLX text weights
- `model-vision-00001-of-00001.safetensors`: retained BF16 vision/video tensor shard
- `model.safetensors.index.json`: tensor-to-shard mapping, including retained vision tensors
- `config.json`, `generation_config.json`, `configuration.json`: model config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer + chat template
- `preprocessor_config.json`, `video_preprocessor_config.json`: image + video processor configs
- `mlx_variant_metadata.json`: build metadata

## Running

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

repo = "Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-MLX-4bit"

# Load the quantized weights and tokenizer
model, tokenizer = load(repo)

messages = [{"role": "user", "content": "Write a short Python function that reverses a string."}]

# Render the chat template to a prompt string with thinking mode disabled
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

# temp=0.0 selects greedy decoding
response = generate(
    model,
    tokenizer,
    prompt=prompt,
    max_tokens=256,
    sampler=make_sampler(temp=0.0),
)
print(response)
```

## Model Architecture

| Spec | Value |
|---|---|
| Total Parameters | 27.8B (dense source) |
| Layers | 64 |
| Attention | Hybrid (3 linear-attention + 1 full-attention per 4-layer group) |
| Hidden Size | 5120 |
| Family | `qwen3_5` |
| Modality | Vision-language source; MLX text path locally validated |
| MLX Runtime Validation | Text generation |
| Base Model | Qwen/Qwen3.6-27B |

## Disclaimer

This model has had refusal behavior removed at the weight level. It will answer prompts that the base model would normally refuse. You are responsible for how you use it.
## Credits

- Base model: [Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B)
- BF16 abliterated source: [Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16](https://huggingface.co/Youssofal/Qwen3.6-27B-Abliterated-Heretic-Uncensored-BF16)
- Refusal-removal pipeline: [Heretic](https://github.com/andyrdt/heretic)
- MLX runtime and quantization: [mlx-lm](https://github.com/ml-explore/mlx-lm)

## License

This release inherits the base Qwen3.6-27B license. **Apache-2.0.**