Open to Collab

Mohammed Hamdy

mmhamdy

hugging-science

https://surfingmanifolds.substack.com/

AI & ML interests

AI4Sci | NLP | Reinforcement Learning

Recent Activity

replied to their post 4 days ago

Things rarely go as we expect! In 2017, Google released the Transformer architecture. While it was clear the model was promising, absolutely no one (including its authors) anticipated the pervasive global revolution it would create! The authors actually viewed the Transformer as just a stepping stone for a much more ambitious project: the MultiModel. Their ultimate goal, back in 2017, was to build a single deep learning architecture capable of jointly learning massive, diverse tasks across entirely different domains: One Model To Learn Them All. In fact, the MultiModel paper was published in the exact same month as Attention Is All You Need! But history had other plans. The building block eclipsed the grand design! So, have you heard about the MultiModel before? 😀

posted an update 5 months ago

The new DeepSeek Engram paper is super fun! It also integrates mHC, and I suspect they're probably releasing all these papers to make the V4 report of reasonable length😄 Here's a nice short summary from Gemini

liked a Space 7 months ago

Unlocking On-Policy Distillation for Any Model Family

Visualize on-policy distillation for any model family

liked a dataset 8 months ago

transferable-samplers/many-peptides-md

Updated Dec 15, 2025 • 26.9k • 8

liked 3 Spaces 8 months ago

Science Release Heatmap

Explore AI4Science contributions by organizations and tags

Maintain the unmaintainable

Explore the complex relationships between 400+ machine learning models

Transformers Timeline

Interactive timeline to explore the 🤗Transformers models

liked a model 10 months ago

rednote-hilab/dots.ocr

Image-Text-to-Text • 3B • Updated Oct 31, 2025 • 218k • 1.31k

liked a dataset 12 months ago

nvidia/Nemotron-Personas-USA

Viewer • Updated Dec 16, 2025 • 1M • 9.42k • 314

liked a model 12 months ago

PlayHT/PlayDiffusion

Updated Jul 29, 2025 • 111

liked 2 models about 1 year ago

facebook/KernelLLM

Text Generation • 8B • Updated Jan 15 • 133 • 201

sesame/csm-1b

Text-to-Speech • 2B • Updated Dec 1, 2025 • 255k • 2.39k

liked a Space about 1 year ago

The Distill Template

Craft Beautiful Blogs

liked a model about 1 year ago

ElectricAlexis/NotaGen

Updated Feb 26, 2025 • 154

liked a model over 1 year ago

microsoft/wham

Updated Dec 17, 2025 • 151 • 270

liked a Space over 1 year ago

The Ultra-Scale Playbook

The ultimate guide to training LLM on large GPU Clusters

liked a model over 1 year ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.4M • 6.25k

liked a dataset over 1 year ago

HuggingFaceH4/MATH-500

Viewer • Updated Dec 15, 2025 • 500 • 164k • 310

liked a model over 1 year ago

answerdotai/ModernBERT-base

Fill-Mask • 0.1B • Updated Jan 15, 2025 • 2.48M • 1.05k

liked a Space over 1 year ago

Scaling test-time compute

Boost LLM answers with flexible test‑time search strategies

liked a model over 1 year ago

CohereLabs/c4ai-command-r7b-12-2024

8B • Updated Oct 30, 2025 • 28.8k • 421

liked a dataset over 1 year ago

CohereLabs/Global-MMLU

Viewer • Updated Aug 14, 2025 • 602k • 35.9k • 159