A Personal Journal of Learning and Discovery

Tag: deepseek

3 items with this tag.

  • Jan 06, 2026

    20251116093417a⁝ DeepSeek

    • ai
    • china
    • deepseek
    • model
  • Jan 30, 2025

    Reinforcement Learning with GRPO: Fine-Tuning a Small Language Model for Chain-of-Thought Math Reasoning, Similar to DeepSeek R1 Training

    • llm
    • coding
    • training
    • rl
    • deepseek
  • Jan 21, 2025

    DeepSeek R1 note

    • ai
    • deepseek
    • models
    • reasoning
    • Total
    • Activated
