# reinforcement-learning

Trending repositories for this topic (7)

Run and train AI models from a single browser screen! An all-purpose tool with up to 2x speed and 70% less VRAM - unsloth

unslothai/unsloth · AI · Python

57.0k · appeared 2 times

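Unsloth is an integrated tool for running open-source AI models such as Qwen, DeepSeek, Gemma, and Llama on your own computer and for further training (fine-tuning) them. A web interface that can be operated from the browser (Unsloth S…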

agent, deepseek, deepseek-r1, fine-tuning, gemma, gemma3, gpt-oss, llama, llama3, llm, llms, mistral, openai, qwen, qwen3, reinforcement-learning, text-to-speech, tts, unsloth, voice-cloning

rohitg00/ai-engineering-from-scratch

rohitg00/ai-engineering-from-scratch · Other · Python

32.8k · appeared 7 times

Learn it. Build it. Ship it for others.

agents, ai, ai-agents, ai-engineering, computer-vision, course, deep-learning, from-scratch, generative-ai, llm, machine-learning, mcp, nlp, python, reinforcement-learning, rust, swarm-intelligence, transformers, tutorial, typescript

Train AI agents with reinforcement learning with almost zero code changes! Works with any framework - agent-lightning

microsoft/agent-lightning · AI · Python

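Agent Lightning is a Microsoft tool for training AI agents (AI programs that carry out tasks autonomously) with reinforcement learning (a method of learning through trial and error). Its main feature is optimizing existing AI agents with almost no changes to their code…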

agent, agentic-ai, llm, mlops, reinforcement-learning

owainlewis/awesome-artificial-intelligence

owainlewis/awesome-artificial-intelligence · Other

14.6k · appeared 2 times

A curated list of Artificial Intelligence (AI) courses, books, video lectures and papers.

ai, artificial-intelligence, deep-learning, intelligent-machines, intelligent-systems, machine-intelligence, machine-learning, neural-network, reinforcement-learning, statistical-learning, unsupervised-learning

NVlabs/Sana

NVlabs/Sana · Other · Python

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

diffusion, dit, linear-transformer, nvfp4, pytorch, reinforcement-learning, sana, system-algorithm-deisgn, text-to-image-generation, text-to-video, transformers, video-generation

HenryNdubuaku/maths-cs-ai-compendium

HenryNdubuaku/maths-cs-ai-compendium · Other · TypeScript

Become a cracked AI/ML Research Engineer

ai-textbook, algorithms, artificial-intelligence, computer-science, computer-vision, deep-learning, jax, linear-algebra, machine-learning, machine-learning-algorithms, math, mathematics, multimodal-learning, nlp, probability, python, reinforcement-learning, speech-processing, statistics

An ultra-fast training system that sharpens an AI's "thinking power": evolving reasoning models with asynchronous reinforcement learning - AReaL

inclusionAI/AReaL · AI · Python

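AReaL is an open-source training system that strengthens an AI model's "thinking power" (reasoning ability) through reinforcement learning (a method in which the AI gets smarter by trial and error). It is jointly developed by Tsinghua University and Ant Group, and is fully asynchronous (running multiple processes concurrently without waiting…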

agent, llm, llm-agent, llm-reasoning, machine-learning-systems, mlsys, reinforcement-learning, rl