FlowReasoner is a new system that builds a custom set of small AI agents for every user question. Unlike search-based methods, it uses reasoning-driven optimization with external execution feedback.
First, it distills reasoning data on building multi-agent systems from DeepSeek R1-671B. Then, that reasoning data is used to fine-tune DeepSeek-R1-Distill-Qwen-7B via supervised fine-tuning, giving it basic reasoning skills. Finally, RL with GRPO (which optimizes by comparing groups of responses to the same query or task) further improves its reasoning.
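To make the "comparing response groups" part of GRPO concrete, here is a minimal sketch of the group-relative advantage step: each response's reward is scored against the mean and spread of its own group. The function name, the `eps` value, and the reward numbers are made up for illustration; in FlowReasoner the rewards would presumably come from the external execution feedback, and this is not the authors' code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each reward relative to its own group (hypothetical helper).

    Responses that beat the group average get a positive advantage,
    responses below it get a negative one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std dev of the group
    # eps avoids division by zero when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four sampled responses to one query
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Because the baseline is the group's own average, this step needs no separate value model: the policy update simply pushes up responses with a positive advantage and pushes down those with a negative one.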
Energy is a massive constraint for AI, but do you even know how much energy your ChatGPT convos are using?
We're trying to change this by releasing ChatUI-energy, the first interface where you can see in real time how much energy your AI conversations consume. Great work from @jdelavande, powered by Spaces & TGI, available for a dozen open-source models like Llama, Mistral, Qwen, Gemma and more.