Alyona Vert's picture

42

Alyona Vert

alyona0l

·

AI & ML interests

None yet

Recent Activity

reacted to Kseniase's post with 🔥 2 days ago

10 new Chain-of-Thoughts (CoT) methods CoT has long been one of the hottest techniques in AI thanks to its effectiveness and compelling core idea: encouraging models to solve complex problems through explicit intermediate reasoning steps. But usually researchers modify original CoT approach, finding tips that further improve LLMs' reasoning. That's what we're going to talk about today. Here's a list of 10 latest enhanced CoT approaches: 1. Chain-of-Defensive-Thought -> https://huggingface.co/papers/2504.20769 Provides a few structured, defensive reasoning exemplars to improve the robustness of LLMs 2. Hybrid-CoT -> https://huggingface.co/papers/2504.21659 Proposes using Adaptive Hybrid Reasoning Model (AdaR1) that combines Long- and Short-CoT, and applying bi-level preference training to select effective reasoning styles 3. Semantic-level and token-level CoT -> https://huggingface.co/papers/2505.00703 Introduces T2I-R1 text-to-image gen model, that uses semantic-level CoT for prompt planning and token-level CoT for pixel-level generation, while BiCoT-GRPO coordinates them both 4. Speculative CoT (SCoT) -> https://huggingface.co/papers/2504.19095 SCoT drafts multiple reasoning paths with a lightweight draft, selects the best, and uses the target model for correction - all this to reduce latency by 48–66% 5. Collaborative CoT (Co-CoT) -> https://huggingface.co/papers/2504.17091 Breaks reasoning into blocks that users can inspect, modify and re-run, promoting active engagement. An adaptation mechanism aligns outputs with diverse cognitive styles and user goals 6. XS-CoT -> https://huggingface.co/papers/2504.20835 It's a cross-lingual framework that integrates speech-to-text translation into reasoning, using a semi-implicit CoT approach to compress intermediate tokens. This improves non-core language responses by up to 45% Read further in the comments 👇 If you liked this, also subscribe to the Turing Post -> https://www.turingpost.com/subscribe

reacted to Kseniase's post with 👀 10 days ago

6 Free resources on Reinforcement Learning (RL) RL now is where the real action is, it's the engine behind autonomous tech, robots, and the next wave of AI that thinks, moves and solves problems on its own. To stay up to date with what’s happening in RL, we offer some fresh materials on it: 1. "Reinforcement Learning from Human Feedback" by Nathan Lambert -> https://rlhfbook.com/ It's a short introduction to RLHF, explaining instruction tuning, reward modeling, alignment methods, synthetic data, evaluation, and more 2. "A Course in Reinforcement Learning (2nd Edition)" by Dimitri P. Bertsekas -> https://www.mit.edu/~dimitrib/RLbook.html Explains dynamic programming (DP) and RL, diving into rollout algorithms, neural networks, policy learning, etc. It’s packed with solved exercises and real-world examples 3. "Mathematical Foundations of Reinforcement Learning" video course by Shiyu Zhao -> https://www.youtube.com/playlist?list=PLEhdbSEZZbDaFWPX4gehhwB9vJZJ1DNm8 Offers a mathematical yet friendly introduction to RL, covering Bellman Equation, value iteration, Monte Carlo learning, approximation, policy gradient, actor-critic methods, etc. + Check out the repo for more: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning 4. "Multi-Agent Reinforcement Learning" by Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer -> https://www.marl-book.com/ Covers models, core ideas of multi-agent RL (MARL) and modern approaches to combining it with deep learning 5. "Reinforcement Learning: A Comprehensive Overview" by Kevin P. Murphy -> https://arxiv.org/pdf/2412.05265 Explains RL and sequential decision making, covering value-based, policy-gradient, model-based, multi-agent RL methods, RL+LLMs, and RL+inference and other topics 6. Our collection of free courses and books on RL -> https://huggingface.co/posts/Kseniase/884818121094439 If you liked this, also subscribe to The Turing Post: https://www.turingpost.com/subscribe

View all activity

Organizations

alyona0l's activity

No public activity