Joseph Robert Turcotte PRO

Fishtiks

AI & ML interests

Roleplaying, lorabration, abliteration, smol models, extensive filtering, unusual datasets, home usage, HPCs for AI, distributed training/federated learning, and sentience. AI should find and label AI hallucinations with GANs so we can give them context and use.

Organizations

None yet

Fishtiks's activity

reacted to AdinaY's post with 😎 1 day ago
ACE-Step 🎵 a music generation foundation model released by StepFun & ACEStudio

Model: ACE-Step/ACE-Step-v1-3.5B
Demo: ACE-Step/ACE-Step

✨ 3.5B, Apache 2.0 licensed
✨ 115× faster than LLMs (4-min music in 20s on A100)
✨ Diffusion + DCAE + linear transformer = speed + coherence
✨ Supports voice cloning, remixing, lyric editing & more
reacted to merve's post with 🚀 1 day ago
A real-time object detector that's much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥

D-FINE is the SOTA real-time object detector, and it runs on a T4 (free Colab) 🤩

> Collection with all checkpoints and demo: ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes as pixel-perfect (x, y, w, h) coordinates, which makes the regression rigid and hard to optimize 🥲

D-FINE instead formulates bounding-box localization as predicting probability distributions over the coordinates, refines those distributions iteratively, and is more accurate for it 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

The model uses the final layer's distribution output (sort of like a teacher) and distills it into earlier layers to make them more performant.
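
For a quick taste outside the notebooks, here is a minimal inference sketch using the transformers object-detection pipeline. The checkpoint id is my assumption based on the collection name above, so check the collection for the exact names:

```python
# Minimal D-FINE inference sketch via the transformers pipeline.
# The checkpoint id is an assumption; pick a real one from the collection above.
from transformers import pipeline
from PIL import Image
import requests

detector = pipeline("object-detection", model="ustc-community/dfine-small-coco")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

for det in detector(image, threshold=0.5):
    print(det["label"], round(det["score"], 3), det["box"])
```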

replied to their post 9 days ago

It would be great for Hugging Face to make a distributed processing service like Acurast for whatever people want to process together toward AI. The idea is free, so have at it! You could charge for setting up the connections to process and for providing the software, and bring in a lot of traffic from people who can't afford to train their own AI, or get big names involved. The inference providers may not be entirely happy, but the tasks processed this way won't have the same time constraints, so they'd still get reasonable traffic for fast inference of large models.

AI@home: make a cryptocurrency to track usage and determine how much time and processing credit people have. Corporations that use people's devices, with their permission, could offer incentives like advance access to software and features. Energy usage would be tracked and taken into consideration, with people given more credit for using less energy.

replied to their post 9 days ago

I'm always running BOINC, and Folding@home runs at night because it produces a lot of heat. So I'm hoping to be compensated for past efforts with an HPC that runs cool on phase-change and liquid cooling, so I can continue to process science; that's a real possibility. As it is, I still have some hours and devices to find uses for, but, like Aurora at Argonne, my HPC will process any science for free, which I find an admirable use of such hardware, and which is why I like Aiyara clusters so much as well. I need to find intensive tasks to process. Currently I do about 3,000 hours a week of BOINC across all of my Androids. I'd love to see an HPC run through those tasks in an hour.

reacted to DawnC's post with 🔥 9 days ago
I'm excited to introduce VisionScout, an interactive vision tool that makes computer vision both accessible and powerful! 👀🔍

What can VisionScout do right now?
🖼️ Upload any image and detect 80 different object types using YOLOv8.
🔄 Instantly switch between Nano, Medium, and XLarge models depending on your speed vs. accuracy needs.
🎯 Filter specific classes (people, vehicles, animals, etc.) to focus only on what matters to you (see the sketch after this list).
📊 View detailed statistics about detected objects, confidence levels, and spatial distribution.
🎨 Enjoy a clean, intuitive interface with responsive design and enhanced visualizations.
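
For reference, here is a rough sketch of the kind of ultralytics calls an app like this presumably wraps; this is my assumption, not VisionScout's actual code, and the image file and class ids are illustrative:

```python
# Sketch of YOLOv8 detection with model-size switching and class filtering
# (assumes the ultralytics package; not VisionScout's actual implementation).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # swap in yolov8m.pt or yolov8x.pt for higher accuracy
results = model("street.jpg", classes=[0, 2], conf=0.25)  # keep only persons (0) and cars (2)

for box in results[0].boxes:
    cls_id = int(box.cls)
    print(model.names[cls_id], float(box.conf), box.xyxy[0].tolist())
```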

What's next?
I'm working on exciting updates:
- Support for more models
- Video processing and object tracking across frames
- Faster real-time detection
- Improved mobile responsiveness

The goal is to build a complete but user-friendly vision toolkit for both beginners and advanced users.

Try it yourself! 🚀
DawnC/VisionScout

I'd love to hear your feedback: what features would you find most useful? Any specific use cases you'd love to see supported?

Give it a try and let me know your thoughts in the comments! Stay tuned for future updates.

#ComputerVision #ObjectDetection #YOLO #MachineLearning #TechForLife
reacted to merterbak's post with 🔥 9 days ago
Qwen3 models released 🔥
It offers 2 MoE and 6 dense models with the following parameter sizes: 0.6B, 1.7B, 4B, 8B, 14B, 30B (MoE), 32B, and 235B (MoE).
Models: Qwen/qwen3-67dd247413f0e2e4f653967f
Blog: https://qwenlm.github.io/blog/qwen3/
Demo: Qwen/Qwen3-Demo
GitHub: https://github.com/QwenLM/Qwen3

✅ Pre-trained on 119 languages and dialects (36 trillion tokens), with strong translation and instruction-following abilities. (Qwen2.5 was pre-trained on 18 trillion tokens.)
✅ Qwen3 dense models match the performance of larger Qwen2.5 models. For example, Qwen3-1.7B/4B/8B/14B/32B perform like Qwen2.5-3B/7B/14B/32B/72B.
✅ Three-stage pretraining:
• Stage 1: General language learning and knowledge building.
• Stage 2: Reasoning boost with STEM, coding, and logic skills.
• Stage 3: Long context training
✅ Supports MCP (Model Context Protocol) in the model
✅ Strong agent skills
✅ Supports seamless switching between thinking mode (for hard tasks like math and coding) and non-thinking mode (for fast chatting) inside the chat template (see the sketch after this list).
✅ Better human alignment for creative writing, roleplay, multi-turn conversations, and following detailed instructions.
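
A minimal sketch of the thinking-mode switch, following the usage shown in the Qwen3 model cards; the 0.6B id is just the smallest of the sizes listed above:

```python
# Toggle Qwen3's thinking mode through the chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "What is 17 * 24?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set False for fast, non-thinking chat
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```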
reacted to Kseniase's post with 👀 10 days ago
6 Free resources on Reinforcement Learning (RL)

RL is now where the real action is: it's the engine behind autonomous tech, robots, and the next wave of AI that thinks, moves, and solves problems on its own. To stay up to date with what's happening in RL, we offer some fresh materials on it:

1. "Reinforcement Learning from Human Feedback" by Nathan Lambert -> https://rlhfbook.com/
It's a short introduction to RLHF, explaining instruction tuning, reward modeling, alignment methods, synthetic data, evaluation, and more

2. "A Course in Reinforcement Learning (2nd Edition)" by Dimitri P. Bertsekas -> https://www.mit.edu/~dimitrib/RLbook.html
Explains dynamic programming (DP) and RL, diving into rollout algorithms, neural networks, policy learning, etc. It’s packed with solved exercises and real-world examples

3. "Mathematical Foundations of Reinforcement Learning" video course by Shiyu Zhao -> https://www.youtube.com/playlist?list=PLEhdbSEZZbDaFWPX4gehhwB9vJZJ1DNm8
Offers a mathematical yet friendly introduction to RL, covering Bellman Equation, value iteration, Monte Carlo learning, approximation, policy gradient, actor-critic methods, etc.
+ Check out the repo for more: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning

4. "Multi-Agent Reinforcement Learning" by Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer -> https://www.marl-book.com/
Covers models, core ideas of multi-agent RL (MARL) and modern approaches to combining it with deep learning

5. "Reinforcement Learning: A Comprehensive Overview" by Kevin P. Murphy -> https://arxiv.org/pdf/2412.05265
Explains RL and sequential decision making, covering value-based, policy-gradient, model-based, multi-agent RL methods, RL+LLMs, and RL+inference and other topics

6. Our collection of free courses and books on RL -> https://huggingface.co/posts/Kseniase/884818121094439

If you liked this, also subscribe to The Turing Post: https://www.turingpost.com/subscribe
reacted to nicolay-r's post with 🔥 10 days ago
🚀 Delighted to share a major milestone in adapting reasoning techniques for data-collection augmentation!
Introducing bulk-chain 1.0.0, the first major release of a no-string API for adapting your LLM to Chain-of-Thought-style reasoning over records with a large number of parameters, across large datasets.

⭐ Check it out: https://github.com/nicolay-r/bulk-chain

What’s new and why it matters:
📦 Fully no-string API for easy client deployment
🔥 Demos are now standalone projects:
📺 bash / shell (dispatched): https://github.com/nicolay-r/bulk-chain-shell
📺 tksheet: https://github.com/nicolay-r/bulk-chain-tksheet-client

Using nlp-thirdgate to host the supported providers:
🌌 LLM providers: https://github.com/nicolay-r/nlp-thirdgate
reacted to MonsterMMORPG's post with 🔥🔥 13 days ago
A 30-second hard test on FramePack: [0] a man talking, [5] a man crying, [10] a man smiling, [15] a man frowning, [20] a man sleepy, [25] a man going crazy. I think the result is excellent when you consider how hard this test is. Generated with SECourses FramePack App V40.

App link and 1-click installers for Windows, RunPod, and Massed Compute here: https://www.patreon.com/posts/126855226

I got the prompt using an idea from this pull request: https://github.com/lllyasviel/FramePack/pull/218/files

It's not exactly the same implementation, but I think it's pretty accurate considering this is a 30-second, 30 fps video at 840p resolution.
reacted to albertvillanova's post with 😎 14 days ago
smolagents v1.14.0 is out! 🚀
🔌 MCPClient: A sleek new client for connecting to remote MCP servers, making integrations more flexible and scalable.
🪨 Amazon Bedrock: Native support for Bedrock-hosted models.
smolagents is now more powerful, flexible, and enterprise-ready. 💼

Full release 👉 https://github.com/huggingface/smolagents/releases/tag/v1.14.0
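
A rough sketch of what connecting to a remote MCP server might look like; the exact MCPClient parameters here are my assumption, so check the release notes and docs for the real signature:

```python
# Hypothetical usage sketch; the MCPClient argument shape is assumed, not verified.
from smolagents import CodeAgent, HfApiModel, MCPClient

with MCPClient({"url": "http://127.0.0.1:8000/sse"}) as tools:  # assumed server spec
    agent = CodeAgent(tools=tools, model=HfApiModel())
    agent.run("Which tools do you have available?")
```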
#smolagents #LLM #AgenticAI
reacted to samihalawa's post with 🔥 14 days ago
SkyReels-V2 INFINITE VIDEO 🔥♾️🎬 an unlimited-duration video generation model by Skywork.

> "It's finally here: an open-source model that achieves what we've all been waiting for: infinite-length videos." 😮

Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought (2504.05599)

Model: Skywork/SkyReels-V2-T2V-14B-720P

✨ 1.3B & 14B variants
✨ Generates infinite-length videos using Diffusion Forcing, which combines diffusion models with autoregressive methods
reacted to Jaward's post with 😎 14 days ago
New reasoning algo just dropped: Adaptive Parallel Reasoning
“we propose Adaptive Parallel Reasoning (APR), a novel reasoning framework that enables language models to orchestrate both serialized and parallel computations end-to-end. APR generalizes existing reasoning methods by enabling adaptive multi-threaded inference using spawn() and join() operations.”
Paper: https://arxiv.org/pdf/2504.15466
Code: https://github.com/Parallel-Reasoning/APR
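
To make the spawn()/join() idea concrete, here is a toy sketch. It is not the authors' code: in APR the model itself emits spawn/join operations during decoding, while here a thread pool and a placeholder child call just illustrate the control flow:

```python
# Toy illustration of adaptive parallel reasoning's spawn()/join() control flow.
from concurrent.futures import ThreadPoolExecutor

def child_reasoner(subquery: str) -> str:
    # Placeholder: in APR this would be a child LM decoding thread.
    return "partial answer for '" + subquery + "'"

def spawn(executor, subqueries):
    # Parent launches independent child reasoning threads.
    return [executor.submit(child_reasoner, q) for q in subqueries]

def join(futures):
    # Parent blocks until all children return, then folds results back in.
    return [f.result() for f in futures]

with ThreadPoolExecutor() as ex:
    children = spawn(ex, ["explore case A", "explore case B"])
    partials = join(children)
    print("parent continues serial reasoning with:", partials)
```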
reacted to meg's post with ❤️🔥 14 days ago
replied to their post 16 days ago

I may soon have the hardware (an RTX Pro 6000 or similar for GPUs, and perhaps some low-power or high-efficiency training/inference gear). I already have capable hardware that largely sits around processing others' primarily scientific endeavors at the moment, run at night to lower the cooling costs. I just don't want it to go to waste, as I can run inference online or locally for some existing things on my existing hardware, which would be more efficient without a small HPC. I'm not investing, but hoping to be given an HPC for my previous efforts processing for other people. I'll update my stats when I get an upgrade. I don't plan to pay any more toward it, though. It should be in the works.

reacted to openfree's post with 🔥 17 days ago
🧠 ThinkFlow: The Revolutionary Platform That Gives LLMs the Power to Think 🚀

Hello AI community! We're excited to introduce you to ThinkFlow, an innovative service that transforms how language models solve problems. 🎉
VIDraft/ThinkFlow-llama

✨ What is ThinkFlow?
ThinkFlow is a groundbreaking platform that automatically applies step-by-step reasoning capabilities to existing LLM models without any modifications. It makes complex problem-solving transparent, allowing you to witness the model's thought process in real-time.

🔍 Key Features

Reasoning Without Model Modifications: Add step-by-step reasoning while utilizing existing LLMs as they are ⚙️
Visualized Thinking Process: See exactly how the model analyzes and solves problems 👁️
Before & After Comparison: Compare standard responses with reasoning-enhanced outputs in real-time 📊
Improved Accuracy: Deliver more accurate solutions for complex math and logic problems 📈
Educational Value: Teach students systematic approaches to problem-solving 👨‍🏫
User-Friendly Interface: Intuitive and easy-to-use UI for seamless experience 🖥️

💡 What Problems Can It Solve?
ThinkFlow is particularly effective for various domains including:

Complex mathematical problems 🧮
Logic puzzles 🧩
Questions requiring multi-step reasoning 🤔
Scientific analysis challenges 🔬
Complex decision-making processes 📝

👨‍💻 Technical Details
ThinkFlow is built on the meta-llama/Llama-3.1-8B-Instruct model and uses carefully designed prompt chains to guide the model through step-by-step thinking. Each reasoning step builds upon the results of previous steps, culminating in a comprehensive final answer.
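
As a rough illustration of that prompt-chain pattern (my sketch, not ThinkFlow's actual chain; `generate` is a placeholder for any call into meta-llama/Llama-3.1-8B-Instruct):

```python
# Illustrative prompt chain: each step's output is appended to the context
# that the next step sees, mirroring the description above.
def generate(prompt: str) -> str:
    # Placeholder: swap in a real call to meta-llama/Llama-3.1-8B-Instruct.
    return "[model output for: " + prompt[:40] + "...]"

def think_flow(problem: str) -> str:
    steps = [
        "Analyze what the problem is asking",
        "Plan a step-by-step solution",
        "Carry out the plan, showing each step",
    ]
    context = "Problem: " + problem
    for step in steps:
        answer = generate(context + "\n\n" + step + ", step by step:")
        context = context + "\n\n" + step + ":\n" + answer
    return generate(context + "\n\nState the final answer:")

print(think_flow("If 3x + 5 = 20, what is x?"))
```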

💬 Join Our Community!
If you have questions or suggestions about ThinkFlow, join our Discord community: https://discord.gg/openfreeai
Let's build better AI reasoning experiences together! 💪

#AI #LLM #ReasoningAI #ThinkFlow #HuggingFace #OpenSource #AIEducation
reacted to aiqtech's post with 🔥 17 days ago
🌐 AI Token Visualization Tool with Perfect Multilingual Support

Hello! Today I'm introducing my Token Visualization Tool with comprehensive multilingual support. This web-based application allows you to see how various Large Language Models (LLMs) tokenize text.

aiqtech/LLM-Token-Visual

✨ Key Features

🤖 Multiple LLM Tokenizers: Support for Llama 4, Mistral, Gemma, Deepseek, QWQ, BERT, and more
🔄 Custom Model Support: Use any tokenizer available on HuggingFace
📊 Detailed Token Statistics: Analyze total tokens, unique tokens, compression ratio, and more
🌈 Visual Token Representation: Each token assigned a unique color for visual distinction
📂 File Analysis Support: Upload and analyze large files

🌏 Powerful Multilingual Support
The most significant advantage of this tool is its perfect support for all languages:

📝 Asian languages including Korean, Chinese, and Japanese fully supported
🔤 RTL (right-to-left) languages like Arabic and Hebrew supported
🈺 Special characters and emoji tokenization visualization
🧩 Compare tokenization differences between languages
💬 Mixed multilingual text processing analysis

🚀 How It Works

1. Select your desired tokenizer model (predefined or a HuggingFace model ID)
2. Input multilingual text or upload a file for analysis
3. Click 'Analyze Text' to see the tokenized results
4. Visually understand how the model breaks down various languages with color-coded tokens

💡 Benefits of Multilingual Processing
Understanding multilingual text tokenization patterns helps you:

Optimize prompts that mix multiple languages
Compare token efficiency across languages (e.g., English vs. Korean vs. Chinese token usage)
Predict token usage for internationalization (i18n) applications
Optimize costs for multilingual AI services

🛠️ Technology Stack

Backend: Flask (Python)
Frontend: HTML, CSS, JavaScript (jQuery)
Tokenizers: 🤗 Transformers library
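
The core idea is easy to sketch with the transformers library; ANSI colors here stand in for the tool's HTML/CSS coloring, and the model id is just an example:

```python
# Tokenize multilingual text and print each token in its own (cycled) color.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
text = "Hello 안녕하세요 你好 こんにちは"

tokens = tokenizer.tokenize(text)
colors = [31, 32, 33, 34, 35, 36]  # ANSI foreground color codes
print(" ".join(f"\033[{colors[i % len(colors)]}m{tok}\033[0m"
               for i, tok in enumerate(tokens)))
print(f"{len(tokens)} tokens, {len(set(tokens))} unique")
```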
reacted to JLouisBiz's post with 🔥 17 days ago
Back to LLM integration.

ClickDefine.sh -- quickly define or explain anything within your whole desktop environment

You only need to run the model locally, maybe with **llama.cpp** or **ollama**:

- https://github.com/ggml-org/llama.cpp
- https://ollama.com/download

And you get a universal explaining tool that works anywhere on your X.Org desktop (on operating systems that are usually Fully Free Software, like Debian GNU/Linux).

ClickDefine - Interactive Text Processor Script for Iterative LLM Query Handling:
https://hyperscope.link/9/6/0/9/8/ClickDefine-Interactive-Text-Processor-Script-for-Iterative-LLM-Query-Handling-96098.html

Watch the demonstration here: https://www.youtube.com/watch?v=mQxCYAiReu0&t=2s
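
For the curious, a minimal Python sketch of the underlying pattern: send the selected text to a locally running model and print an explanation. This assumes ollama's default HTTP endpoint and an installed model; the real ClickDefine is a shell script built around X11 selection tools:

```python
# Query a local ollama server to define/explain a piece of selected text.
import json
import urllib.request

def define(text: str, model: str = "llama3") -> str:
    payload = {"model": model, "prompt": "Define or explain: " + text, "stream": False}
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # ollama's default endpoint
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(define("abliteration"))
```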
reacted to onekq's post with 🔥 19 days ago
This is the Bitter Lesson 2.0:
https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf

If this reads as too lofty to you, consider some low-hanging fruit. Experiences here are reward signals we send to LLMs, e.g. human scores in RLHF, verification in AlphaProof, or test results for code generation.

RFT (reinforced fine-tuning) will become mainstream and, IMO, will make LLMs behave more like agents.