Most vision LMs focus on the image as a whole: their captions lack localized references, and they can't take visual prompts (points, boxes, or drawings around objects).
DAM addresses this on two levels: a new vision backbone that takes in both focal crops and the image itself, and a large-scale dataset.
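To make the focal idea concrete, here is a toy sketch of a backbone that encodes both inputs and lets the crop's tokens attend to global context. It is purely illustrative: the shared encoder, token shapes, and cross-attention fusion are assumptions, not DAM's actual design.

```python
# Toy "global + focal" backbone in PyTorch. Illustrative only: the shared
# encoder and cross-attention fusion are assumptions, not DAM's design.
import torch
import torch.nn as nn

class FocalBackbone(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        # Shared patch encoder applied to both the full image and the focal crop.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=16, stride=16),  # patchify
            nn.Flatten(2),                                 # (B, dim, num_patches)
        )
        # Focal-crop tokens attend to global-image tokens for context.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, full_image, focal_crop):
        global_tokens = self.encoder(full_image).transpose(1, 2)  # (B, N, dim)
        focal_tokens = self.encoder(focal_crop).transpose(1, 2)   # (B, M, dim)
        fused, _ = self.cross_attn(focal_tokens, global_tokens, global_tokens)
        return fused  # region tokens conditioned on whole-image context

tokens = FocalBackbone()(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
```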
They generate the dataset by extending existing segmentation and referring-expression datasets like RefCOCO: they pass the images and class labels to VLMs, which write the detailed captions.
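A hedged sketch of what that generation loop could look like (the sample format and `vlm.describe` call are hypothetical placeholders, not the authors' actual pipeline):

```python
# Hedged sketch of the caption-generation loop described above.
# `samples` and `vlm.describe` are hypothetical placeholders.
from PIL import Image

def build_region_captions(samples, vlm):
    records = []
    for sample in samples:
        # each sample: {"image_path": ..., "box": (x1, y1, x2, y2), "class_name": ...}
        image = Image.open(sample["image_path"])
        crop = image.crop(sample["box"])  # focal crop of the annotated region
        prompt = f"Describe the {sample['class_name']} in this region in detail."
        caption = vlm.describe(images=[image, crop], prompt=prompt)  # hypothetical API
        records.append({**sample, "caption": caption})
    return records
```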
Lastly, they also release a new benchmark, again built in a self-supervised fashion: an LLM evaluates the detailed captions, focusing on localization.
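An LLM-as-judge setup along those lines might look like this; the prompt wording and the `llm_complete` chat-completion client are assumptions for illustration, not the benchmark's exact protocol:

```python
# Illustrative LLM-as-judge scoring for region captions; prompt wording and
# the `llm_complete` client are assumptions, not the benchmark's protocol.
JUDGE_PROMPT = """You are grading a region-level caption.
Ground-truth region description: {reference}
Predicted caption: {prediction}
Score 1-10 for how accurately the caption describes this specific region,
penalizing details that belong elsewhere in the image. Reply with the number only."""

def judge_caption(reference: str, prediction: str, llm_complete) -> int:
    reply = llm_complete(JUDGE_PROMPT.format(reference=reference, prediction=prediction))
    return int(reply.strip())
```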
ClearerVoice-Studio New Feature: Speech Super-Resolution with MossFormer2! We're excited to announce that ClearerVoice-Studio now supports speech super-resolution, powered by our latest MossFormer2-based model! What's New?
Convert Low-Resolution to High-Resolution Audio: Transform low-resolution audio (effective sampling rate ≥ 16 kHz) into crystal-clear, high-resolution audio at 48 kHz.
Cutting-Edge Technology: Leverages the MossFormer2 model plus HiFi-GAN, optimised for generating high-quality audio with enhanced perceptual clarity.
Enhanced Listening Experience: Perfect for speech enhancement, content restoration, and high-fidelity audio applications.
Try It Out! Upgrade to the latest version of ClearerVoice-Studio (https://github.com/modelscope/ClearerVoice-Studio) to experience this powerful feature. Check out the updated documentation and examples in our repository.
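For reference, here is a minimal usage sketch following the pattern of the other tasks in the repo README; the exact task and model names below are our assumption, so check the repository docs for the precise identifiers:

```python
# Minimal usage sketch, assuming the same ClearVoice interface used by the
# repo's other tasks. The identifiers 'speech_super_resolution' and
# 'MossFormer2_SR_48K' are assumptions; verify them against the docs.
from clearvoice import ClearVoice

cv = ClearVoice(task='speech_super_resolution', model_names=['MossFormer2_SR_48K'])
output_wav = cv(input_path='samples/input_16k.wav', online_write=False)
cv.write(output_wav, output_path='samples/output_48k.wav')
```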
Let us know your thoughts, feedback, or feature requests in the Issues section.