francis ephe

franc1s

AI & ML interests

None yet

Recent Activity

Organizations

None yet

franc1s's activity

replied to nyuuzyou's post 10 days ago

Are you people new to the internet? Do you post on instagram, twitter, reddit, tiktok? If so, your private and personal stories that you WILLINGLY SHARED TO THE PUBLIC have already been scraped, both legally and semi-legally, by the websites themselves and then by 3rd parties. Where do you think Meta, Google, Microsoft got the data to train their own AIs? From our data.

"Consent" doesn't exist on the internet in the form that you think it does. NO ONE is gonna come to you and ask you if they can take your public data. You already gave them consent by accepting the TOS or by creating an account. You are constantly being lied to by your social media buddies that you are owed some sort of special treatment. You are not.

replied to nyuuzyou's post 10 days ago

A fanfic writer talking about stealing? Try writing something truly original for once, then you can complain about your rights being violated all you want.

replied to nyuuzyou's post 10 days ago

can't wait until you're sued out of your ass. It's not just OUR works. It's copyrighted materials from litigious authors. I wish you the wrath of Anne Rice

Correct. It's copyrighted material that YOU stole. Cry me a river and sue this guy on behalf of the authors that YOU stole from if it bothers you so much.

reacted to JingzeShi's post with 🚀 14 days ago
@SmallDoge SmallTalks (SmallDoge/SmallTalks) is a synthetic dataset designed for supervised fine-tuning of language models. The dataset covers a variety of conversational content, including daily conversations, tool usage, Python programming, encyclopedia Q&A, exam problem-solving, logical reasoning, and more. Each task is provided in both English and Chinese versions.
reacted to nyuuzyou's post with 👀👍❤️ 14 days ago
📚 Archive of Our Own (AO3) Dataset - nyuuzyou/archiveofourown

Collection of approximately 12.6 million fanfiction works (from 63.2M processed IDs) featuring:
- Full text content from diverse fandoms across television, film, books, anime, and more
- Comprehensive metadata including warnings, relationships, characters, and tags
- Multilingual content with works in 40+ languages, though English predominates
- Rich classification data preserving author-created folksonomy and content categorization

P.S. This is the most expensive dataset I've created so far! And also, thank you all for the 100 followers on Hugging Face!
New activity in nyuuzyou/archiveofourown 14 days ago