francis ephe

franc1s

AI & ML interests

None yet

Recent Activity

new activity 10 days ago

nyuuzyou/archiveofourown:I am very interested in this dataset

new activity 10 days ago

nyuuzyou/archiveofourown:Miku's Request for a Resiliant Torrent! Spreading the Love of AO3 Stories~ 🌸💿

replied to nyuuzyou's post 10 days ago

📚 Archive of Our Own (AO3) Dataset - https://huggingface.co/datasets/nyuuzyou/archiveofourown

Collection of approximately 12.6 million fanfiction works (from 63.2M processed IDs) featuring:
- Full text content from diverse fandoms across television, film, books, anime, and more
- Comprehensive metadata including warnings, relationships, characters, and tags
- Multilingual content with works in 40+ languages, though English is predominant
- Rich classification data preserving author-created folksonomy and content categorization

P.S. This is the most expensive dataset I've created so far! And also, thank you all for the 100 followers on Hugging Face!

Organizations

None yet

franc1s's activity

New activity in nyuuzyou/archiveofourown 10 days ago

I am very interested in this dataset

133

#3 opened 22 days ago by

Demanin

Miku's Request for a Resiliant Torrent! Spreading the Love of AO3 Stories~ 🌸💿

#156 opened 13 days ago by

quasar-of-mikus

replied to nyuuzyou's post 10 days ago

Are you people new to the internet? Do you post on Instagram, Twitter, Reddit, TikTok? If so, your private and personal stories that you WILLINGLY SHARED WITH THE PUBLIC have already been scraped, both legally and semi-legally, by the websites themselves and then by third parties. Where do you think Meta, Google, and Microsoft got the data to train their own AIs? From our data.

"Consent" doesn't exist on the internet in the form that you think it does. NO ONE is gonna come to you and ask you if they can take your public data. You already gave them consent by accepting the TOS or by creating an account. You are constantly being lied to by your social media buddies that you are owed some sort of special treatment. You are not.

replied to nyuuzyou's post 10 days ago

A fanfic writer talking about stealing? Try writing something truly original for once, then you can complain about your rights being violated all you want.

replied to nyuuzyou's post 10 days ago

Can’t wait until you’re sued out of your ass. It’s not just OUR works. It’s copyrighted materials from litigious authors. I wish you the wrath of Anne Rice.

Correct. It's copyrighted material that YOU stole. Cry me a river and sue this guy on behalf of the authors that YOU stole from if it bothers you so much.

New activity in nyuuzyou/archiveofourown 14 days ago

🚩 Report: Copyright infringement

#107 opened 14 days ago by

peepziebird

People going bananas

#57 opened 14 days ago by

GhostGate

🚩 Report: Copyright infringement

#53 opened 14 days ago by

staardustkisses

reacted to JingzeShi's post with 🚀 14 days ago

Post

2565

@SmallDoge SmallTalks (SmallDoge/SmallTalks) is a synthetic dataset designed for supervised fine-tuning of language models. The dataset covers a variety of conversational content, including daily conversations, tool usage, Python programming, encyclopedia Q&A, exam problem-solving, logical reasoning, and more. Each task is provided in both English and Chinese versions.

reacted to nyuuzyou's post with 👀👍❤️ 14 days ago

Post

2473

📚 Archive of Our Own (AO3) Dataset - nyuuzyou/archiveofourown

Collection of approximately 12.6 million fanfiction works (from 63.2M processed IDs) featuring:
- Full text content from diverse fandoms across television, film, books, anime, and more
- Comprehensive metadata including warnings, relationships, characters, and tags
- Multilingual content with works in 40+ languages, though English is predominant
- Rich classification data preserving author-created folksonomy and content categorization

P.S. This is the most expensive dataset I've created so far! And also, thank you all for the 100 followers on Hugging Face!

44 replies
