metadata
title: Neochar
emoji: 🖼
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 5.25.0
app_file: app.py
pinned: false
license: openrail
short_description: Unwritten Chinese Characters in Style

What is this?

Generate new characters by combining components in creative ways, and write them in a style you control.

Why

  • It is fun to generate valid but unseen characters (found in no dictionary and not in Unicode).
  • Implements Lin Yutang's ideas with generative AI/ML, without the mechanical marvel :-/ or limitations :-)
  • Extends a font to support new charsets, and beyond to non-existent chars.
  • Adds variation/diversity/personality to generated images. No boring duplicates from the same char.
  • Other Creative Uses

How to use this app

  • Combine components or radicals as follows
  • Specify the 'Structure' and 'Components' in Polish (prefix) notation, which suits tree structures: the structure symbol comes first, followed by its components
    • ⿰: 'LR' Left-Right
    • ⿱: 'TB' Top-Bottom
    • ⿸: 'TL' Top-Left
    • ⿹: 'TR' Top-Right
    • ⿺: 'BL' Bottom-Left
    • ⿴: 'OI' Outer-Inner
    • ⿻: 'OV' Overlap
    • ⿲: 'LMR' Left-Middle-Right
    • ⿳: 'TMB' Top-Middle-Bottom
    • ⿵: 'BT' Bottom Open Enclosure
    • ⿶: 'CT' Top Open Enclosure
    • ⿷: 'RT' Right Open Enclosure
  • Select a 'Style' by clicking the sample images
  • Hit the 'Generate' button
  • Repeat
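
The prefix notation above can be read as a tree. The following is a minimal sketch of such a reader, not the app's actual parser: `parse_ids` is a hypothetical helper name, and the only outside fact assumed is that ⿲ and ⿳ take three components while the other structure symbols listed above take two.

```python
# Structure symbols from the list above, grouped by how many components they take.
BINARY = "⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻"
TERNARY = "⿲⿳"


def parse_ids(text):
    """Parse a prefix-notation string into a nested (symbol, children) tree.

    Hypothetical helper for illustration; leaves are single characters,
    including a wildcard such as '?'.
    """

    def parse(pos):
        if pos >= len(text):
            raise ValueError("input ended before every slot was filled")
        ch = text[pos]
        if ch in BINARY or ch in TERNARY:
            arity = 3 if ch in TERNARY else 2
            pos += 1
            children = []
            for _ in range(arity):
                child, pos = parse(pos)
                children.append(child)
            return (ch, children), pos
        # Anything else is a component (leaf).
        return ch, pos + 1

    tree, end = parse(0)
    if end != len(text):
        raise ValueError(f"unexpected trailing input: {text[end:]!r}")
    return tree


if __name__ == "__main__":
    print(parse_ids("⿰女子"))        # a simple left-right structure
    print(parse_ids("⿱艹⿰氵木"))    # top-bottom, with a nested left-right part
```

Because the structure symbol always comes first, no brackets are needed: `⿱艹⿰氵木` can only mean "艹 on top of (氵 left of 木)". A reader like this also catches input with too few or too many components before it reaches the model.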

Usage Tips

  • Simple structures work best (⿰ ⿱ ⿴ etc.)

  • "Known radicals at seen positions" work best (釒 on the left works better than on the right, though the latter may also surprise you in a good way)

  • Noto font family (sans and serif) gives the best results, as there are many training examples

  • Cursive and handwritten styles usually give good results, as they are more tolerant

  • Fonts supporting fewer chars are challenging

    • The current model was trained on 300k samples for only 20 epochs
    • Training will continue if this app gets attention or likes
  • For dictionary chars, decompose first.

  • For a part that is hard to describe, or that you don't care about, use a wildcard '?' (full-width question mark, or does it matter?)

  • What to do when the results are not as expected

    • Pick a different 'style' that the model may have learned better
    • Try again with a different random seed. This will change the overall structure in an unpredictable way
    • Try again with a different 'step' number. This will change the local details in a continuous way

Creative Uses

Turning a bug into a feature

When you see a funny result you didn't expect (say, 5 or 3 dots where there should be 4), don't throw it away immediately.

  • Save the results to confuse/train OCR
  • 3vade 3vil c3nsorship
  • Share in discussion. The input text/seed/step will reliably reproduce the result.
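
Since the input text, seed and step together reproduce a result, they can be shared as one small record. This is only a sketch under assumptions: `GenerationRecord`, `to_share_string` and `from_share_string` are hypothetical names, not part of the app, and the JSON layout is an arbitrary choice.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenerationRecord:
    """Hypothetical record of the three values needed to reproduce a result."""

    text: str  # structure and components in prefix notation
    seed: int  # random seed used for the generation
    step: int  # step number used for the generation


def to_share_string(record):
    # ensure_ascii=False keeps the characters readable when pasted into a post.
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)


def from_share_string(shared):
    return GenerationRecord(**json.loads(shared))


if __name__ == "__main__":
    record = GenerationRecord(text="⿰釒?", seed=42, step=20)
    shared = to_share_string(record)
    print(shared)
    assert from_share_string(shared) == record
```

Pasting the printed line into a discussion gives others everything listed above in one copyable piece.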

Future Features

  • Typewriter keyboard for hard-to-input radicals, filtered by pinyin prefix
  • Direct generation from a single char, auto decomposition