#Unicode

Dan 🌈's avatar
Dan 🌈

@[email protected]

Today I learned that there is a specific "record separator" symbol, formally known as "U+001E Information Separator Two".

codepoints.net/U+001E

It is meant to be used to indicate a separation between two units of information. An example of where this could be used is in a separated-value file, e.g. a CSV, but using this symbol instead of a comma.

This is interesting because there are vanishingly few instances where the record separator symbol would appear in most contexts, but many instances where a comma appears. Using this symbol instead of a comma (or a semi-colon, or an exclamation point, or any one of the usual separators) could make some data hygiene scenarios much more straightforward.
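
A minimal sketch of that idea in Python. The helper names `dump_records` and `load_records` are made up for illustration, and pairing U+001E with U+001F (the unit separator) to split fields inside a record is an assumption added here; the post itself only mentions U+001E.

```python
RS = "\x1e"  # U+001E, record separator (Information Separator Two)
US = "\x1f"  # U+001F, unit separator (Information Separator One)


def dump_records(rows: list[list[str]]) -> str:
    """Join fields with US and records with RS; no quoting or escaping."""
    for row in rows:
        for field in row:
            if RS in field or US in field:
                raise ValueError("field contains a separator character")
    return RS.join(US.join(row) for row in rows)


def load_records(blob: str) -> list[list[str]]:
    """Inverse of dump_records for non-empty input."""
    return [record.split(US) for record in blob.split(RS)]


# Commas, semicolons, quotes and exclamation points need no special handling.
rows = [["Köln, Germany", "1,5"], ["semi;colon", 'say "hi"!']]
print(load_records(dump_records(rows)) == rows)
```

The trade-off is that the separators are invisible in most editors, so a file like this is easier for programs to parse than for people to read.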

洪 民憙 (Hong Minhee)'s avatar
洪 民憙 (Hong Minhee)

@[email protected]

Hello, I'm an open source software engineer in my late 30s living in Seoul, and an avid advocate of free/open source software and the fediverse.

I'm the creator of @fedify, an ActivityPub server framework in TypeScript, @hollo, an ActivityPub-enabled microblogging software for single users, and @botkit, a simple ActivityPub bot framework.

I'm also very interested in East Asian languages (so-called CJK) and Unicode. Feel free to talk to me in English, Korean, or Japanese, or even in Literary Chinese!

Axel ⌨🐧🐪🚴😷 | R.I.P Natenom's avatar
Axel ⌨🐧🐪🚴😷 | R.I.P Natenom

@[email protected]

User-agent based banning of browsers is sooooo lame.

$ lynx -useragent=🖕 https://[…]

SnoopJ's avatar
SnoopJ

@[email protected]

After a long period of quiet, I have released an update to the `unicode-age` package

pypi.org/project/unicode-age/

The package now supports Unicode 16.0

Matthias Wiesmann's avatar
Matthias Wiesmann

@[email protected]

Treasure Hunt – Braille Hints

So I prepared a treasure hunt for my older daughter, which involved some form of coded message. I found a braille table I could 3D-print; using a real system instead of some made-up code gave me the opportunity to explain how and why it is used in reality: you find braille codes in lifts and on staircase handrails.

wiesmann.codiferes.net/wordpre
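
For a screen version of such hints, here is a rough Python sketch that maps letters to the Unicode Braille Patterns block (starting at U+2800). It assumes plain uncontracted English braille letters only, with no numbers, punctuation or contractions; the function names are hypothetical and not taken from the linked post.

```python
# Dot numbers (1-6) for the letters a-j; k-t and u, v, x, y, z are derived from them.
FIRST_DECADE = ["1", "12", "14", "145", "15", "124", "1245", "125", "24", "245"]


def letter_dots(letter: str) -> str:
    """Return the raised dot numbers for one lowercase ASCII letter."""
    if letter == "w":  # w does not follow the regular pattern
        return "2456"
    if "a" <= letter <= "j":
        return FIRST_DECADE[ord(letter) - ord("a")]
    if "k" <= letter <= "t":
        return FIRST_DECADE[ord(letter) - ord("k")] + "3"
    # u, v, x, y, z reuse the shapes of a-e with dots 3 and 6 added
    return FIRST_DECADE["uvxyz".index(letter)] + "36"


def to_braille(text: str) -> str:
    """Encode ASCII letters as braille patterns; other characters pass through."""
    out = []
    for ch in text.lower():
        if ch.isascii() and ch.isalpha():
            # In the Braille Patterns block, dot n sets bit n-1 of the offset.
            mask = sum(1 << (int(d) - 1) for d in letter_dots(ch))
            out.append(chr(0x2800 + mask))
        else:
            out.append(ch)
    return "".join(out)


print(to_braille("look under the stairs"))
```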

SnoopJ's avatar
SnoopJ

@[email protected]

TIL that the Unicode Consortium is working on guidance for detecting "URLs"¹ in text:

unicode.org/L2/L2024/24217r2-u

¹ scare quotes because URL is formally defined as ASCII-only, but "IRI" is a confusing term and everybody just wants to call the Unicode-aware equivalent a "URL"

Ausir's avatar
Ausir

@[email protected]

brand new combining diacritics dropping soon in Unicode 17, to be used for transcribing rare historical uses, and even more so for really tryhard conlangs!

Simon Tatham's avatar
Simon Tatham

@[email protected]

In the old days, you could change a letter between upper and lower case by XORing its character code with 0x20. Of course, if you tried this with anything that wasn't a letter, you'd get nonsense results.

If you try that with code points, it sometimes works, and sometimes doesn't. But Unicode can deliver much more impressive nonsense when it doesn't.

A fun example I just found: the "lower-case" version of CAR is NO PEDESTRIANS.

>>> chr(ord('🚗') ^ 0x20)
'🚷'
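
A small Python sketch of the same trick, set next to the built-in case methods. `toggle_bit5` is a hypothetical helper name; the point is only that the XOR happens to line up with case in ASCII and stops meaning anything elsewhere.

```python
def toggle_bit5(ch: str) -> str:
    """XOR the code point of a single character with 0x20."""
    return chr(ord(ch) ^ 0x20)


# ASCII letters: upper and lower case differ exactly by bit 0x20.
print(toggle_bit5("a"), toggle_bit5("Z"))

# ASCII non-letters: the "nonsense results" from the old days.
print(toggle_bit5("@"))

# Outside ASCII the relationship is not guaranteed: U+0101 (a with macron)
# upper-cases to U+0100, which is not what the XOR produces.
print(toggle_bit5("\u0101"), "\u0101".upper())

# The example from the post.
print(toggle_bit5("🚗"))
```

For real text, `str.upper()`, `str.lower()` and `str.swapcase()` consult the Unicode case mappings instead of flipping a bit.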

Paul McGuire's avatar
Paul McGuire

@[email protected]

Here are some emojidentifiers for your next Python code:

import math
乁_ツ_ㄏ = None
乁_益_ㄏ = math.nan

def minnums(values: list | 乁_ツ_ㄏ = 乁_ツ_ㄏ):
    if (
        values is 乁_ツ_ㄏ
        or not all(isinstance(n, (float, int))
                   for n in values)
    ):
        return 乁_益_ㄏ
    return min(values)

Frank Meerkötter's avatar
Frank Meerkötter

@[email protected]

Love this book/comic the kids picked up from the library.

Markus Redeker's avatar
Markus Redeker

@[email protected] · Reply to 0xDE's post

@11011110 At least these symbols have a meaning! But nobody knows what “Angzarr” (⍼) is and why it is in Unicode (en.wikipedia.org/wiki/Angzarr).

Revath S Kumar :javascript:'s avatar
Revath S Kumar :javascript:

@[email protected] · Reply to Revath S Kumar :javascript:'s post

Wrote a small web utility to visualize the different string normalization forms of a text.

string-normalize.surge.sh/?str

Not the best design 😄, but feedback is welcome.

desktop view of string normalize web page, showing NFC, NFD, NFKC and NFKD normalization forms of text "I ♥ Köln" is visible
mobile view of string normalize web page, showing NFC, NFD and NFKC normalization forms of text "I ♥ Köln" is visible
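
The same comparison can be sketched offline with Python's standard `unicodedata` module. This is not the code behind the web utility; `normalization_table` is a hypothetical helper that lists the code points each form produces for the sample text from the screenshots.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalization_table(text: str) -> dict[str, list[str]]:
    """Map each normalization form to the code points it produces."""
    return {
        form: [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]
        for form in FORMS
    }


# "I ♥ Köln" with a precomposed ö (U+00F6) as the starting point.
sample = "I \u2665 K\u00f6ln"
for form, code_points in normalization_table(sample).items():
    print(form, len(code_points), " ".join(code_points))
```

The decomposed forms split ö into `o` plus U+0308 COMBINING DIAERESIS, so they come out one code point longer than the composed ones.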
Michel Mariani's avatar
Michel Mariani

@[email protected]

New utility in Unicopedia Sinica:
- Pan-CJK Font Variants
(port from Unicopedia Plus, with Serif/明朝体 font style instead of Sans-Serif/ゴシック体)

🔗 codeberg.org/tonton-pixel/unic

Pan-CJK Font Variants utility screenshot
Michel Mariani's avatar
Michel Mariani

@[email protected]

New utility in Unicopedia Plus:
- Unihan Phonetics

🔗 codeberg.org/tonton-pixel/unic

Unihan Phonetics utility screenshot
SnoopJ's avatar
SnoopJ

@[email protected]

have you ever "naturally" (i.e. not discussion among experts) encountered a font that correctly renders ꙮ?

Poll results:
- yes: 0 (0%)
- no: 0 (0%)
- what the hell are you talking about: 0 (0%)
Revath S Kumar :javascript:'s avatar
Revath S Kumar :javascript:

@[email protected]

New blog post : "JavaScript : understanding string normalize"

blog.revathskumar.com/2025/01/

:rss: Qiita - 人気の記事's avatar
:rss: Qiita - 人気の記事

@[email protected]

Unicode: Blessings and Nuisances
qiita.com/chai0917/items/16fa5

:rss: Qiita - 人気の記事's avatar
:rss: Qiita - 人気の記事

@[email protected]

[Happy New Year] New Year's greetings from Oracle Active Data Guard instances deployed around the world
qiita.com/shirok/items/1da55c2

Paul McGuire's avatar
Paul McGuire

@[email protected] · Reply to Axel Rauschmayer's post

@rauschma Ah! I did something similar in Python - this is valid Python code:

def ℎ𝕖𝐥l𝙤():
    try:
        ℎ𝙚𝕝𝗹𝘰_ = "Hello"
        w𝔬𝓇ˡ𝚍﹎ = "World"
        𝖕𝘳𝒊𝖓𝑡(f"{𝗵𝒆𝘭𝓵𝚘﹍}, {𝑤º𝘳l𝑑︴}!")
    except T𝗒ₚ𝕖E𝗿𝗋𝗈𝓻 as ᵉ𝒙ⅽ:
        𝐩ᵣ𝚒𝖓𝓉("failed: {}".𝕗𝕠r𝑚𝖺𝘵(ⅇ𝔵𝚌))

if _︳n𝗮𝖒𝓮﹍︳ == "__main__":
    h𝙚ⅼ𝐥𝕠()

ptmcg.pythonanywhere.com/font_
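
Why this runs: Python converts identifiers to NFKC form while parsing, so the differently styled spellings all collapse to the same plain names. A short sketch using the standard `unicodedata` module and the function name from the snippet above:

```python
import unicodedata

fancy = "ℎ𝕖𝐥l𝙤"  # the function name as written in the post
print(unicodedata.normalize("NFKC", fancy))

# Defining a function under the fancy spelling registers the normalized name.
namespace = {}
exec("def ℎ𝕖𝐥l𝙤():\n    return 'called'", namespace)
print(sorted(name for name in namespace if not name.startswith("__")))
```

Only identifiers are normalized this way; string literals keep whatever characters they were written with.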

Scott Williams 🐧's avatar
Scott Williams 🐧

@[email protected]

"This coding interview is just going to be determining the human friendly length of a unicode utf-8 string."

Junior level dev: "Oh, this is going to be easy. How do they not know about len()?"

Senior level dev: "Oh, brilliant - a test of tolerance for pain by evaluating various code point chains with emoji, accents, and LTR/RTL markers. I'll start by writing some tests for 8-bit ord and char conversions with lookahead evals."

Python docs showing how the same one letter can count for one or two character lengths in unicode depending on the code point definition.
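
A rough Python sketch of why `len()` is not the whole answer. `approx_visible_length` is a hypothetical helper that only skips combining marks; it is an approximation and does not implement full grapheme cluster segmentation (UAX #29), so emoji ZWJ sequences, flags and similar cases are still miscounted.

```python
import unicodedata


def approx_visible_length(text: str) -> int:
    """Count code points that are not combining marks (rough approximation)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


composed = "\u00ea"     # ê as a single code point
decomposed = "e\u0302"  # e followed by COMBINING CIRCUMFLEX ACCENT

for label, s in (("composed", composed), ("decomposed", decomposed)):
    # code points, UTF-8 bytes, approximate user-perceived characters
    print(label, len(s), len(s.encode("utf-8")), approx_visible_length(s))

# Normalizing first makes the two spellings compare equal.
print(unicodedata.normalize("NFC", decomposed) == composed)
```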
SiljeLB's avatar
SiljeLB

@[email protected]

TIL that a proposal was made in 1997 to add to . I'm disappointed it hasn't been made official yet though. Here's a link to the proposal document: unicode.org/wg2/docs/n1641.pdf

omg! ubuntu's avatar
omg! ubuntu

@[email protected]

Ubuntu LTS users will shortly be able to see and use the 8 new emoji included in Unicode 16.0.

omgubuntu.co.uk/2024/12/ubuntu

Michel Mariani's avatar
Michel Mariani

@[email protected]

In the open-source application `Unicopedia Sinica`, both data files used for the `CJK Components` and the `CJK Related` utilities are now in a consistent JSON format with MIT license: `cjk-ids.json` and `cjk-related.json` respectively.

🔗 codeberg.org/tonton-pixel/unic

CJK Related utility screenshot
CJK Components utility screenshot
CJK Related utility screenshot
SnoopJ's avatar
SnoopJ

@[email protected]

HUH, UAX#31 offers official guidance on hashtag identifiers, and I have somehow managed to miss that completely for several years (introduced along with Unicode 11.0 in 2018).

unicode.org/reports/tr31/#hash

It's not like I re-read the whole document regularly or anything but yea huh

Aaron “#e14n pro” Madlon-Kay's avatar
Aaron “#e14n pro” Madlon-Kay

@[email protected] · Reply to Aaron “#e14n pro” Madlon-Kay's post

iOS 18.2 did not add any new coverage, at least at the code point level. Nevertheless, I have updated

tofu.quest/?q=%F0%9F%A5%A8

Eniko | Kitsune Tails out now!'s avatar
Eniko | Kitsune Tails out now!

@[email protected]

Btw here's a little unicode protip: unicode defines several character ranges as private use areas. You can map code points in these ranges to whatever glyph you want. This can be very handy for custom characters in your game that won't conflict with established unicode characters

In our games we use the PUA for keyboard and controller button glyphs
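
A hedged sketch of what such a mapping could look like in Python. The button names and the specific code points are invented for illustration and are not the ones used in the games mentioned; the three ranges are the private use areas defined by Unicode. A font that draws glyphs at those code points is still needed.

```python
import unicodedata

# Hypothetical assignments: names and code points are made up for this example.
BUTTON_GLYPHS = {
    "gamepad_a": "\ue000",
    "gamepad_b": "\ue001",
    "key_escape": "\ue002",
}

# BMP Private Use Area plus the two supplementary private use planes.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def is_private_use(ch: str) -> bool:
    """True if the character lies in one of the private use areas."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def render_prompt(template: str) -> str:
    """Replace {name} placeholders with the mapped private-use characters."""
    return template.format(**BUTTON_GLYPHS)


prompt = render_prompt("Press {gamepad_a} to jump, {key_escape} to pause")
print([f"U+{ord(ch):04X}" for ch in prompt if is_private_use(ch)])
print(unicodedata.category("\ue000"))  # general category of private-use characters
```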

Michael Zöllner's avatar
Michael Zöllner

@[email protected]

My study "Unicode Spaces" will be published in Slanted Magazine - Experimental Type 3!

Listing of Unicode white space characters
Steamboat Willie formed with whitespaces in text.
Flower formed with whitespaces in text.

Marcus Rohrmoser 🌻's avatar
Marcus Rohrmoser 🌻

@[email protected] · Reply to zirias (on snac)'s post

@zirias @stefano Hashtags are defined: unicode.org/reports/tr31/#D2

read 'em like this codeberg.org/seppo/seppo/src/c

Terence Eden's avatar
Terence Eden

@[email protected]

iOS 14 gets support for the Unicode Power Symbol!

shkspr.mobi/blog/2020/09/ios-1

Jim DeLaHunt

@[email protected] · Reply to Jim DeLaHunt's post

A cool change is that the Core Specification of the Unicode Standard is now released as a static HTML subsite, backed up by an archivable PDF of 1,140 pages.

unicode.org/versions/Unicode16

You can now link to specific sections and paragraphs, e.g.

"Unicode is about plain text, see: unicode.org/versions/Unicode16".

I helped out in a small way with the project to produce the core spec as HTML + PDF. I think it is a marvellous improvement.

Jim DeLaHunt

@[email protected]

Yay! Unicode version 16.0 is released!

Announcement: blog.unicode.org/2024/09/annou

liilliil 🇫🇯🇱🇨's avatar
liilliil 🇫🇯🇱🇨

@[email protected]

Folks, let's start pushing our own Slavic, Cyrillic !
"Three snowflakes" (⁂) is a potential pretext for endless ribbing

The Polish folks (@brie) found a better candidate: ꙮ, the "many-eyed seraphim" (серафим многꙮкий). A symbol found in 1928 in just one (!) manuscript, and added to Unicode for that reason alone (!), waited several centuries for its moment
ru.wikipedia.org/wiki/Мультиок

(English version im-in.space/@liilliil/11302839 )

AmyFou 🥥🌴's avatar
AmyFou 🥥🌴

@[email protected]

I am a (non-tenure track, uni) interested in every single thing about , esp ones, & Side gig in ( lol). I love and will ask you too many questions about your etc . Proud fan. Love 👋

洪 民憙 (Hong Minhee)'s avatar
洪 民憙 (Hong Minhee)

@[email protected] · Reply to 洪 民憙 (Hong Minhee)'s post

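Hello, I'm an open source software engineer in my late 30s living in Seoul, and an avid advocate of free and open source software and the fediverse. My name is 洪 民憙 (Hong Minhee).

I'm also the creator of @fedify, an ActivityPub server framework for TypeScript, and @hollo, a single-user fediverse microblog.

I'm also very interested in East Asian languages (so-called CJK) and Unicode. Feel free to talk to me in Japanese, English, or Korean. (Or even in Literary Chinese!)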

洪 民憙 (Hong Minhee)'s avatar
洪 民憙 (Hong Minhee)

@[email protected]

Hello, I'm an open source software engineer in my late 30s living in Seoul, and an avid advocate of free/open source software and the fediverse.

I'm the creator of @fedify, an ActivityPub server framework in TypeScript, and @hollo, a fediverse microblog for single users.

I'm also very interested in East Asian languages (so-called CJK) and Unicode. Feel free to talk to me in English, Korean, or Japanese, or even in Literary Chinese (#漢文)!

Chunshek :FlyingToaster:'s avatar
Chunshek :FlyingToaster:

@[email protected]

post for my own Mastodon instance!

• I’m a 43-year-old jack-of-all-trades.
• I grew up in , lived in the . My partner of 14 years and I moved to in 2020.
• We are “parents” to one remaining dog.
• I have worked in journalism, finance, L&D, and now EdTech.
• I speak 6 , and have dabbled in many others.
• Things I will nerd out about: , , .
• I am a person of faith, but not a fan of organized religions.
• I type in .

A man happily holding a ripe yellow pineapple in his left hand, while pointing at the pineapple with his right hand, smiling at the camera.
A man standing in front of a wall covered in dozens of containers of various types of instant ramen and udon noodles. The man's facial expression shows amusement.
A man kneels down next to two tilted mailboxes in Taipei, Taiwan, pretending to be carrying one of the mailboxes on his back.
A top-down shot of a man lying down, looking into the eyes of a shiba inu dog. The dog has curled up into a resting position.
洪 民憙 (Hong Minhee)'s avatar
洪 民憙 (Hong Minhee)

@[email protected] · Reply to 洪 民憙 (Hong Minhee)'s post

If you believe that Chinese characters in , , and should all be divided into language-specific codes, then it is logical that the Latin characters in English, French, Italian, and German should all be divided into language-specific codes as well. Caveat: I don't believe so.

洪 民憙 (Hong Minhee)

@[email protected]

Well, I vote for Han unification of , and I rather think that more Chinese characters should have been unified (e.g., 高 & 髙, 產 & 産, 內 & 内). 🤷

xChaos

@[email protected]

Tired of googling the Unicode characters for subscript and superscript? So am I :-)

The key chords for typing superscripts and subscripts (in the Unicode sense) on a Windows keyboard are easy to google. On Linux, though, it's a bit of a science:

1) first Right Alt + Right Shift + Backspace + 2 (yes, a four-key chord)
2) then the character that should become the subscript, such as a digit (which on the Czech layout you have switched to also needs Shift, so that's a two-key chord).

H₂O

For a superscript, use the same four-key chord but replace the 2 with a 3:

a² + b² = c²

Decent chords, right? The problem is that if you don't press the four-key chord exactly (?), the Backspace tends to act as a backspace and deletes one character... in short, so far it takes me several attempts every time :-)

I didn't understand the guide at all
abclinuxu.cz/blog/kenyho_stesk
... probably because I don't know which PC key is the "compose key", but in the readers' comments I noticed instructions for the Slovak keyboard, and they work for me on the Czech layout too, so I'm passing them on.
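
The subscript and superscript digits typed with the chords above are ordinary Unicode code points, so when the keyboard route is too fiddly the same substitution can be scripted. This is a minimal sketch, not the method from the post: the helper names `to_subscript` and `to_superscript` are hypothetical, and it only covers the digits 0-9.

```python
# Subscript digits occupy one contiguous range, U+2080..U+2089.
SUBSCRIPT_DIGITS = "".join(chr(0x2080 + i) for i in range(10))

# Superscript digits are not contiguous: 1, 2 and 3 sit in Latin-1
# (U+00B9, U+00B2, U+00B3); the rest are in U+2070..U+2079.
SUPERSCRIPT_DIGITS = (
    "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
)

ASCII_DIGITS = "0123456789"
_TO_SUB = str.maketrans(ASCII_DIGITS, SUBSCRIPT_DIGITS)
_TO_SUP = str.maketrans(ASCII_DIGITS, SUPERSCRIPT_DIGITS)


def to_subscript(text: str) -> str:
    """Replace ASCII digits with Unicode subscript digits."""
    return text.translate(_TO_SUB)


def to_superscript(text: str) -> str:
    """Replace ASCII digits with Unicode superscript digits."""
    return text.translate(_TO_SUP)


if __name__ == "__main__":
    print("H" + to_subscript("2") + "O")
    print("a" + to_superscript("2") + " + b" + to_superscript("2")
          + " = c" + to_superscript("2"))
```

Characters other than the ten ASCII digits pass through unchanged, so the helpers can be applied to just the part of a string that should be raised or lowered.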

SnoopJ

@[email protected]

the most important part of history is when a mouse fell out of a light fixture and got added to the count of members present at a Technical Committee meeting (9 Nov 2016)

unicode.org/L2/L2016/16325.htm

Screenshot of meeting notes for UTC Meeting 149. Text reads:

Mouse now present. 6.502 members represented.

[149-A94] Action Item for Landlord: Capture and exile the mouse that just fell out of the light fixture.
Nemo_bis 🌈

@[email protected]

Re-: recurring topics here.

.net

1/4

Michel Mariani

@[email protected] · Reply to Design Brouhaha's post

@MoritzBrouhaha

I've just picked up the first five issues of Unicode à Gogo! All of them are available at the shop of the Musée de l'Imprimerie et de la Communication graphique.

Excellent! 💮

The first five issues of Unicode à Gogo!
Michel Mariani

@[email protected]

Unicopedia Ægypta is a developer-oriented set of utilities related to Egyptian hieroglyphs, wrapped into one single app, built with .

Repository: 🔗 codeberg.org/tonton-pixel/unic

Unicopedia Ægypta Social Preview
Matthias Wiesmann

@[email protected]

Treasure Hunt – Braille Hints

So I prepared a treasure hunt for my older daughter, which involved some form of coded message. I found a braille table I could 3D-print. Using a real system instead of some made-up code gave me the opportunity to explain how and why it is used in reality: you find braille codes in lifts and on staircase handrails.

wiesmann.codiferes.net/wordpre

Michel Mariani

@[email protected]

Unicopedia Plus is a developer-oriented set of Unicode, Unihan, Unikemet & emoji utilities wrapped into one single app, built with .

Repository: 🔗 codeberg.org/tonton-pixel/unic

Unicopedia Plus Social Preview
Michel Mariani

@[email protected]

Unicopedia Sinica is a developer-oriented set of utilities related to ideographs, wrapped into one single app, built with .

Repository: 🔗 codeberg.org/tonton-pixel/unic

Unicopedia Sinica Social Preview
꧁ᐊ𰻞ᵕ̣̣̣̣̣̣́́♛ᵕ̣̣̣̣̣̣́́𰻞ᐅ꧂

@[email protected]

New 2d numeral system just dropped‽‽‽

It's based on ᚛ᚑᚌᚐᚋ᚜ & ☯ & bijective base 6, & works left→right or left←right
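
For readers who haven't met bijective base 6: it uses the six digits 1 to 6 and has no zero digit, so every positive integer has exactly one representation. The sketch below covers only that arithmetic; the post doesn't spell out how the 2D glyphs are laid out, so nothing here attempts to reproduce them, and the function names are hypothetical.

```python
def to_bijective_base6(n: int) -> list[int]:
    """Return the bijective base-6 digits of n, most significant first.

    Each digit is in 1..6. Zero is represented by the empty list,
    since bijective numeration has no zero digit.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while n > 0:
        # Shift by one so the remainder 0..5 maps onto digits 1..6.
        n, remainder = divmod(n - 1, 6)
        digits.append(remainder + 1)
    return digits[::-1]


def from_bijective_base6(digits: list[int]) -> int:
    """Inverse of to_bijective_base6."""
    value = 0
    for digit in digits:
        if not 1 <= digit <= 6:
            raise ValueError("digits must be in 1..6")
        value = value * 6 + digit
    return value


if __name__ == "__main__":
    for n in (1, 6, 7, 42, 43):
        print(n, to_bijective_base6(n))
```

Because the digits are position-weighted in the usual way, reversing the list gives the same number read in the opposite direction, which is presumably what lets the notation work either left-to-right or right-to-left.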

Aaron “#e14n pro” Madlon-Kay

@[email protected]

Newly covered code points in 17.0:

ᜍ᜕ᜟ

My tooling also indicated that these are covered, but they don't actually show up on my iPhone:

􀑝

Gerrit Imsieke

@[email protected]

Formatting people’s names correctly in a given context, for a given purpose, is hard. International linguists recently helped update the Common Locale Data Repository (CLDR). It will help programmers display person names correctly in many settings.
Mike McKenna wrote about it in “A Story Teller’s Case Study: Unlocking the Power of CLDR Person Name Formatting – A Solution for Formatting Names in a Globalized World” unicode.org/media/CLDR_Person_