It should be possible to break a line between 、 (U+3001) and a following ASCII token.
。 (U+3002) has the same problem.
For example, in Japanese writing, は、abc should be breakable into two lines like this:
は、
abc
This is because 、 and 。 are used in Japanese just like the comma and period in English.
In English, a line can be broken after a comma or period.
However, the current Unicode line breaking algorithm does not allow this for 、 and 。.
I think this is a problem in the Unicode line breaking algorithm standard itself.
See Unicode Standard Annex #14 Line Breaking Properties.
CL: Closing Punctuation (XB)
3001..3002 IDEOGRAPHIC COMMA..IDEOGRAPHIC FULL STOP
、 (U+3001) and 。 (U+3002) are specified as CL characters.
LB30: Do not break between letters, numbers, or ordinary symbols and opening or closing punctuation.
CL × (AL | NU)
This says that no break is allowed between a CL character and a following alphabetic or numeric character.
As a result, we cannot break at any position in は、abc.
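The effect of the quoted rule can be illustrated with a small Python sketch. This is only a toy model, not the UAX #14 algorithm or ICU's implementation: the functions lb_class, can_break and break_positions are hypothetical names, only four classes are modelled, every character that is not 、, 。, an ASCII letter or an ASCII digit is assumed to be ID, and all rules other than "× CL" and the quoted LB30 are reduced to "keep ASCII alphanumeric runs together, otherwise allow a break".

```python
# Toy model of a few line breaking classes and rules (NOT the full UAX #14).

def lb_class(ch):
    """Return a simplified line breaking class for one character."""
    if ch in "\u3001\u3002":            # 、 and 。 are CL per the quoted table
        return "CL"
    if ch.isascii() and ch.isalpha():
        return "AL"
    if ch.isascii() and ch.isdigit():
        return "NU"
    return "ID"                         # assumption: everything else is ideographic

def can_break(before, after):
    """Return True if this toy model allows a break between two characters."""
    a, b = lb_class(before), lb_class(after)
    if b == "CL":                       # no break before closing punctuation (× CL)
        return False
    if a == "CL" and b in ("AL", "NU"): # LB30 as quoted: CL × (AL | NU)
        return False
    if a in ("AL", "NU") and b in ("AL", "NU"):
        return False                    # simplified: keep alphanumeric runs together
    return True                         # simplified default: break allowed

def break_positions(text):
    """Return the indices i where a break is allowed before text[i]."""
    return [i for i in range(1, len(text)) if can_break(text[i - 1], text[i])]

print(break_positions("は、abc"))   # no break opportunity anywhere
print(break_positions("は、漢"))    # a break is allowed after 、 before an ideograph
```

Under these assumptions, は、abc has no break opportunity at all, while the same 、 followed by an ideograph can be broken after, which is the asymmetry described above.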
In my opinion, 、 and 。 should not be treated as CL, because the LB30 rule is not appropriate for them.
In conclusion, they should be assigned to a different class.
WebKit uses ICU for line breaking. ICU has a very strict implementation of the Unicode line breaking algorithm. Therefore this problem is reproducible in Safari.