- Subject: Re: UTF-8 testing
- From: Javier Guerra Giraldez <javier@...>
- Date: Fri, 7 Jan 2011 09:55:05 -0500
On Fri, Jan 7, 2011 at 9:30 AM, Tony Finch <[email protected]> wrote:
> That's incorrect. Codepoints in UTF-8 can be at most 4 octets long.
Unicode is defined as 32 bits at most (I think), but UTF-8 needs more
than 4 octets to encode 32 bits. UTF-8 is defined up to 6 octets (5
'trailing' bytes in this snippet).
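
To make the octet counts concrete, here is a rough sketch (not the snippet
under discussion) of how many octets a code point needs. It assumes the
original 1 to 6 octet layout from RFC 2279, where the lead byte holds 7, 5,
4, 3, 2 or 1 payload bits and each trailing byte holds 6, so the longest
form carries 31 bits. RFC 3629 later capped UTF-8 at 4 octets (U+10FFFF).
The name utf8_length and the max_octets parameter are made up for this
example.

```python
def utf8_length(codepoint, max_octets=6):
    """Return how many octets `codepoint` needs in the 1-6 octet UTF-8 layout.

    max_octets=6 follows the original RFC 2279 layout (assumed here);
    max_octets=4 mimics the RFC 3629 length cap. Surrogates and the exact
    U+10FFFF ceiling are not checked; this only counts octets.
    """
    if codepoint < 0:
        raise ValueError("code point must be non-negative")
    # Largest value that fits in 1, 2, ... 6 octets:
    # 7, 11, 16, 21, 26 and 31 payload bits respectively.
    limits = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
    for octets, limit in enumerate(limits[:max_octets], start=1):
        if codepoint <= limit:
            return octets
    raise ValueError("value does not fit in %d octets" % max_octets)
```

With the 6-octet layout, utf8_length(0x10FFFF) is 4 and
utf8_length(0x7FFFFFFF) is 6 (one lead byte plus 5 trailing bytes); with
max_octets=4, anything above 0x1FFFFF is rejected.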
--
Javier