 Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-04 19:57:20
And still not a word about good, native Unicode support, as far as I can see. The performance is great, but the lack of Unicode support drags the language as a whole down considerably. Personally I like PHP, which makes it doubly painful for me.
 Manuel Lemos - 2014-08-04 20:08:39 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë
Nothing has been said about Unicode, at least for PHP 7. Maybe somebody brave will tackle that problem again in PHP 8.
I remember Rasmus mentioning that they may have a go at it in the future using a simpler library than ICU, but that is all I can remember.
 Joeri Sebrechts - 2014-08-06 07:30:13 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë
To be fair, you don't need it. PHP basically has the same level of Unicode support as C and C++. The built-in strings are byte arrays, and you can use ICU (the intl extension), iconv or mbstring to treat them as Unicode in the places where it matters that one character != one byte.
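A minimal sketch of that byte-versus-character difference, assuming the mbstring extension is loaded. The sample word and variable names are only for illustration, and the UTF-8 bytes are written as escapes so the result does not depend on the source file's encoding:

```php
<?php
// Assumes ext/mbstring is available.
$word = "na\xC3\xAFve"; // "naïve": the "ï" takes two bytes in UTF-8

$byteCount = strlen($word);             // counts bytes
$charCount = mb_strlen($word, 'UTF-8'); // counts characters (code points)

// substr() cuts on byte offsets, so it can split a multi-byte character.
$byteSlice = substr($word, 0, 3);
// mb_substr() cuts on character offsets and keeps the "ï" intact.
$charSlice = mb_substr($word, 0, 3, 'UTF-8');

printf("%d bytes, %d characters\n", $byteCount, $charCount);
var_dump(mb_check_encoding($byteSlice, 'UTF-8')); // is the byte slice still valid UTF-8?
var_dump(mb_check_encoding($charSlice, 'UTF-8')); // is the character slice valid UTF-8?
```

The point is only that the plain string functions keep working on bytes, and you switch to the mb_* variants where character boundaries matter.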
Admittedly, it is annoying that sort() can't actually sort UTF-8 properly on Windows machines, but with the Collator class in intl you now have a cross-platform sorting solution, so the gaps have been filled.
So, yeah, it's a bit awkward to work with Unicode, and you need to know what you're doing, but nothing is missing that would stop you from handling Unicode absolutely perfectly. See this presentation I made, which explains how to work with strings in PHP: http://sebrechts.net/slides/strings/
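As a rough sketch of what that looks like, assuming the intl extension is loaded (the 'en_US' locale and the word list are arbitrary choices for the example):

```php
<?php
// Assumes ext/intl is available.
$words = ["zebra", "\xC3\xA9clair", "apple"]; // "\xC3\xA9clair" is "éclair" in UTF-8

// Byte-wise sort: the lead byte of "é" (0xC3) compares higher than "z".
$byBytes = $words;
sort($byBytes, SORT_STRING);

// Locale-aware sort through ICU: "é" is ordered next to "e".
$byLocale = $words;
$collator = new Collator('en_US');
$collator->sort($byLocale);

print_r($byBytes);  // apple, zebra, éclair
print_r($byLocale); // apple, éclair, zebra
```

Collator::sort() sorts the array in place, like sort() does, so it is close to a drop-in replacement wherever the ordering is shown to users.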
 Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-06 20:12:37 - In reply to message 3 from Joeri Sebrechts
Of course I use mbstring. However, I believe it's slower than native Unicode support in the language core would be. Am I wrong?
 Manuel Lemos - 2014-08-06 20:37:34 - In reply to message 4 from Andre Polykanine A.K.A. Menelion Elensúlë
I think any manipulation of text in a multi-byte encoding is slower than the same manipulation of text in a regular single-byte encoding.
If I am not mistaken, the original PHP 6 plans were to use UTF-16 to manipulate all text strings.
This means that single-byte text would be slower to manipulate than it is today. That could hurt PHP speed in general.
So I am afraid the transparent Unicode support that some developers desire comes at a price in speed, memory usage, or both.
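To give a rough idea of the memory side, here is a small sketch that compares the size of the same ASCII text in UTF-8 and in UTF-16. It assumes the mbstring extension is loaded and only measures encoded byte length; it says nothing about how PHP 6 would actually have stored strings internally:

```php
<?php
// Assumes ext/mbstring is available.
$ascii = "Hello, world"; // 12 ASCII characters

$utf8Bytes = strlen($ascii); // one byte per character
// UTF-16LE uses two bytes for each of these characters (no BOM is added).
$utf16Bytes = strlen(mb_convert_encoding($ascii, 'UTF-16LE', 'UTF-8'));

printf("UTF-8: %d bytes, UTF-16: %d bytes\n", $utf8Bytes, $utf16Bytes);
```

For text that is mostly ASCII, which covers a lot of markup and code, the UTF-16 form is twice as large.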