utf-8 strings

Subject:	utf-8 strings
Summary:	doesn't work correctly
Messages:	4
Author:	Maier Karl
Date:	2016-04-17 12:39:07

1. utf-8 strings

Maier Karl - 2016-04-17 12:39:08

strtolower doesn't work with UTF-8 strings.
For example, German umlauts such as "Ö" ....

2. Re: utf-8 strings

Lionel F. Lebeau - 2016-04-17 15:00:19 - In reply to message 1 from Maier Karl

You are right, Karl.
I modified the two filter functions to take it into account.
I just had to replace strtolower with mb_strtolower.
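
For readers following along, here is a minimal sketch of the difference between the two functions. It assumes the mbstring extension is loaded and the file is saved as UTF-8; the sample word is only an illustration and does not come from the class.

```php
<?php
// Illustrative sample word (an assumption, not taken from the class).
$word = 'KÖLN';

// strtolower() works byte by byte, so the two bytes of the UTF-8 "Ö"
// are not treated as one letter. Before PHP 8.2 the result for such
// bytes also depended on the current locale.
$byteWise = strtolower($word);

// mb_strtolower() decodes the string with the given encoding first,
// so the umlaut is lowercased together with the ASCII letters.
$multibyte = mb_strtolower($word, 'UTF-8');

echo $byteWise, "\n";
echo $multibyte, "\n"; // köln
```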

3. Re: utf-8 strings

Maier Karl - 2016-04-17 17:22:05 - In reply to message 2 from Lionel F. Lebeau

I found out that mb_strtolower takes a lot of time, so the class becomes slow.

In my script I use the following.

if (mb_detect_encoding($ele) == 'UTF-8') {
    $ele = mb_strtolower($ele, 'UTF-8');
} else {
    $ele = strtolower($ele);
}

The best way, I think, is to write an internal function like the one above and then call it in array_map.
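
A rough sketch of that idea could look like this. The function name lower_word() and the sample words are hypothetical, and the detection call uses strict mode with an explicit candidate list, which is an assumption on top of the snippet above rather than what the class does.

```php
<?php
// Hypothetical helper; the name lower_word() is not part of the class.
function lower_word(string $ele): string
{
    // Strict mode makes mb_detect_encoding() reject invalid byte sequences.
    if (mb_detect_encoding($ele, ['ASCII', 'UTF-8'], true) === 'UTF-8') {
        return mb_strtolower($ele, 'UTF-8');
    }
    // Pure ASCII (or undetected input) takes the cheaper byte-wise path.
    return strtolower($ele);
}

// Sample words, chosen only for illustration.
$words = ['Hello', 'ÖSTERREICH', 'Straße'];
$lowered = array_map('lower_word', $words);
print_r($lowered);
```

Whether the detection step really saves time compared with calling mb_strtolower() directly would have to be measured on real data.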

Best regards

4. Re: utf-8 strings

Lionel F. Lebeau - 2016-04-17 22:42:04 - In reply to message 3 from Maier Karl

What if the text is not encoded in UTF-8 but in another multibyte encoding?
For example, I often have to work with Japanese texts encoded in EUC-JP or Shift JIS.
So I think that using mb_strtolower(), although slower, is better than having to detect the encoding ^_^

But, of course, if you know you'll have only UTF-8 for multibyte text, you can replace the array_map() call ^_^
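
As a sketch of what that could look like when the encoding is known in advance: the helper name lower_words() and its $encoding parameter are hypothetical, and the Shift JIS sample is built by converting a UTF-8 literal only to keep the example self-contained.

```php
<?php
// Hypothetical helper: lowercases words whose encoding is already known,
// so no per-word detection is needed.
function lower_words(array $words, string $encoding = 'UTF-8'): array
{
    return array_map(
        function (string $word) use ($encoding): string {
            return mb_strtolower($word, $encoding);
        },
        $words
    );
}

// UTF-8 input (the default).
$utf8 = lower_words(['ÄPFEL', 'Übung']);
print_r($utf8);

// Shift JIS input: the encoding is passed explicitly instead of detected.
$sjisWord = mb_convert_encoding('ＡＢＣ日本語', 'SJIS', 'UTF-8');
$sjis = lower_words([$sjisWord], 'SJIS');

// Convert back to UTF-8 only for display.
echo mb_convert_encoding($sjis[0], 'UTF-8', 'SJIS'), "\n";
```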

I use a much more complicated class to index chapters published on my main website. Some have more than 10000 words. It is launched by a cron job because it can take a long time (there are always several chapters to index).
In fact, if it is very slow, you could split the big text into smaller chunks and launch two or more subroutines in parallel.
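
The splitting step might be sketched as below. The helper split_into_chunks() is hypothetical and assumes UTF-8 text and a chunk size of at least 1. How the chunks are then processed in parallel (separate cron jobs, separate PHP processes, and so on) is left open here.

```php
<?php
// Hypothetical helper: splits a text into chunks of at most
// $wordsPerChunk words each. Assumes valid UTF-8 input.
function split_into_chunks(string $text, int $wordsPerChunk): array
{
    // The /u modifier makes the pattern treat the subject as UTF-8.
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    $chunks = [];
    foreach (array_chunk($words, $wordsPerChunk) as $group) {
        $chunks[] = implode(' ', $group);
    }
    return $chunks;
}

$chunks = split_into_chunks('one two three four five', 2);
print_r($chunks); // three chunks: "one two", "three four", "five"
```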
