
Lionel F. Lebeau - 2016-04-17 22:42:04 -
In reply to message 3 from Maier Karl
What if the text is not encoded in UTF-8 but in another multibyte encoding?
For example, I often have to work with Japanese texts encoded in EUC-JP or Shift JIS.
So I think that using mb_strtolower(), although slower, is better than having to detect the encoding ^_^
But of course, if you know that your multibyte text will only ever be UTF-8, you can replace the array_map() call ^_^
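To make the idea concrete, here is a rough sketch (not the actual class code) of lowercasing a word list with mb_strtolower() and an explicitly named encoding, so nothing has to be detected. The function name lowercase_words() and its parameters are made up for the example, and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper, not part of the class discussed in this thread.
// Lowercases every word, telling mb_strtolower() which encoding to use
// instead of trying to detect it.
function lowercase_words(array $words, string $encoding = 'UTF-8'): array
{
    return array_map(
        function ($word) use ($encoding) {
            return mb_strtolower($word, $encoding);
        },
        $words
    );
}

// UTF-8 input (this source file is assumed to be saved as UTF-8).
$utf8Words = lowercase_words(['ÉCOLE', 'Tokyo'], 'UTF-8');

// EUC-JP input: pass the encoding name you already know the text uses.
$eucWord  = mb_convert_encoding('ＡＢＣ', 'EUC-JP', 'UTF-8');
$eucWords = lowercase_words([$eucWord], 'EUC-JP');
```

For Shift JIS text the same call should work with 'SJIS' as the encoding name.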
I use a much more complicated class to index the chapters published on my main website. Some of them have more than 10,000 words. It is launched by a cron job because it can take a long time (there are always several chapters to index).
In fact, if it is very slow, you could split the big text into smaller chunks and launch two or more subroutines in parallel.
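As a minimal sketch of the splitting step only (the parallel launching is left out), something like the following could cut a text into chunks of whole words. split_into_chunks() and its default chunk size are made up for the example. It assumes UTF-8 input with whitespace between words, so text without spaces, such as Japanese, would need a different way to find word boundaries.

```php
<?php
// Hypothetical helper: splits a text into chunks of at most
// $wordsPerChunk whitespace-separated words, never cutting a word in two.
function split_into_chunks(string $text, int $wordsPerChunk = 1000): array
{
    // The "u" modifier makes the pattern treat the input as UTF-8.
    $words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure (for example invalid UTF-8);
    // in that case fall back to a single chunk holding the whole text.
    if ($words === false) {
        return [$text];
    }

    $chunks = [];
    foreach (array_chunk($words, max(1, $wordsPerChunk)) as $group) {
        $chunks[] = implode(' ', $group);
    }
    return $chunks;
}

$chunks = split_into_chunks('one two three four five', 2);
```

Each chunk could then be handed to its own process, for instance one cron-launched script per chunk, and the partial results merged afterwards.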