Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern.

To use a backslash in the replacement, it must be doubled ("\\\\" as a PHP string).

An error I stumble over quite often is the backtracking limit (pcre.backtrack_limit): when parsing HTML documents you will encounter documents longer than 100,000 characters, and certain regular expressions then fail because they exceed that limit. An ungreedy regular expression (the "U" modifier) often does the job, but still: sometimes you just need a greedy regular expression working on long strings ...

    ?       Match 1 or 0 times
    {n}     Match exactly n times
    {n,}    Match at least n times
    {n,m}   Match at least n but not more than m times

More Special Character Stuff

    \t      tab (HT, TAB)
    \n      newline (LF, NL)
    \r      return (CR)
    \f      form feed (FF)
    \a      alarm (bell) (BEL)
    \e      escape (think troff) (ESC)
    \033    octal char (think of a PDP-11)
    \x1B    hex char
    \c[     control char
    \l      lowercase next char (think vi)
    \u      uppercase next char (think vi)
    \L      lowercase till \E (think vi)
    \U      uppercase till \E (think vi)
    \E      end case modification (think vi)
    \Q      quote (disable) pattern metacharacters till \E

Note that \l, \u, \L and \U come from Perl and are not supported by PCRE, so they do not work in preg_* patterns.

Even More Special Characters

    \w      Match a "word" character (alphanumeric plus "_")
    \W      Match a non-word character
    \s      Match a whitespace character
    \S      Match a non-whitespace character
    \d      Match a digit character
    \D      Match a non-digit character
    \b      Match a word boundary
    \B      Match a non-(word boundary)
    \A      Match only at beginning of string
    \Z      Match only at end of string, or before newline at the end
    \z      Match only at end of string
    \G      Match only where previous m//g left off (works only with /g)

If you want to match characters from any script (European, Russian, Chinese, Japanese, Korean or whatever), just:
- use mb_internal_encoding('UTF-8');
- use preg_replace('`...`u', '...', $string) with the u (Unicode) modifier.
For further information, the complete list of preg_* modifiers can be found in the pattern modifiers section of the PHP manual.

There seems to be some unexpected behavior when using the /m modifier when the line terminators are in win32 or mac format. If you have a string with such line terminators and try to replace dots, the regex won't replace correctly.

preg_replace (and other preg functions) return null instead of a string when encountering problems you probably did not think about!

This code must convert numeric HTML entities to UTF-8. It treats codes with a leading zero (such as &#039;) wrongly. The reason is that code2utf will be called with the leading zero, exactly what the pattern matches: code2utf(039), and PHP reads an integer literal with a leading zero as octal.

It may not be obvious to everybody that the function returns NULL if an error of any kind occurs.

Hello there, I would like to share a regex (PHP) snippet of code I wrote (2012) for myself. It is also being used in the Yerico scriptmerge plugin for Joomla, marked as simple code.
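The NULL return described in the notes above is easy to miss, because NULL prints as an empty string. A minimal sketch of one way to surface it, assuming PHP 7 or later: `checked_preg_replace` is a hypothetical helper name, not a PHP function, and the sample pattern is only an illustration of heavy backtracking.

```php
<?php
// Hypothetical helper (not part of PHP): throws instead of silently returning NULL.
function checked_preg_replace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() reports why the last PCRE call failed.
        $code = preg_last_error();
        $reason = $code === PREG_BACKTRACK_LIMIT_ERROR
            ? 'backtrack limit exhausted'
            : 'PCRE error code ' . $code;
        throw new RuntimeException('preg_replace failed: ' . $reason);
    }
    return $result;
}

// Nested quantifiers like these can backtrack heavily on input that never matches.
try {
    echo checked_preg_replace('/(?:\D+|<\d+>)*[!?]/', '', 'foobar foobar foobar'), "\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Whether the sample call actually hits the limit depends on the PCRE version and on the pcre.backtrack_limit and pcre.jit settings, so no output is claimed here. If a greedy pattern on a long string is unavoidable, raising the limit with ini_set('pcre.backtrack_limit', ...) is one option.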
The output goes in an unexpected direction in case your input contains two double quotes:

    echo preg_replace('[^A-Za-z0-9_]', '', 'D"usseldorfer H"auptstrasse')
    D"usseldorfer H"auptstrasse

It is important not to forget the leading and trailing forward slash in the regex. Without them, [ and ] are taken as the pattern delimiters, so the brackets no longer form a character class:

    echo preg_replace('/[^A-Za-z0-9_]/', '', 'D"usseldorfer H"auptstrasse')
    DusseldorferHauptstrasse

PS: An alternative is to use preg_replace('/\W/', '', $t), which keeps all alphanumeric characters including underscores.
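The delimiter pitfall can be checked side by side. This sketch only reuses the input string from the note above; the variable names are illustrative.

```php
<?php
$input = 'D"usseldorfer H"auptstrasse';

// '[' and ']' act as the delimiters here, so the real pattern is
// ^A-Za-z0-9_ : the literal text "A-Za-z0-9_" at the start of the string.
$wrong = preg_replace('[^A-Za-z0-9_]', '', $input);

// With slash delimiters the brackets form a negated character class.
$right = preg_replace('/[^A-Za-z0-9_]/', '', $input);

// \W is shorthand for the same negated class (without the /u modifier).
$short = preg_replace('/\W/', '', $input);

echo $wrong, "\n";  // D"usseldorfer H"auptstrasse (unchanged)
echo $right, "\n";  // DusseldorferHauptstrasse
echo $short, "\n";  // DusseldorferHauptstrasse
```

Note that the space is removed as well, since it is not in the allowed set.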
It just adds a space before any uppercase letter in the string.
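The pattern this note refers to is not shown, so the following is only a guess at what such a replacement might look like: a capturing group around an uppercase ASCII letter, re-emitted with a space in front. The sample string is made up for illustration.

```php
<?php
// Assumed pattern, not taken from the original note: capture each
// uppercase ASCII letter and put a space in front of it.
$spaced = preg_replace('/([A-Z])/', ' $1', 'HelloWorldFoo');

echo $spaced, "\n";  // " Hello World Foo" (note the leading space)
```

Because the first letter is uppercase too, the result starts with a space; ltrim() would remove it if that is unwanted.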
Be aware that when using the "/u" modifier, if your input text contains any bad UTF-8 code sequences, then preg_replace will return NULL (which prints as an empty string), regardless of whether there were any matches.
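A failure caused by bad UTF-8 can be told apart from a legitimately empty result by checking preg_last_error(). A small sketch, where `failed_on_bad_utf8` is a hypothetical helper name and the sample string is an assumption chosen to contain a stray Latin-1 byte:

```php
<?php
// Hypothetical helper: true when the preceding preg_* call failed on malformed UTF-8.
function failed_on_bad_utf8($result): bool
{
    return $result === null && preg_last_error() === PREG_BAD_UTF8_ERROR;
}

// "\xE9" is a Latin-1 e-acute; followed by a space it is not valid UTF-8.
$bad = "caf\xE9 au  lait";
$result = preg_replace('/\s+/u', ' ', $bad);

if (failed_on_bad_utf8($result)) {
    echo "input is not valid UTF-8, text left untouched\n";
} else {
    echo $result, "\n";
}
```

The check has to run right after the call, because the next preg_* call overwrites the stored error code.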
If you surround your backreference with single quotes, the double quotes in it come back corrupted (escaped with a backslash). This undoes it: $text = str_replace('\"', '"', $text); People using preg_replace with /e should at least be aware of this.
I'm not sure how it would be best fixed in preg_replace.
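One way to sidestep the quoting problem altogether is preg_replace_callback, which hands the matches to a function instead of splicing them into evaluated code. The /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so this is also the route that still works today. The sample text and pattern below are illustrative assumptions, not taken from the note.

```php
<?php
$text = 'He said <b>"hello"</b> twice.';

// The callback receives the match array directly; no code string is
// evaluated, so quotes in the captured text are not backslash-escaped.
$result = preg_replace_callback(
    '/<b>(.*?)<\/b>/',
    function (array $m): string {
        return strtoupper($m[1]);
    },
    $text
);

echo $result, "\n";  // He said "HELLO" twice.
```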