seems_utf8() WordPress Function
The seems_utf8() function checks whether a string looks like valid UTF-8 by testing each lead byte and the continuation bytes that follow it against the UTF-8 bit patterns. It is useful as a quick check before passing a string to functions that expect UTF-8.
seems_utf8( string $str )
Checks to see if a string is utf8 encoded.
Description
NOTE: This function also accepts 5-byte and 6-byte sequences, although valid UTF-8 sequences are at most 4 bytes long.
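To make the note concrete, here is a minimal sketch of the lead-byte patterns the function matches. `example_expected_continuations()` is a hypothetical helper written for this illustration, not a WordPress API. The strict comparison uses PHP's `preg_match()` with the `u` modifier, which fails on byte strings that are not valid UTF-8.

```php
<?php
/**
 * Hypothetical helper for illustration; not part of WordPress.
 * Returns how many continuation bytes the lead-byte patterns used by
 * seems_utf8() expect after $c, or -1 if no pattern matches.
 */
function example_expected_continuations( int $c ): int {
    if ( $c < 0x80 ) {
        return 0; // 0bbbbbbb (ASCII)
    } elseif ( ( $c & 0xE0 ) === 0xC0 ) {
        return 1; // 110bbbbb
    } elseif ( ( $c & 0xF0 ) === 0xE0 ) {
        return 2; // 1110bbbb
    } elseif ( ( $c & 0xF8 ) === 0xF0 ) {
        return 3; // 11110bbb
    } elseif ( ( $c & 0xFC ) === 0xF8 ) {
        return 4; // 111110bb: 5-byte form, not valid UTF-8
    } elseif ( ( $c & 0xFE ) === 0xFC ) {
        return 5; // 1111110b: 6-byte form, not valid UTF-8
    }
    return -1; // Matches no model (for example a stray continuation byte).
}

// A 5-byte sequence: lead byte 0xF8 followed by four continuation bytes.
$five_byte = "\xF8\x88\x80\x80\x80";

// The permissive model expects four continuation bytes and finds them.
$expected = example_expected_continuations( ord( $five_byte[0] ) );

// A strict check rejects the same bytes.
$strict_ok = ( 1 === preg_match( '//u', $five_byte ) );

var_dump( $expected );  // int(4)
var_dump( $strict_ok ); // bool(false)
```

When strict validity matters, pair seems_utf8() with a stricter check rather than relying on it alone.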
Parameters
- $str
(string) (Required) The string to be checked.
Return
(bool) True if $str fits a UTF-8 model, false otherwise.
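A typical use is deciding whether a string needs converting before it reaches UTF-8-only code. The sketch below assumes the input is ISO-8859-1 whenever the check fails, which is an assumption made purely for illustration. `example_to_utf8()` is a hypothetical helper, and the PCRE fallback exists only so the snippet can also run outside WordPress. It needs the mbstring extension for `mb_convert_encoding()`.

```php
<?php
// Hypothetical helper for illustration; not part of WordPress.
function example_to_utf8( string $str, string $assumed_source = 'ISO-8859-1' ): string {
    // Use seems_utf8() when WordPress is loaded; otherwise fall back to a
    // strict PCRE check so this sketch can run on its own.
    $looks_utf8 = function_exists( 'seems_utf8' )
        ? seems_utf8( $str )
        : ( 1 === preg_match( '//u', $str ) );

    if ( $looks_utf8 ) {
        return $str; // Already fits the UTF-8 model; leave it untouched.
    }

    // Assumption: anything that fails the check is encoded as $assumed_source.
    return mb_convert_encoding( $str, 'UTF-8', $assumed_source );
}

$latin1 = "caf\xE9";                 // "café" encoded as ISO-8859-1
$utf8   = example_to_utf8( $latin1 );

echo bin2hex( $utf8 ), "\n";         // 636166c3a9
```

Strings that already pass the check are returned unchanged, so calling the helper twice does not double-encode.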
Source
File: wp-includes/formatting.php
    function seems_utf8( $str ) {
        mbstring_binary_safe_encoding();
        $length = strlen( $str );
        reset_mbstring_encoding();
        for ( $i = 0; $i < $length; $i++ ) {
            $c = ord( $str[ $i ] );
            if ( $c < 0x80 ) {
                $n = 0; // 0bbbbbbb
            } elseif ( ( $c & 0xE0 ) == 0xC0 ) {
                $n = 1; // 110bbbbb
            } elseif ( ( $c & 0xF0 ) == 0xE0 ) {
                $n = 2; // 1110bbbb
            } elseif ( ( $c & 0xF8 ) == 0xF0 ) {
                $n = 3; // 11110bbb
            } elseif ( ( $c & 0xFC ) == 0xF8 ) {
                $n = 4; // 111110bb
            } elseif ( ( $c & 0xFE ) == 0xFC ) {
                $n = 5; // 1111110b
            } else {
                return false; // Does not match any model.
            }
            for ( $j = 0; $j < $n; $j++ ) { // n bytes matching 10bbbbbb follow ?
                if ( ( ++$i == $length ) || ( ( ord( $str[ $i ] ) & 0xC0 ) != 0x80 ) ) {
                    return false;
                }
            }
        }
        return true;
    }
Changelog
Version | Description |
---|---|
1.2.1 | Introduced. |