utf8_uri_encode() WordPress Function

The utf8_uri_encode() function percent-encodes the non-ASCII bytes of a UTF-8 string so that the string can be used in a URI. ASCII characters are left as they are unless $encode_ascii_characters is true, and the output can be capped at a maximum length.

utf8_uri_encode( string $utf8_string, int $length = 0, bool $encode_ascii_characters = false )

Encodes the Unicode values to be used in the URI.


Parameters

$utf8_string

(string) (Required) String to encode.

$length

(int) (Optional) Maximum length of the encoded string. A value of 0 means no limit.

Default value: 0

$encode_ascii_characters

(bool) (Optional) Whether to encode ASCII characters such as < " '

Default value: false
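
When $encode_ascii_characters is true, the ASCII branch of the source passes each character through PHP's rawurlencode(). As a rough illustration of what that does to individual characters, the sketch below calls rawurlencode() directly rather than utf8_uri_encode() itself; the sample characters are chosen purely for illustration.

```php
<?php
// Illustration only: rawurlencode() is the PHP built-in that the ASCII
// branch applies per character when $encode_ascii_characters is true.
$samples = array( '<', '"', "'", ' ', 'a', '-' );

foreach ( $samples as $char ) {
	// Unreserved characters (letters, digits, - _ . ~) pass through unchanged.
	printf( "%s => %s\n", $char, rawurlencode( $char ) );
}
```

rawurlencode() emits uppercase hex digits (for example %3C), while the multibyte branch builds its escapes with dechex(), which emits lowercase, so a single result can contain both cases.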



Return

(string) String with Unicode encoded for URI.
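
A minimal usage sketch follows. The wrapper demo_encode_non_ascii() is a hypothetical helper written for this example: it calls utf8_uri_encode() when WordPress is loaded and otherwise falls back to a simplified stand-in that only percent-encodes bytes of 0x80 and above (no length limit, no ASCII encoding), so the snippet can also run outside WordPress.

```php
<?php
/**
 * Hypothetical helper for this example only. Uses WordPress's
 * utf8_uri_encode() when available; otherwise a simplified stand-in
 * that percent-encodes every byte >= 0x80 and leaves ASCII untouched.
 */
function demo_encode_non_ascii( string $text ): string {
	if ( function_exists( 'utf8_uri_encode' ) ) {
		return utf8_uri_encode( $text );
	}

	// No /u flag: the pattern matches single bytes, not code points.
	return (string) preg_replace_callback(
		'/[\x80-\xFF]/',
		static function ( array $match ): string {
			return '%' . dechex( ord( $match[0] ) );
		},
		$text
	);
}

// "café-menu" written as explicit UTF-8 bytes (é is 0xC3 0xA9).
$slug = demo_encode_non_ascii( "caf\xC3\xA9-menu" );
echo $slug, PHP_EOL; // caf%c3%a9-menu
```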



Source

File: wp-includes/formatting.php

function utf8_uri_encode( $utf8_string, $length = 0, $encode_ascii_characters = false ) {
	$unicode        = '';
	$values         = array();
	$num_octets     = 1;
	$unicode_length = 0;

	mbstring_binary_safe_encoding();
	$string_length = strlen( $utf8_string );
	reset_mbstring_encoding();

	for ( $i = 0; $i < $string_length; $i++ ) {

		$value = ord( $utf8_string[ $i ] );

		if ( $value < 128 ) {
			$char                = chr( $value );
			$encoded_char        = $encode_ascii_characters ? rawurlencode( $char ) : $char;
			$encoded_char_length = strlen( $encoded_char );
			if ( $length && ( $unicode_length + $encoded_char_length ) > $length ) {
				break;
			}
			$unicode        .= $encoded_char;
			$unicode_length += $encoded_char_length;
		} else {
			if ( count( $values ) == 0 ) {
				if ( $value < 224 ) {
					$num_octets = 2;
				} elseif ( $value < 240 ) {
					$num_octets = 3;
				} else {
					$num_octets = 4;
				}
			}

			$values[] = $value;

			if ( $length && ( $unicode_length + ( $num_octets * 3 ) ) > $length ) {
				break;
			}
			if ( count( $values ) == $num_octets ) {
				for ( $j = 0; $j < $num_octets; $j++ ) {
					$unicode .= '%' . dechex( $values[ $j ] );
				}

				$unicode_length += $num_octets * 3;

				$values     = array();
				$num_octets = 1;
			}
		}
	}

	return $unicode;
}
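
The length accounting in the source above can be sketched separately. In the default case each ASCII byte adds 1 character to the output and each byte of a multibyte character adds 3 (one %xx escape). The helper sketch_encoded_length() below is hypothetical and not part of WordPress; it only reproduces that arithmetic, assuming $encode_ascii_characters is false.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): estimates the length of the
 * encoded output for the default case where ASCII is left unencoded.
 * Every byte below 128 costs 1 character; every other byte costs 3 ("%xx").
 */
function sketch_encoded_length( string $utf8_string ): int {
	$total = 0;
	$bytes = strlen( $utf8_string );
	for ( $i = 0; $i < $bytes; $i++ ) {
		$total += ord( $utf8_string[ $i ] ) < 128 ? 1 : 3;
	}
	return $total;
}

$ascii_only = sketch_encoded_length( 'menu' );
$with_cjk   = sketch_encoded_length( "a\xE6\x97\xA5" ); // "a" + U+65E5 (3 bytes)
echo $ascii_only, ' ', $with_cjk, PHP_EOL; // 4 10
```

Because the source compares the cost of a whole multibyte character ($num_octets * 3) against $length before emitting any of it, a non-zero $length cuts the result at a character boundary rather than in the middle of a %xx sequence.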



Changelog

Version    Description
5.8.3      Added the encode_ascii_characters parameter.
1.5.0      Introduced.

The content displayed on this page has been created in part by processing WordPress source code files which are made available under the GPLv2 (or a later version) license by the Free Software Foundation. In addition to this, the content includes user-written examples and information. All material is subject to review and curation by the WPPaste.com community.
