function truncate_utf8 in Drupal 4

Same name and namespace in other branches
  1. 5 includes/unicode.inc \truncate_utf8()
  2. 6 includes/unicode.inc \truncate_utf8()
  3. 7 includes/unicode.inc \truncate_utf8()

Truncate a UTF-8-encoded string safely to a number of bytes.

If the end position falls in the middle of a UTF-8 byte sequence, the function scans backwards to the beginning of that sequence and cuts there, so the result never ends in a partial character.

Use this function whenever you need to cut a string at an arbitrary byte offset that may not fall on a character boundary. If you are sure that you are splitting on a character boundary (e.g. at an offset returned by strpos() or similar), you can safely use substr() instead.
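The backward scan described above can be sketched on its own. This is a minimal illustration, not the Drupal implementation: example_utf8_cut() is a hypothetical helper name, and it assumes the input is valid UTF-8. It relies on the fact that UTF-8 continuation bytes always lie in the range 0x80 to 0xBF.

```php
<?php
// Hypothetical helper (not part of Drupal): cut $string to at most $len bytes
// without splitting a UTF-8 sequence. Assumes valid UTF-8 input.
function example_utf8_cut($string, $len) {
  if (strlen($string) <= $len) {
    return $string;
  }
  // Continuation bytes have the bit pattern 10xxxxxx (0x80 to 0xBF).
  // Step back while the first byte that would be dropped is one of them.
  while ($len > 0 && (ord($string[$len]) & 0xC0) == 0x80) {
    $len--;
  }
  return substr($string, 0, $len);
}

// "cafés": the character é is encoded as the two bytes C3 A9.
$text = "caf\xC3\xA9s";

// Byte offset 4 falls between C3 and A9, so the cut moves back to offset 3.
var_dump(example_utf8_cut($text, 4) === "caf");          // bool(true)

// Byte offset 5 is already a character boundary, so nothing extra is removed.
var_dump(example_utf8_cut($text, 5) === "caf\xC3\xA9");  // bool(true)
```

A plain substr($text, 0, 4) would return "caf\xC3", which ends in a dangling lead byte and is no longer valid UTF-8.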

Parameters

$string: The string to truncate.

$len: An upper limit on the returned string length, in bytes. The ' ...' suffix added when $dots is TRUE is not counted against this limit.

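$wordsafe: Flag to truncate at the last space before the limit. If no space is found, the string is cut at $len instead. Defaults to FALSE.

$dots: Flag to append ' ...' to the truncated string. Defaults to FALSE.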

Return value

The truncated string.
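The word-safe step can be illustrated in isolation. The sketch below is not the Drupal code: example_wordsafe_limit() is a hypothetical helper that only computes the cut position, walking back from the byte limit to the preceding space and keeping the original limit when there is none. The ' ...' suffix is appended by hand to mirror what the $dots flag does in the listing under Code.

```php
<?php
// Hypothetical helper (not part of Drupal): return the byte offset at which a
// word-safe cut of $string, limited to $len bytes, would happen.
function example_wordsafe_limit($string, $len) {
  $end = $len;
  // Walk back from the limit until a space is found or offset 0 is reached.
  while ($len > 0 && $string[--$len] != ' ') {
  }
  // Offset 0 means no usable space was found: keep the original limit.
  return $len == 0 ? $end : $len;
}

$text = 'The quick brown fox';

// Offset 12 is inside "brown", so the cut moves back to the space at offset 9.
$cut = example_wordsafe_limit($text, 12);
var_dump(substr($text, 0, $cut) . ' ...' === 'The quick ...');  // bool(true)

// A single long word has no space to fall back to, so the limit is kept.
var_dump(example_wordsafe_limit('Supercalifragilistic', 5) === 5);  // bool(true)
```

With the suffix appended, the result can be up to four bytes longer than the requested limit, which matters when the value must fit a fixed-width database column.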

10 calls to truncate_utf8()
aggregator_parse_feed in modules/aggregator.module
comment_admin_overview in modules/comment.module
Menu callback; present an administrative comment listing.
mime_header_encode in includes/unicode.inc
Encodes MIME/HTTP header values that contain non-ASCII, UTF-8 encoded characters.
node_teaser in modules/node.module
Automatically generate a teaser for a node body in a given format.
search_excerpt in modules/search.module
Returns snippets from a piece of text, with certain keywords highlighted. Used for formatting search results.

... See full list

File

includes/unicode.inc, line 190

Code

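function truncate_utf8($string, $len, $wordsafe = FALSE, $dots = FALSE) {
  $slen = strlen($string);
  // Nothing to do if the string already fits within the byte limit.
  if ($slen <= $len) {
    return $string;
  }
  if ($wordsafe) {
    $end = $len;
    // Scan backwards from the limit to the nearest preceding space.
    while ($string[--$len] != ' ' && $len > 0) {
    }
    // No usable space was found: fall back to the original limit.
    if ($len == 0) {
      $len = $end;
    }
  }
  // The byte at the cut position starts a character (ASCII or a UTF-8 lead
  // byte), so cutting here does not split a sequence.
  if (ord($string[$len]) < 0x80 || ord($string[$len]) >= 0xc0) {
    return substr($string, 0, $len) . ($dots ? ' ...' : '');
  }
  // The cut position is a continuation byte (0x80 to 0xbf): back up to the
  // lead byte of the sequence and cut before it.
  while (--$len >= 0 && ord($string[$len]) >= 0x80 && ord($string[$len]) < 0xc0) {
  }
  return substr($string, 0, $len) . ($dots ? ' ...' : '');
}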