Character Functions

UTF-8 aware character functions.

Summary

s_getc Read the first character from a UTF-8 string.
s_getx Read the first character from a UTF-8 string, advancing the pointer position.
s_getat Return the character from the specified index within the UTF-8 string.
s_setc Set a character in a UTF-8 string and return the byte size of the character.
s_setat Modify the character at the specified index within the UTF-8 string, handling adjustments for variable width data.
s_width Return the width (number of bytes) of the first UTF-8 character in the given string.
s_cwidth Return the width (number of bytes) of a UTF-8 character represented by the given 4-byte unsigned integer.
s_offset Return the offset in bytes from the start of the UTF-8 string to the character at the specified index.
s_insert Insert a character at the specified index within a UTF-8 string, sliding following data along to make room.
s_remove Remove the character at the specified index within the UTF-8 string, sliding the following data back to close the gap.
s_tolower Convert the given UTF-8 character to lower case.
s_toupper Convert the given UTF-8 character to upper case.
s_isspace Check whether the given UTF-8 character is a white-space character.
s_isdigit Check whether the given UTF-8 character is a digit.

Getters

uint32 s_getc(const char *s, s_erc *error)

Read the first character from a UTF-8 string.

Parameters:
  • s

    String to read character from.

  • error

    Error code.

Return:

UTF-8 character represented as a 4-byte unsigned integer.
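
As a rough illustration of what reading the first character involves, the sketch below decodes one UTF-8 sequence into a single integer. It is not the library's implementation: demo_getc is a hypothetical name, uint32_t from <stdint.h> stands in for the library's uint32, errors are signalled with a sentinel return value rather than an s_erc argument, and the result is assumed to be the Unicode code point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID 0xFFFFFFFFu   /* hypothetical error sentinel */

/* Hypothetical helper: decode the first UTF-8 sequence in s into a
 * Unicode code point. Returns DEMO_INVALID on a malformed sequence. */
uint32_t demo_getc(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t c;
    int extra, i;

    assert(s != NULL);

    if (p[0] < 0x80) {                    /* 0xxxxxxx: one byte (ASCII) */
        return p[0];
    } else if ((p[0] & 0xE0) == 0xC0) {   /* 110xxxxx: two bytes */
        c = p[0] & 0x1F;
        extra = 1;
    } else if ((p[0] & 0xF0) == 0xE0) {   /* 1110xxxx: three bytes */
        c = p[0] & 0x0F;
        extra = 2;
    } else if ((p[0] & 0xF8) == 0xF0) {   /* 11110xxx: four bytes */
        c = p[0] & 0x07;
        extra = 3;
    } else {
        return DEMO_INVALID;              /* stray continuation or bad lead byte */
    }

    /* Each continuation byte (10xxxxxx) contributes six payload bits.
     * A NUL terminator fails the check, so a truncated sequence is
     * rejected without reading past the end of the string. */
    for (i = 1; i <= extra; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return DEMO_INVALID;
        c = (c << 6) | (uint32_t)(p[i] & 0x3F);
    }
    return c;
}
```

Under these assumptions, the three-byte sequence E2 82 AC decodes to 0x20AC.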

uint32 s_getx(char **s, s_erc *error)

Read the first character from a UTF-8 string, advancing the pointer position.

Parameters:
  • s

    String to read character from.

  • error

    Error code.

Return:

UTF-8 character represented as a 4-byte unsigned integer.

Note:

The function advances the pointer that s refers to, so the original position is lost unless the caller keeps a copy.
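
The read-and-advance pattern, and the reason for keeping a copy of the start pointer, can be sketched as follows. The names demo_getx and demo_count are hypothetical, the code assumes well-formed UTF-8 and decodes leniently, and it takes a const char ** with no error argument, which differs from the signature documented above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode the character at *s and advance *s past it.
 * Assumes well-formed UTF-8; malformed bytes are consumed one at a time. */
uint32_t demo_getx(const char **s)
{
    const unsigned char *p;
    uint32_t c;
    int extra = 0;

    assert(s != NULL && *s != NULL);
    p = (const unsigned char *)*s;
    c = p[0];

    if (c >= 0xF0)      { c &= 0x07; extra = 3; }   /* four-byte lead  */
    else if (c >= 0xE0) { c &= 0x0F; extra = 2; }   /* three-byte lead */
    else if (c >= 0xC0) { c &= 0x1F; extra = 1; }   /* two-byte lead   */

    p++;
    /* Fold in continuation bytes; stops early at NUL or a non-continuation. */
    while (extra-- > 0 && (*p & 0xC0) == 0x80)
        c = (c << 6) | (uint32_t)(*p++ & 0x3F);

    *s = (const char *)p;
    return c;
}

/* Count the characters in s. The walk uses a copy of the pointer, so the
 * caller's s still refers to the start of the string afterwards. */
size_t demo_count(const char *s)
{
    const char *cursor = s;
    size_t n = 0;

    while (*cursor != '\0') {
        demo_getx(&cursor);
        n++;
    }
    return n;
}
```

Passing the address of the only pointer to a heap-allocated string would leave nothing to free afterwards, which is why demo_count iterates over cursor rather than s.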

uint32 s_getat(const char *s, uint index, s_erc *error)

Return the character from the specified index within the UTF-8 string.

String indexing starts at 0.

Parameters:
  • s

    The string.

  • index

    The index of the character to return.

  • error

    Error code.

Return:

The UTF-8 character at index in s.

Setters

size_t s_setc(char *s, uint32 c, s_erc *error)

Set a character in a UTF-8 string and return the byte size of the character.

Parameters:
  • s

    The string.

  • c

    The UTF-8 character.

  • error

    Error code.

Return:

The byte size of the character.
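
For illustration, writing a character and reporting its byte size might look like the sketch below. It is not the library's code: demo_setc is a hypothetical name, the character is assumed to be a Unicode code point, the error argument is replaced by a return value of 0, and no NUL terminator is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode code point c as UTF-8 at s and return the
 * number of bytes written (1 to 4), or 0 if c is above U+10FFFF.
 * The caller must provide room for up to four bytes. */
size_t demo_setc(char *s, uint32_t c)
{
    unsigned char *p = (unsigned char *)s;

    assert(s != NULL);

    if (c < 0x80) {                       /* 7 bits: 0xxxxxxx */
        p[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {                      /* 11 bits: 110xxxxx 10xxxxxx */
        p[0] = (unsigned char)(0xC0 | (c >> 6));
        p[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {                    /* 16 bits: 1110xxxx + 2 continuation */
        p[0] = (unsigned char)(0xE0 | (c >> 12));
        p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c < 0x110000) {                   /* 21 bits: 11110xxx + 3 continuation */
        p[0] = (unsigned char)(0xF0 | (c >> 18));
        p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;                             /* outside the Unicode range */
}
```

The returned size lets a caller advance a write pointer, for example out += demo_setc(out, c), when building a string one character at a time.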

size_t s_setat(char *s, uint index, uint32 c, s_erc *error)

Modify the character at the specified index within the UTF-8 string, handling adjustments for variable width data.

Returns the number of bytes by which the rest of the string was moved. String indexing starts at 0.

Parameters:
  • s

    The string.

  • index

    The index of the character to modify.

  • c

    The new character.

  • error

    Error code.

Return:

The number of bytes by which the rest of the string was moved.
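
The variable-width adjustment can be sketched as follows: when the new character's encoding is longer or shorter than the old one, the tail of the string has to slide before the new bytes are written. All names here are hypothetical, the input is assumed to be well-formed UTF-8 with enough spare room in the buffer, and the shift is returned as a signed ptrdiff_t, which is a choice made for this sketch rather than the documented size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte width of the sequence that starts with lead byte b (valid UTF-8 assumed). */
size_t demo_lead_width(unsigned char b)
{
    if (b < 0x80) return 1;
    if (b < 0xE0) return 2;
    if (b < 0xF0) return 3;
    return 4;
}

/* Encode code point c at p; returns bytes written, or 0 if out of range. */
size_t demo_encode(unsigned char *p, uint32_t c)
{
    if (c < 0x80) { p[0] = (unsigned char)c; return 1; }
    if (c < 0x800) {
        p[0] = (unsigned char)(0xC0 | (c >> 6));
        p[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        p[0] = (unsigned char)(0xE0 | (c >> 12));
        p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c < 0x110000) {
        p[0] = (unsigned char)(0xF0 | (c >> 18));
        p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}

/* Replace the character at index with c. Returns how far the tail moved:
 * positive if it moved toward the end, negative if toward the start. */
ptrdiff_t demo_setat(char *s, size_t index, uint32_t c)
{
    unsigned char enc[4];
    size_t new_w = demo_encode(enc, c);
    size_t old_w;
    char *p = s;

    assert(s != NULL && new_w > 0);

    for (; index > 0 && *p != '\0'; index--)      /* walk to the target character */
        p += demo_lead_width((unsigned char)*p);
    assert(*p != '\0');                            /* index must be in range */

    old_w = demo_lead_width((unsigned char)*p);
    /* Slide the tail, including the NUL terminator, then write the new bytes. */
    memmove(p + new_w, p + old_w, strlen(p + old_w) + 1);
    memcpy(p, enc, new_w);
    return (ptrdiff_t)new_w - (ptrdiff_t)old_w;
}
```

Replacing a one-byte character with a two-byte one grows the string by one byte, so the caller's buffer needs that spare byte available.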

Width

size_t s_width(const char *s, s_erc *error)

Return the width (number of bytes) of the first UTF-8 character in the given string.

Parameters:
  • s

    The string.

  • error

    Error code.

Return:

The byte width of the first character in the string.
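
In UTF-8 the width of a character is determined by its lead byte alone, which the following sketch illustrates. demo_width is a hypothetical name; returning 0 for an invalid lead byte is a convention chosen for this example and is not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte width of the first character in s, judged from
 * the lead byte only. Returns 0 if the byte cannot start a sequence.
 * For an empty string the NUL terminator counts as a one-byte character. */
size_t demo_width(const char *s)
{
    unsigned char b;

    assert(s != NULL);
    b = (unsigned char)s[0];

    if (b < 0x80)           return 1;   /* 0xxxxxxx */
    if ((b & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;                           /* 10xxxxxx or 11111xxx */
}
```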

size_t s_cwidth(uint32 c, s_erc *error)

Return the width (number of bytes) of a UTF-8 character represented by the given 4-byte unsigned integer.

Parameters:
  • c

    The UTF-8 character.

  • error

    Error code.

Return:

The byte width of the character.
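
Assuming the 4-byte integer holds a Unicode code point, the width follows from which range the value falls in. The sketch below uses hypothetical names (demo_cwidth, demo_encoded_size) and shows one use for such a function: sizing a buffer before encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 needs for code point c,
 * or 0 if c lies outside the Unicode range. */
size_t demo_cwidth(uint32_t c)
{
    if (c < 0x80)     return 1;   /* up to 7 bits  */
    if (c < 0x800)    return 2;   /* up to 11 bits */
    if (c < 0x10000)  return 3;   /* up to 16 bits */
    if (c < 0x110000) return 4;   /* up to 21 bits */
    return 0;
}

/* Total bytes needed to encode n code points, excluding the NUL terminator. */
size_t demo_encoded_size(const uint32_t *cps, size_t n)
{
    size_t total = 0;
    size_t i;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += demo_cwidth(cps[i]);
    return total;
}
```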

Offset

size_t s_offset(const char *s, int index, s_erc *error)

Return the offset in bytes from the start of the UTF-8 string to the character at the specified index.

If the index is negative, it counts backward from the end of the string (-1 returns the offset of the last character).

Parameters:
  • s

    The string.

  • index

    The index of the character.

  • error

    Error code.

Return:

The byte offset of the character at index in s.
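
Because characters vary in width, a character index has to be converted to a byte offset by walking the string. The sketch below shows one way to do that, including negative indices. demo_offset is a hypothetical name, and the clamping of out-of-range indices (to 0 or to the offset of the terminator) is an assumption made for this example, not documented behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of the character at index in s.
 * Negative indices count back from the end (-1 is the last character).
 * Out-of-range indices are clamped (an assumption of this sketch). */
size_t demo_offset(const char *s, int index)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t off = 0;

    assert(s != NULL);

    if (index < 0) {
        int count = 0;
        size_t i;

        /* Count characters: every byte that is not a continuation byte
         * (10xxxxxx) starts a new character. */
        for (i = 0; p[i] != '\0'; i++)
            if ((p[i] & 0xC0) != 0x80)
                count++;

        index += count;            /* convert to an index from the start */
        if (index < 0)
            index = 0;
    }

    while (index > 0 && p[off] != '\0') {
        off++;                                  /* step over the lead byte */
        while ((p[off] & 0xC0) == 0x80)         /* and its continuation bytes */
            off++;
        index--;
    }
    return off;
}
```

For a string made of a one-byte, a two-byte, a three-byte and a one-byte character, the offsets of indices 0 to 3 are 0, 1, 3 and 6, and index -1 maps to the same offset as index 3.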

Insert/Remove

size_t s_insert(char *s, uint index, uint32 c, s_erc *error)

Insert a character at the specified index within a UTF-8 string, sliding following data along to make room.

Returns how far the data was moved in bytes.

Parameters:
  • s

    The string. String indexing starts at 0.

  • index

    The index at which to insert the new character.

  • c

    The new character.

  • error

    Error code.

Return:

The number of bytes by which the string was moved.

Note:

The given string s must be large enough to hold the inserted character; no bounds checking is done.
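
Since the buffer size is the caller's responsibility, one defensive pattern is to check the capacity before sliding any data. The sketch below is a stand-alone illustration with hypothetical names; in particular, the cap parameter and the return value of 0 on failure are additions for this example and are not part of the signature documented above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode code point c at p; returns bytes written, or 0 if out of range. */
size_t demo_encode(unsigned char *p, uint32_t c)
{
    if (c < 0x80) { p[0] = (unsigned char)c; return 1; }
    if (c < 0x800) {
        p[0] = (unsigned char)(0xC0 | (c >> 6));
        p[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        p[0] = (unsigned char)(0xE0 | (c >> 12));
        p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    if (c < 0x110000) {
        p[0] = (unsigned char)(0xF0 | (c >> 18));
        p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}

/* Insert c before the character at index in s, whose buffer holds cap bytes.
 * Returns how far the tail moved (the width of c), or 0 if c is invalid or
 * the result would not fit. An index past the end appends. */
size_t demo_insert(char *s, size_t cap, size_t index, uint32_t c)
{
    unsigned char enc[4];
    size_t w, len, off = 0;

    assert(s != NULL);
    w = demo_encode(enc, c);
    len = strlen(s);

    if (w == 0 || len + w + 1 > cap)          /* +1 for the NUL terminator */
        return 0;

    for (; index > 0 && s[off] != '\0'; index--) {
        off++;                                          /* lead byte */
        while (((unsigned char)s[off] & 0xC0) == 0x80)  /* continuation bytes */
            off++;
    }

    memmove(s + off + w, s + off, len - off + 1);   /* slide tail and NUL */
    memcpy(s + off, enc, w);
    return w;
}
```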

size_t s_remove(char *s, uint index, s_erc *error)

Remove the character at the specified index within the UTF-8 string, sliding the following data back to close the gap.

Returns how far the data was moved in bytes.

Parameters:
  • s

    The string. String indexing starts at 0.

  • index

    The index of the character to remove.

  • error

    Error code.

Return:

The number of bytes by which the string was moved.
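
Removal is the mirror image of insertion: locate the character, measure its width, and slide the tail back over it. The following is a minimal sketch with a hypothetical name (demo_remove), assuming well-formed UTF-8 and returning 0 when the index is past the end, which is a convention chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove the character at index from s.
 * Returns the number of bytes the tail moved back (the width of the removed
 * character), or 0 if index is past the end of the string. */
size_t demo_remove(char *s, size_t index)
{
    size_t off = 0;
    size_t w = 1;

    assert(s != NULL);

    /* Walk to the byte offset of the target character. */
    for (; index > 0 && s[off] != '\0'; index--) {
        off++;                                          /* lead byte */
        while (((unsigned char)s[off] & 0xC0) == 0x80)  /* continuation bytes */
            off++;
    }
    if (s[off] == '\0')
        return 0;

    /* Width of the target: its lead byte plus any continuation bytes. */
    while (((unsigned char)s[off + w] & 0xC0) == 0x80)
        w++;

    /* Slide the tail, including the NUL terminator, over the removed bytes. */
    memmove(s + off, s + off + w, strlen(s + off + w) + 1);
    return w;
}
```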

Case conversion

uint32 s_tolower(uint32 c, s_erc *error)

Convert the given UTF-8 character to lower case.

Parameters:
  • c

    The character to convert.

  • error

    Error code.

Return:

The lower case of c.

uint32 s_toupper(uint32 c, s_erc *error)

Convert the given UTF-8 character to upper case.

Parameters:
  • c

    The character to convert.

  • error

    Error code.

Return:

The upper case of c.

Character type

int s_isspace(uint32 c, s_erc *error)

Check whether the given UTF-8 character is a white-space character.

White-space is space (' '), form-feed ('\f'), newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), and vertical tab ('\v').

Parameters:
  • c

    The UTF-8 character to test.

  • error

    Error code.

Return:

Non-zero if c is a white-space character, otherwise 0.
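
A test over exactly the six white-space characters listed above can be sketched as below. The names demo_isspace and demo_skip_space are hypothetical and the error argument is omitted. Because all six characters are single-byte ASCII, and ASCII byte values never occur inside a multi-byte UTF-8 sequence, leading white-space can be skipped one byte at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: non-zero if c is one of the six listed
 * white-space characters, otherwise 0. */
int demo_isspace(uint32_t c)
{
    switch (c) {
    case ' ':  case '\f': case '\n':
    case '\r': case '\t': case '\v':
        return 1;
    default:
        return 0;
    }
}

/* Return a pointer to the first non-white-space character in s. */
const char *demo_skip_space(const char *s)
{
    assert(s != NULL);
    while (demo_isspace((unsigned char)*s))
        s++;
    return s;
}
```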

int s_isdigit(uint32 c, s_erc *error)

Check whether the given UTF-8 character is a digit.

Parameters:
  • c

    The UTF-8 character to test.

  • error

    Error code.

Return:

Non-zero if c is a digit, otherwise 0.