JVM
utf8.h File Reference
#include <stdint.h>

Macros

#define PRINT_UTF8(x)   x->Utf8.length, x->Utf8.bytes
 Utility macro to reduce code and improve readability.
 
#define UTF8(x)   x->Utf8.bytes, x->Utf8.length
 Utility macro to reduce code and improve readability.
 

Functions

uint8_t nextUTF8Character (const uint8_t *utf8_bytes, int32_t utf8_len, uint32_t *outCharacter)
 Function to iterate over the bytes of a UTF-8 stream.
 
char cmp_UTF8_Ascii (const uint8_t *utf8_bytes, int32_t utf8_len, const uint8_t *ascii_bytes, int32_t ascii_len)
 Function to compare two strings, one in UTF-8 and the other in ASCII.
 
char cmp_UTF8 (const uint8_t *utf8A_bytes, int32_t utf8A_len, const uint8_t *utf8B_bytes, int32_t utf8B_len)
 Function to compare two strings, both in UTF-8.
 
char cmp_UTF8_FilePath (const uint8_t *utf8A_bytes, int32_t utf8A_len, const uint8_t *utf8B_bytes, int32_t utf8B_len)
 Function to compare two strings that contain file paths, both in UTF-8.
 
uint32_t UTF8_to_Ascii (uint8_t *out_buffer, int32_t buffer_len, const uint8_t *utf8_bytes, int32_t utf8_len)
 Function that translates a UTF-8 stream to ASCII.
 
uint32_t UTF8StringLength (const uint8_t *utf8_bytes, int32_t utf8_len)
 Returns the number of characters a UTF-8 string has.
 

Macro Definition Documentation

§ PRINT_UTF8

#define PRINT_UTF8(x)   x->Utf8.length, x->Utf8.bytes

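Utility macro to reduce code and improve readability. It expands to the length and then the bytes of x's Utf8 member, which is the argument order the "%.*s" printf format expects.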

Usage example:

printf("%.*s", PRINT_UTF8(my_utf8_var));

Instead of:

printf("%.*s", my_utf8_var->Utf8.length, my_utf8_var->Utf8.bytes);

§ UTF8

#define UTF8(x)   x->Utf8.bytes, x->Utf8.length

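Utility macro to reduce code and improve readability. It expands to the bytes and then the length of x's Utf8 member, which is the parameter order used by the comparison functions in this file.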

Usage example:

cmp_UTF8(UTF8(varA), UTF8(varB));

Instead of:

cmp_UTF8(varA->Utf8.bytes, varA->Utf8.length, varB->Utf8.bytes, varB->Utf8.length);

Function Documentation

§ cmp_UTF8()

char cmp_UTF8 ( const uint8_t *  utf8A_bytes,
int32_t  utf8A_len,
const uint8_t *  utf8B_bytes,
int32_t  utf8B_len 
)

Function to compare two strings, both in UTF-8.

Parameters
const uint8_t* utf8A_bytes - pointer to the bytes that make up UTF-8 string A
const uint8_t* utf8B_bytes - pointer to the bytes that make up UTF-8 string B
int32_t utf8A_len - length, in bytes, of string A
int32_t utf8B_len - length, in bytes, of string B
Returns
Returns 1 if the strings are equal (case sensitive), 0 otherwise.
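
The comparison described above can be sketched as follows. This is only an illustration, not the project's implementation: the name sketch_cmp_utf8 is hypothetical, and the sketch assumes that two equal strings always have identical encoded bytes (no normalization is attempted).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for cmp_UTF8(): returns 1 when both byte
 * sequences are identical, 0 otherwise. */
char sketch_cmp_utf8(const uint8_t *utf8A_bytes, int32_t utf8A_len,
                     const uint8_t *utf8B_bytes, int32_t utf8B_len)
{
    /* Different byte lengths can never encode the same string. */
    if (utf8A_len != utf8B_len)
        return 0;

    /* Two empty strings are equal; negative lengths are rejected. */
    if (utf8A_len <= 0)
        return utf8A_len == 0;

    return memcmp(utf8A_bytes, utf8B_bytes, (size_t)utf8A_len) == 0;
}
```

Because the lengths are passed explicitly, neither buffer needs a terminating null byte.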

§ cmp_UTF8_Ascii()

char cmp_UTF8_Ascii ( const uint8_t *  utf8_bytes,
int32_t  utf8_len,
const uint8_t *  ascii_bytes,
int32_t  ascii_len 
)

Function to compare two strings, one in UTF-8 and the other in ASCII.

Parameters
const uint8_t* utf8_bytes - bytes of the UTF-8 string
int32_t utf8_len - length of the UTF-8 string
const uint8_t* ascii_bytes - ASCII string to compare to
int32_t ascii_len - length of the ASCII string
Returns
Will return 1 if the strings are equal (case sensitive), 0 otherwise.
Note
It does not matter whether the strings are null-terminated, as long as the lengths are correct.
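
A minimal sketch of such a comparison, under stated assumptions: the name sketch_cmp_utf8_ascii is hypothetical, the input is assumed to be standard UTF-8 (where every ASCII character occupies exactly one byte below 0x80), and the real function may be implemented differently, for example by decoding character by character.

```c
#include <stdint.h>

/* Hypothetical stand-in for cmp_UTF8_Ascii(): returns 1 when the UTF-8
 * string consists only of ASCII bytes that match ascii_bytes exactly. */
char sketch_cmp_utf8_ascii(const uint8_t *utf8_bytes, int32_t utf8_len,
                           const uint8_t *ascii_bytes, int32_t ascii_len)
{
    /* An ASCII character is one byte in UTF-8, so equal strings must
     * have equal byte lengths. */
    if (utf8_len < 0 || utf8_len != ascii_len)
        return 0;

    for (int32_t i = 0; i < utf8_len; i++) {
        /* A byte >= 0x80 belongs to a multi-byte character, which can
         * never equal an ASCII character. */
        if (utf8_bytes[i] >= 0x80 || utf8_bytes[i] != ascii_bytes[i])
            return 0;
    }
    return 1;
}
```

Only the lengths bound the loop, so a buffer such as {'C', 'o', 'd', 'e'} with length 4 compares equal to "Code" even though it has no terminating null byte.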

§ cmp_UTF8_FilePath()

char cmp_UTF8_FilePath ( const uint8_t *  utf8A_bytes,
int32_t  utf8A_len,
const uint8_t *  utf8B_bytes,
int32_t  utf8B_len 
)

Function to compare two strings that contain file paths, both in UTF-8.

Unlike a plain comparison, this function treats slashes (/) and backslashes (\) as the same character, and treats consecutive slashes or backslashes as a single character, so that it can check whether two strings are paths to the same file or directory. utf8A_bytes and utf8B_bytes point to the bytes that make up UTF-8 strings A and B. utf8A_len and utf8B_len are the lengths, in bytes, of those strings, respectively.

Parameters
const uint8_t* utf8A_bytes - pointer to the bytes that make up UTF-8 string A
const uint8_t* utf8B_bytes - pointer to the bytes that make up UTF-8 string B
int32_t utf8A_len - length, in bytes, of string A
int32_t utf8B_len - length, in bytes, of string B
Returns
Returns 1 if the strings are equal (case sensitive), 0 otherwise.
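
The separator handling described above can be sketched as follows. The names sketch_cmp_utf8_filepath and next_path_unit are hypothetical, and this is not the project's implementation. The sketch works byte by byte, which is safe because the bytes for '/' and '\' never occur inside a multi-byte UTF-8 sequence. How the real function treats a trailing separator is not documented here; this sketch treats "a/b" and "a/b/" as different.

```c
#include <stdint.h>

/* Reads one comparison unit starting at *pos and advances *pos.
 * A run of '/' and '\\' bytes counts as a single '/'.
 * Returns -1 when the end of the string is reached. */
int next_path_unit(const uint8_t *s, int32_t len, int32_t *pos)
{
    if (*pos >= len)
        return -1;

    uint8_t c = s[(*pos)++];
    if (c != '/' && c != '\\')
        return c;

    /* Skip the rest of the separator run. */
    while (*pos < len && (s[*pos] == '/' || s[*pos] == '\\'))
        (*pos)++;
    return '/';
}

/* Hypothetical stand-in for cmp_UTF8_FilePath(). */
char sketch_cmp_utf8_filepath(const uint8_t *utf8A_bytes, int32_t utf8A_len,
                              const uint8_t *utf8B_bytes, int32_t utf8B_len)
{
    int32_t i = 0, j = 0;

    for (;;) {
        int ua = next_path_unit(utf8A_bytes, utf8A_len, &i);
        int ub = next_path_unit(utf8B_bytes, utf8B_len, &j);

        if (ua != ub)
            return 0;   /* mismatch, or one string ended first */
        if (ua == -1)
            return 1;   /* both strings ended together */
    }
}
```

With this sketch, the paths a//b\c and a/b/c compare as equal.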

§ nextUTF8Character()

uint8_t nextUTF8Character ( const uint8_t *  utf8_bytes,
int32_t  utf8_len,
uint32_t *  outCharacter 
)

Function to iterate over the bytes of a UTF-8 stream.

Parameters
const uint8_t* utf8_bytes - pointer to the bytes of the character being read
int32_t utf8_len - number of bytes available in the stream
uint32_t* outCharacter - pointer where the character being read is written, if it isn't NULL
Returns
uint8_t - the number of bytes read from the UTF-8 stream to represent that single character. If the return value is 0, then nothing was read. It could mean that the length is not sufficient, that the UTF-8 encoding is wrong or the stream has a four-byte character, which isn't supported by this program.
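
The return-value contract above can be illustrated with a small decoder. This is a sketch under assumptions, not the project's code: the name sketch_next_utf8 is hypothetical, utf8_len is taken to be a byte count, and overlong encodings are not rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nextUTF8Character(): decodes one character of
 * one to three bytes and returns the number of bytes consumed, or 0 on
 * insufficient length, bad encoding, or a four-byte character. */
uint8_t sketch_next_utf8(const uint8_t *utf8_bytes, int32_t utf8_len,
                         uint32_t *outCharacter)
{
    if (utf8_bytes == NULL || utf8_len <= 0)
        return 0;

    uint8_t b0 = utf8_bytes[0];

    if (b0 < 0x80) {                       /* 0xxxxxxx: one byte */
        if (outCharacter)
            *outCharacter = b0;
        return 1;
    }

    if ((b0 & 0xE0) == 0xC0) {             /* 110xxxxx 10xxxxxx */
        if (utf8_len < 2 || (utf8_bytes[1] & 0xC0) != 0x80)
            return 0;
        if (outCharacter)
            *outCharacter = ((uint32_t)(b0 & 0x1F) << 6)
                          | (uint32_t)(utf8_bytes[1] & 0x3F);
        return 2;
    }

    if ((b0 & 0xF0) == 0xE0) {             /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (utf8_len < 3 || (utf8_bytes[1] & 0xC0) != 0x80
                         || (utf8_bytes[2] & 0xC0) != 0x80)
            return 0;
        if (outCharacter)
            *outCharacter = ((uint32_t)(b0 & 0x0F) << 12)
                          | ((uint32_t)(utf8_bytes[1] & 0x3F) << 6)
                          | (uint32_t)(utf8_bytes[2] & 0x3F);
        return 3;
    }

    return 0;  /* four-byte lead byte or stray continuation byte */
}
```

A caller would typically loop, advancing the pointer and reducing the remaining length by the returned value, and stop as soon as the function returns 0.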

§ UTF8_to_Ascii()

uint32_t UTF8_to_Ascii ( uint8_t *  out_buffer,
int32_t  buffer_len,
const uint8_t *  utf8_bytes,
int32_t  utf8_len 
)

Function that translates a UTF-8 stream to ASCII.

Parameters
const uint8_t* utf8_bytes - UTF-8 stream to be translated
int32_t utf8_len - length of the bytes that make up the UTF-8 stream
uint8_t* out_buffer - pointer where the result will be stored
int32_t buffer_len - length, in bytes, of out_buffer
Note
At most "buffer_len" characters will be written to the buffer, terminating null character included.
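
The buffer-length rule in the note can be sketched as follows. Several details here are assumptions rather than documented behavior: the name sketch_utf8_to_ascii is hypothetical, non-ASCII characters are replaced with '?', the return value is taken to be the number of characters written (excluding the terminator), and continuation bytes are not validated.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes in the sequence that starts with lead byte b,
 * or 0 if the lead byte is not supported (assumption: 1 to 3 bytes). */
int32_t sketch_sequence_length(uint8_t b)
{
    if (b < 0x80)           return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    return 0;
}

/* Hypothetical stand-in for UTF8_to_Ascii(). Writes at most buffer_len
 * bytes to out_buffer, terminating null byte included. */
uint32_t sketch_utf8_to_ascii(uint8_t *out_buffer, int32_t buffer_len,
                              const uint8_t *utf8_bytes, int32_t utf8_len)
{
    if (out_buffer == NULL || buffer_len <= 0)
        return 0;

    int32_t written = 0;
    int32_t i = 0;

    /* Always leave one byte free for the terminator. */
    while (i < utf8_len && written < buffer_len - 1) {
        int32_t n = sketch_sequence_length(utf8_bytes[i]);
        if (n == 0 || i + n > utf8_len)
            break;  /* unsupported or truncated sequence */

        /* Assumption: non-ASCII characters become '?'. */
        out_buffer[written++] = (n == 1) ? utf8_bytes[i] : (uint8_t)'?';
        i += n;
    }

    out_buffer[written] = '\0';
    return (uint32_t)written;
}
```

With a 3-byte buffer, the input "héllo" would produce "h?" followed by the terminator in this sketch.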

§ UTF8StringLength()

uint32_t UTF8StringLength ( const uint8_t *  utf8_bytes,
int32_t  utf8_len 
)

Returns the number of characters a UTF-8 string has.

Parameters
const uint8_t* utf8_bytes - UTF-8 stream
int32_t utf8_len - length of the bytes that make up utf8_bytes
Returns
number of characters a UTF-8 string has
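
One common way to count characters in UTF-8 is to count every byte that is not a continuation byte (10xxxxxx). The sketch below uses that technique. It is not necessarily how UTF8StringLength() is implemented, the name sketch_utf8_string_length is hypothetical, and the input is assumed to be well-formed.

```c
#include <stdint.h>

/* Hypothetical stand-in for UTF8StringLength(): counts characters by
 * counting lead bytes, assuming well-formed UTF-8 input. */
uint32_t sketch_utf8_string_length(const uint8_t *utf8_bytes, int32_t utf8_len)
{
    uint32_t count = 0;

    for (int32_t i = 0; i < utf8_len; i++) {
        /* Continuation bytes have the bit pattern 10xxxxxx; every other
         * byte starts a new character. */
        if ((utf8_bytes[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

The character count can be smaller than the byte count: "héllo€" occupies 9 bytes but contains 6 characters.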