This repository has been archived on 2025-08-04. You can view files and clone it, but cannot push or open issues or pull requests.
rhaj/_archive/rhai_engine/rhaibook/language/string-fn.md
2025-04-04 08:28:07 +02:00

Standard String Functions
=========================
{{#include ../links.md}}
The following standard methods (mostly defined in the [`MoreStringPackage`][built-in packages] but
excluded when using a [raw `Engine`]) operate on [strings] (and possibly characters).
| Function | Parameter(s) | Description |
| -------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `len` method and property | _none_ | returns the number of characters (**not** number of bytes) in the string |
| `bytes` method and property | _none_ | returns the number of bytes making up the UTF-8 string; for strings containing only ASCII characters, this is much faster than `len` |
| `is_empty` method and property | _none_ | returns `true` if the string is empty |
| `to_blob`<br/>(not available under [`no_index`]) | _none_ | converts the string into a UTF-8 encoded byte stream and returns it as a [BLOB] |
| `to_chars`<br/>(not available under [`no_index`]) | _none_ | splits the string by individual characters, returning them as an [array] |
| `get` | position, counting from end if < 0 | gets the character at a certain position ([`()`] if the position is not valid) |
| `set` | <ol><li>position, counting from end if < 0</li><li>new character</li></ol> | sets a certain position to a new character (no effect if the position is not valid) |
| `pad` | <ol><li>target length</li><li>character/string to pad</li></ol> | pads the string with a character or a string to at least a specified length |
| `append`, `+=` operator | item to append | adds the display text of an item to the end of the string |
| `remove` | character/string to remove | removes a character or a string from the string |
| `pop` | _(optional)_ number of characters to remove, none if ≤ 0, entire string if ≥ length | removes the last character (if no parameter) and returns it ([`()`] if empty); otherwise, removes the last number of characters and returns them as a string |
| `clear` | _none_ | empties the string |
| `truncate` | target length | cuts off the string at exactly a specified number of characters |
| `to_upper` | _none_ | converts the string/character into upper-case as a new string/character and returns it |
| `to_lower` | _none_ | converts the string/character into lower-case as a new string/character and returns it |
| `make_upper` | _none_ | converts the string/character into upper-case |
| `make_lower` | _none_ | converts the string/character into lower-case |
| `trim` | _none_ | trims the string of whitespace at the beginning and end |
| `contains` | character/sub-string to search for | checks if a certain character or sub-string occurs in the string |
| `starts_with` | string | returns `true` if the string starts with a certain string |
| `ends_with` | string | returns `true` if the string ends with a certain string |
| `min` | <ol><li>first character/string</li><li>second character/string</li></ol> | returns the smaller of two characters/strings |
| `max` | <ol><li>first character/string</li><li>second character/string</li></ol> | returns the larger of two characters/strings |
| `index_of` | <ol><li>character/sub-string to search for</li><li>_(optional)_ start position, counting from end if < 0, end if ≥ length</li></ol> | returns the position at which a certain character or sub-string occurs in the string, or -1 if not found |
| `sub_string` | <ol><li>start position, counting from end if < 0</li><li>_(optional)_ number of characters to extract, none if ≤ 0, to end if omitted</li></ol> | extracts a sub-string |
| `sub_string` | [range] of characters to extract, from beginning if ≤ 0, to end if ≥ length | extracts a sub-string |
| `split`<br/>(not available under [`no_index`]) | _none_ | splits the string by whitespaces, returning an [array] of string segments |
| `split`<br/>(not available under [`no_index`]) | position to split at (in number of characters), counting from end if < 0, end if ≥ length | splits the string into two segments at the specified character position, returning an [array] of two string segments |
| `split`<br/>(not available under [`no_index`]) | <ol><li>delimiter character/string</li><li>_(optional)_ maximum number of segments, 1 if < 1</li></ol> | splits the string by the specified delimiter, returning an [array] of string segments |
| `split_rev`<br/>(not available under [`no_index`]) | <ol><li>delimiter character/string</li><li>_(optional)_ maximum number of segments, 1 if < 1</li></ol> | splits the string by the specified delimiter in reverse order, returning an [array] of string segments |
| `crop` | <ol><li>start position, counting from end if < 0</li><li>_(optional)_ number of characters to retain, none if ≤ 0, to end if omitted</li></ol> | retains only a portion of the string |
| `crop` | [range] of characters to retain, from beginning if ≤ 0, to end if ≥ length | retains only a portion of the string |
| `replace` | <ol><li>target character/sub-string</li><li>replacement character/string</li></ol> | replaces a sub-string with another |
| `chars` method and property | <ol><li>_(optional)_ start position, counting from end if < 0</li><li>_(optional)_ number of characters to iterate, none if ≤ 0</li></ol> | allows iteration of the characters inside the string |

Beware that functions that index into a [string] to get at individual [characters],
e.g. `sub_string`, must walk through the UTF-8 encoded byte stream, decoding and counting
individual Unicode characters, which can be slow for long [strings].
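
As a small illustration of the gap between characters and bytes, the sketch below uses a made-up three-character string whose characters each occupy three bytes in UTF-8. Positions passed to `sub_string` and `get` count characters, which is why the byte stream has to be decoded to locate them.

```rust
let s = "日本語";           // three characters, each taking 3 bytes in UTF-8

s.len == 3;                 // number of characters
s.bytes == 9;               // number of bytes

// Positions count characters, not bytes, so the string has to be
// decoded to locate a given character position.
s.sub_string(1, 1) == "本";
s.get(-1) == '語';          // negative positions count from the end
```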

Building Strings
----------------
[Strings] can be built from segments via the `+` operator.
| Operator | Description |
| ------------------ | ------------------------------------------------------------------------- |
| [string] `+=` item | convert the item into a [string], then append it to the first [string] |
| [string] `+` item | convert the item into a [string], then concatenate them as a new [string] |
| item `+` [string] | convert the item into a [string], then concatenate them as a new [string] |
```rust
let x = 42;
// Build string with '+'
let s = "The answer is: " + x + "!!!";
// Prints: "The answer is: 42!!!"
print(s);
```
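
The `+=` form and the item-first form listed in the table follow the same pattern. A minimal sketch, with illustrative variable names and expected values derived from the operator descriptions above:

```rust
let s = "Total: ";
s += 42;                    // appends the display text of the integer
s += '!';                   // characters can be appended as well
s == "Total: 42!";

let t = 42 + " items";      // item on the left-hand side, string on the right
t == "42 items";
```

Note that `+=` modifies the [string] in place, while both `+` forms produce a new [string].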
### Standard Operators Between Strings and/or Characters
The following standard operators inter-operate between [strings] and/or [characters][strings].
When one (or both) of the operands is a [character], it is first converted into a one-character
[string] before running the operator.
| Operator | Description |
| --------- | --------------------------------------------- |
| `+`, `+=` | [character]/[string] concatenation |
| `-`, `-=` | remove [character]/sub-[string] from [string] |
| `==` | equals to |
| `!=` | not equals to |
| `>` | greater than |
| `>=` | greater than or equals to |
| `<` | less than |
| `<=` | less than or equals to |
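
A short sketch of these operators with mixed [string] and [character] operands follows. The sample values are made up for illustration, and the expected results are derived from the descriptions in the table above.

```rust
let s = "hello" + '!';      // the character is treated as the string "!"
s == "hello!";

s -= '!';                   // remove the character again
s == "hello";

let t = "hello world" - "world";
t == "hello ";              // the sub-string is removed, the space remains

'a' < "b";                  // true: 'a' is compared as the string "a"
"apple" < "banana";         // true
```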
### Interop with BLOBs
For convenience, when a [BLOB] is appended to a [string], or vice versa, it is treated as a UTF-8
encoded byte stream and automatically converted into the appropriate [string] value first.
This is because appending raw bytes to a [string] is rarely useful, whereas directly manipulating
UTF-8 encoded text is extremely useful.
| Operator | Description |
| --------- | --------------------------------------------------------------------------- |
| `+`, `+=` | append a [BLOB] (as a UTF-8 encoded byte stream) to the end of the [string] |
| `+` | concatenate a [BLOB] (as a UTF-8 encoded byte stream) with a [string] |
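
The following sketch shows a [BLOB] being appended to a [string]. It assumes that `to_blob` is available (i.e. the engine is not built under [`no_index`]); the sample text is made up for illustration.

```rust
let bytes = "world".to_blob();   // UTF-8 bytes of "world" as a BLOB

let s = "hello, ";
s += bytes;                      // the BLOB is decoded as UTF-8 text and appended
s == "hello, world";

let t = "hello, " + bytes;       // same conversion, producing a new string
t == "hello, world";
```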

Examples
--------
```rust
let full_name = " Bob C. Davis ";
full_name.len == 14;
full_name.trim();
full_name.len == 12;
full_name == "Bob C. Davis";
full_name.pad(15, '$');
full_name.len == 15;
full_name == "Bob C. Davis$$$";
let n = full_name.index_of('$');
n == 12;
full_name.index_of("$$", n + 1) == 13;
full_name.sub_string(n, 3) == "$$$";
full_name.sub_string(n..n+3) == "$$$";
full_name.truncate(6);
full_name.len == 6;
full_name == "Bob C.";
full_name.replace("Bob", "John");
full_name.len == 7;
full_name == "John C.";
full_name.contains('C') == true;
full_name.contains("John") == true;
full_name.crop(5);
full_name == "C.";
full_name.crop(0, 1);
full_name == "C";
full_name.clear();
full_name.len == 0;
```
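
Some functions from the table are not exercised above. The sketch below covers a few of them, with expected values derived from the descriptions in the function table. The sample strings are made up for illustration, and `split` requires [array] support (i.e. not under [`no_index`]).

```rust
let s = "hello";

s.get(-1) == 'o';                // negative positions count from the end
s.set(0, 'j');
s == "jello";

s.to_upper() == "JELLO";         // returns a new string...
s == "jello";                    // ...leaving the original untouched
s.make_upper();                  // converts in place
s == "JELLO";

s += "!!";
s.pop(2) == "!!";                // removes and returns the last two characters
s.pop() == 'O';                  // removes and returns the last character
s == "JELL";

let parts = "a,b,c".split(",");  // ["a", "b", "c"]
parts.len() == 3;

let halves = "hello".split(2);   // ["he", "llo"]
halves[0] == "he";
halves[1] == "llo";

for ch in "abc".chars() {        // iterate over the characters one by one
    print(ch);
}
```

As in the examples above, the `==` lines are plain comparison expressions that evaluate to `true`; they document the expected state and are not assertions.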