Standard String Functions
=========================
{{#include ../links.md}}
The following standard methods (mostly defined in the [`MoreStringPackage`][built-in packages] but
excluded when using a [raw `Engine`]) operate on [strings] (and possibly characters).
| Function | Parameter(s) | Description |
| -------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `len` method and property | _none_ | returns the number of characters (**not** number of bytes) in the string |
| `bytes` method and property | _none_ | returns the number of bytes making up the UTF-8 string; for strings containing only ASCII characters, this is much faster than `len` |
| `is_empty` method and property | _none_ | returns `true` if the string is empty |
| `to_blob`<br/>(not available under [`no_index`]) | _none_ | converts the string into a UTF-8 encoded byte stream and returns it as a [BLOB] |
| `to_chars`<br/>(not available under [`no_index`]) | _none_ | splits the string by individual characters, returning them as an [array] |
| `get` | position, counting from end if < 0 | gets the character at a certain position ([`()`] if the position is not valid) |
| `set` | 1) position, counting from end if < 0<br/>2) new character | sets a certain position to a new character (no effect if the position is not valid) |
| `pad` | 1) target length<br/>2) character/string to pad | pads the string with a character or a string to at least a specified length |
| `append`, `+=` operator | item to append | adds the display text of an item to the end of the string |
| `remove` | character/string to remove | removes a character or a string from the string |
| `pop` | _(optional)_ number of characters to remove, none if ≤ 0, entire string if ≥ length | removes the last character (if no parameter) and returns it ([`()`] if empty); otherwise, removes the last number of characters and returns them as a string |
| `clear` | _none_ | empties the string |
| `truncate` | target length | cuts off the string at exactly a specified number of characters |
| `to_upper` | _none_ | converts the string/character into upper-case as a new string/character and returns it |
| `to_lower` | _none_ | converts the string/character into lower-case as a new string/character and returns it |
| `make_upper` | _none_ | converts the string/character into upper-case |
| `make_lower` | _none_ | converts the string/character into lower-case |
| `trim` | _none_ | trims the string of whitespace at the beginning and end |
| `contains` | character/sub-string to search for | checks if a certain character or sub-string occurs in the string |
| `starts_with` | string | returns `true` if the string starts with a certain string |
| `ends_with` | string | returns `true` if the string ends with a certain string |
| `min` | 1) first character/string<br/>2) second character/string | returns the smaller of two characters/strings |
| `max` | 1) first character/string<br/>2) second character/string | returns the larger of two characters/strings |
| `index_of` | 1) character/sub-string to search for<br/>2) _(optional)_ start position, counting from end if < 0, end if ≥ length | returns the position at which a certain character or sub-string occurs in the string, or −1 if not found |
| `sub_string` | 1) start position, counting from end if < 0<br/>2) _(optional)_ number of characters to extract, none if ≤ 0, to end if omitted | extracts a sub-string |
| `sub_string` | [range] of characters to extract, from beginning if ≤ 0, to end if ≥ length | extracts a sub-string |
| `split`<br/>(not available under [`no_index`]) | _none_ | splits the string by whitespace, returning an [array] of string segments |
| `split`<br/>(not available under [`no_index`]) | position to split at (in number of characters), counting from end if < 0, end if ≥ length | splits the string into two segments at the specified character position, returning an [array] of two string segments |
| `split`<br/>(not available under [`no_index`]) | 1) delimiter character/string<br/>2) _(optional)_ maximum number of segments, 1 if < 1 | splits the string by the specified delimiter, returning an [array] of string segments |
| `split_rev`<br/>(not available under [`no_index`]) | 1) delimiter character/string<br/>2) _(optional)_ maximum number of segments, 1 if < 1 | splits the string by the specified delimiter in reverse order, returning an [array] of string segments |
| `crop` | 1) start position, counting from end if < 0<br/>2) _(optional)_ number of characters to retain, none if ≤ 0, to end if omitted | retains only a portion of the string |
| `crop` | [range] of characters to retain, from beginning if ≤ 0, to end if ≥ length | retains only a portion of the string |
| `replace` | 1) target character/sub-string<br/>2) replacement character/string | replaces a sub-string with another |
| `chars` method and property | 1) _(optional)_ start position, counting from end if < 0<br/>2) _(optional)_ number of characters to iterate, none if ≤ 0 | allows iteration of the characters inside the string |

Beware that functions that index into a [string] to get at individual [characters],
e.g. `sub_string`, must walk through the entire UTF-8 encoded byte stream, extracting
and counting individual Unicode characters, which can be slow for long [strings].

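
The difference between character positions and bytes can be sketched with a short script. This is only an illustration: the sample text and variable names are made up, and it uses just the `len`, `bytes`, `sub_string` and `chars` functions listed in the table above.

```rust
// 11 characters, two of which ('é' and 'ö') take 2 bytes each in UTF-8
let s = "héllo wörld";

s.len == 11;                // counts characters
s.bytes == 13;              // counts bytes

// Positions are measured in characters, so locating position 6
// means decoding the text that comes before it
s.sub_string(6) == "wörld";

// To visit every character, iterate once instead of indexing repeatedly
let count = 0;

for ch in s.chars() {
    count += 1;
}

count == 11;
```

For long [strings], a single pass with `chars` is likely to be cheaper than calling a position-based function such as `get` or `sub_string` once per character.
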
Building Strings
----------------
[Strings] can be built from segments via the `+` operator.
| Operator | Description |
| ------------------ | ------------------------------------------------------------------------- |
| [string] `+=` item | convert the item into a [string], then append it to the first [string] |
| [string] `+` item | convert the item into a [string], then concatenate them as a new [string] |
| item `+` [string] | convert the item into a [string], then concatenate them as a new [string] |
```rust
let x = 42;
// Build string with '+'
let s = "The answer is: " + x + "!!!";
// Prints: "The answer is: 42!!!"
print(s);
```
### Standard Operators Between Strings and/or Characters
The following standard operators inter-operate between [strings] and/or [characters][strings].
When one (or both) of the operands is a [character], it is first converted into a one-character
[string] before running the operator.
| Operator | Description |
| --------- | --------------------------------------------- |
| `+`, `+=` | [character]/[string] concatenation |
| `-`, `-=` | remove [character]/sub-[string] from [string] |
| `==` | equals to |
| `!=` | not equals to |
| `>` | greater than |
| `>=` | greater than or equals to |
| `<` | less than |
| `<=` | less than or equals to |
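
As a rough sketch of the conversion rule described above (the values are arbitrary and every comparison is expected to evaluate to `true`):

```rust
let s = "hello";

// A character operand behaves like a one-character string
let t = s + '!';
t == "hello!";

// Two characters concatenate into a string
let u = 'x' + 'y';
u == "xy";

// Remove a character from a string
t -= '!';
t == "hello";

// Comparisons also accept a mix of characters and strings
("apple" < "banana") == true;
('a' < "b") == true;
```
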
### Interop with BLOBs
For convenience, when a [BLOB] is appended to a [string], or vice versa, it is treated as a UTF-8
encoded byte stream and automatically converted into the appropriate [string] value first.
That is because it is rarely useful to append raw [BLOB] data to a [string], but extremely useful to be
able to directly manipulate UTF-8 encoded text.
| Operator | Description |
| --------- | --------------------------------------------------------------------------- |
| `+`, `+=` | append a [BLOB] (as a UTF-8 encoded byte stream) to the end of the [string] |
| `+` | concatenate a [BLOB] (as a UTF-8 encoded byte stream) with a [string] |
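
A minimal sketch of this behavior, assuming the [BLOB] is obtained via `to_blob` (so it is not available under [`no_index`]); the variable names are illustrative only:

```rust
// A BLOB holding the UTF-8 bytes of "world"
let data = "world".to_blob();

// The BLOB is decoded as UTF-8 text before being concatenated
let s = "hello, " + data;
s == "hello, world";

// The same applies when appending in place
let t = "hello, ";
t += data;
t == "hello, world";
```
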
Examples
--------
```rust
// Every comparison below evaluates to true
let full_name = " Bob C. Davis ";
full_name.len == 14;
full_name.trim();
full_name.len == 12;
full_name == "Bob C. Davis";
full_name.pad(15, '$');
full_name.len == 15;
full_name == "Bob C. Davis$$$";
let n = full_name.index_of('$');
n == 12;
full_name.index_of("$$", n + 1) == 13;
full_name.sub_string(n, 3) == "$$$";
full_name.sub_string(n..n+3) == "$$$";
full_name.truncate(6);
full_name.len == 6;
full_name == "Bob C.";
full_name.replace("Bob", "John");
full_name.len == 7;
full_name == "John C.";
full_name.contains('C') == true;
full_name.contains("John") == true;
full_name.crop(5);
full_name == "C.";
full_name.crop(0, 1);
full_name == "C";
full_name.clear();
full_name.len == 0;
```
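
The block above does not exercise `get`, `set`, `pop` or `split`. The following sketch covers them, with results inferred from the descriptions in the function table; the sample strings are arbitrary and the `split` calls are not available under [`no_index`].

```rust
let s = "hello";

s.get(0) == 'h';
s.get(-1) == 'o';           // negative positions count from the end
s.get(99) == ();            // invalid position

s.set(0, 'j');
s == "jello";

s.pop() == 'o';             // no parameter: returns a character
s == "jell";

s.pop(2) == "ll";           // with a count: returns a string
s == "je";

// Split by a delimiter
let parts = "a,b,c".split(",");
parts.len == 3;
parts[0] == "a";

// Split by whitespace
let words = "hello world".split();
words.len == 2;
words[1] == "world";

// Split into two segments at a character position
let halves = "hello".split(2);
halves[0] == "he";
halves[1] == "llo";
```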