12.8. 文字列（Strings）🔗

文字列は Unicode テキストを表現しています。文字列は Lean で特別にサポートされています：

文字のリストという観点から動作を指定する 論理モデル を持っており、文字列に対する各操作の意味を指定します。
コンパイルされたコードにおいて文字列を UTF-8 としてエンコードするバイトの packed array として最適化されたランタイム表現を持っており、Lean ランタイムは特別に文字列操作を最適化します。
文字列を書くための文字列リテラルの構文があります。

文字列が内部的に UTF-8 エンコードされたバイト配列として表現されていることは API で確認できます：

文字列から特定の文字を射影する操作はありません（仮にあった場合パフォーマンスの罠になりえます）。 Nat の代わりに String.Iterator をループ内で使ってください。
文字列は String.Pos によってインデックス付けされ、内部的には 文字数 ではなく バイト数 を記録するため、一定の時間がかかります。0 は別として、これらは直接構成されるのではなく、 String.next と String.prev をつかって更新されます。

12.8.1. 論理モデル（Logical Model）

structure

String : Type

String is the type of (UTF-8 encoded) strings.

The compiler overrides the data representation of this type to a byte sequence, and both String.utf8ByteSize and String.length are cached and O(1).

Constructor

String.mk

Pack a List Char into a String. This function is overridden by the compiler and is O(n) in the length of the list.

Fields

data : List Char

Unpack String into a List Char. This function is overridden by the compiler and is O(n) in the length of the list.

Lean における文字列の論理モデルは、文字のリストである1つのフィールドを含む構造体です。これは低レベルで文字列処理関数のプロパティを指定したり証明したりするときに便利です。

12.8.2. ランタイム表現（Run-Time Representation）🔗

文字列はバイトの動的配列として表現され、UTF-8 でエンコードされます。オブジェクトヘッダの後、文字列は以下を含みます：

バイト数: 現在有効な文字列データを含むバイト数

容量: 現在文字列に割り当てられているバイト数

長さ: エンコードされた文字列の長さ。UTF-8 のマルチバイト文字のため、バイト数よりも短くなる可能性があります。

データ: ヌル文字で終端された文字列内の実際の文字データ

Lean ランタイムの多くの文字列関数は、おぷじぇくとヘッダの参照カウントを参照することで引数に排他的アクセスがあるかどうかをチェックします。もしアクセスがあり、かつ文字列の容量が十分であれば、新しいメモリを確保するのではなく、既存の文字列を変更することができます。そうでない場合は新しい文字列を割り当てなければなりません。

12.8.2.1. パフォーマンスについての注記（Performance Notes）🔗

一見普通のコンストラクタと射影に見えますが、 String.mk と String.data は 文字列の長さに比例した時間 がかかります。これは文字のリストとバイトの packed array の間の変換を実装しなければならないためで、各文字を必ず訪問しなければなりません。

12.8.3. 構文（Syntax）🔗

Lean には3種類の文字列リテラルがあります：通常の文字列リテラル・補間文字列リテラル・生文字列リテラル

12.8.3.1. 文字列リテラル（String Literals）🔗

文字列リテラルはダブルクォート文字 " で開始・終了します。これらの文字の間には、改行文字を（文字通り）含む他の任意の文字を含めることができます（ただし、Lean ソースファイル内のすべての改行文字はファイルのエンコーディングやプラットフォームに関係なく '\n' として解釈されます）。文字列リテラルにかけない特殊文字はバックスラッシュでエスケープすることができるため、 "\"Quotes\"" は二重引用符で開始・終了する文字列リテラルです。以下の形式のエスケープシーケンスが利用できます：

\r・\n・\t・\\・\"・\': これらのエスケープシーケンスは通常の意味を持ち、それぞれ CR・LF・タブ・バックスラッシュ・二重引用符・引用符に対応します。

\xNN: NN が2桁の16進数である時、このエスケープは2桁の16進数で示される Unicode コードポイントを持つ文字を表します。

\uNNNN: NN が2桁の16進数である時、このエスケープは4桁の16進数コードで示される Unicode コードポイントを持つ文字を表します。

文字列リテラルは ギャップ （gap）を含むことができます。ギャップはエスケープされた開業で示され、エスケープされたバックスラッシュと改行の間には文字が介在しません。この場合、リテラルで示される文字列には、改行と次の行の先頭の空白がすべて含まれません。文字列のギャップは空白のみを含む行の前にあってはなりません。

def str1 := "String with \
             a gap"
def str2 := "String with a gap"

example : str1 = str2 := rfl

ギャップに続く行が空の場合、その文字列は拒否されます：

def str3 := "String with \unexpected additional newline in string gap             a gap"

パーサは以下のエラーを返します：

<example>:2:0: unexpected additional newline in string gap

12.8.3.2. 補間文字列（Interpolated Strings）🔗

文字列の前に s! を置くと、 補間文字列 （interpolated string）として処理されます。この文字列内の { と } で囲まれた領域がパースされ、Lean の式として解釈されます。補間文字列は、補間の前の文字列・式（それを囲む toString の呼び出しが追加されたもの）・補間後の文字列が連結されたものとして解釈されます。

例えば：

example : s!"1 + 1 = {1 + 1}\n" = "1 + 1 = " ++ toString (1 + 1) ++ "\n" := rfl

リテラルの前に m! を置くと、 MessageData のインスタンスとなります。これはコンパイラの内部からユーザ向けに使用されるメッセージのデータ構造です。

12.8.3.3. 生文字列リテラル🔗

生文字列リテラル（raw string literal）では、エスケープシーケンスやギャップはなく、各文字はそれ自身を正確に表します。生文字列リテラルは r で始まり、0個以上のハッシュ文字（#）と二重引用符 " が続きます。文字列リテラルは 同じ数 のハッシュ文字が続く二重引用符で補完されます。例えば、特定の文字をダブルエスケープする必要を避けるために使用することができます：

example : r"\t" = "\\t" := rfl
"Write backslash in a string using '\\\\\\\\'"#eval r"Write backslash in a string using '\\\\'"

この #eval は以下を出力します：

"Write backslash in a string using '\\\\\\\\'"

ハッシュ記号を含めると、文字列にエスケープされていない引用符を含めることができます。

example : r#"This is "literally" quoted"# = "This is \"literally\" quoted" := rfl

十分な数のハッシュ記号を追加することで、どんな生リテラルでも文字通りに書くことができます：

example : r##"This is r#"literally"# quoted"## = "This is r#\"literally\"# quoted" := rfl

12.8.4. API リファレンス（API Reference）🔗

12.8.4.1. 構成（Constructing）🔗

def

String.singleton (c : Char) : String

def

String.append : String → String → String

Appends two strings.

The internal implementation uses dynamic arrays and will perform destructive updates if the string is not shared.

Example: "abc".append "def" = "abcdef"

def

String.join (l : List String) : String

def

String.intercalate (s : String)
    : List String → String

12.8.4.2. 変換（Conversions）🔗

def

String.toList (s : String) : List Char

Converts a string to a list of characters.

Even though the logical model of strings is as a structure that wraps a list of characters, this operation takes time and space linear in the length of the string, because the compiler uses an optimized representation as dynamic arrays.

Example: "abc".toList = ['a', 'b', 'c']

def

String.isNat (s : String) : Bool

def

String.toNat? (s : String) : Option Nat

def

String.toNat! (s : String) : Nat

Interpret the string as the decimal representation of a natural number.

Panics if the string is not a string of digits.

def

String.isInt (s : String) : Bool

def

String.toInt? (s : String) : Option Int

def

String.toInt! (s : String) : Int

def

String.toFormat (s : String) : Std.Format

12.8.4.3. プロパティ（Properties）🔗

def

String.isEmpty (s : String) : Bool

def

String.length : String → Nat

Returns the length of a string in Unicode code points.

Examples:

"".length = 0
"abc".length = 3
"L∃∀N".length = 4

12.8.4.4. 位置（Positions）🔗

structure

String.Pos : Type

A byte position in a String. Internally, Strings are UTF-8 encoded. Codepoint positions (counting the Unicode codepoints rather than bytes) are represented by plain Nats instead. Indexing a String by a byte position is constant-time, while codepoint positions need to be translated internally to byte positions in linear-time.

A byte position p is valid for a string s if 0 ≤ p ≤ s.endPos and p lies on a UTF8 byte boundary.

Constructor

String.Pos.mk

Fields

byteIdx : Nat

Get the underlying byte index of a String.Pos

def

String.Pos.isValid (s : String) (p : String.Pos)
    : Bool

Returns true if p is a valid UTF-8 position in the string s, meaning that p ≤ s.endPos and p lies on a UTF-8 character boundary. This has an O(1) implementation in the runtime.

def

String.atEnd : String → String.Pos → Bool

Returns true if a specified position is greater than or equal to the position which points to the end of a string. Otherwise, returns false.

Examples: Given def abc := "abc" and def lean := "L∃∀N",

(0 |> abc.next |> abc.next |> abc.atEnd) = false
(0 |> abc.next |> abc.next |> abc.next |> abc.next |> abc.atEnd) = true
(0 |> lean.next |> lean.next |> lean.next |> lean.next |> lean.atEnd) = true

Because "L∃∀N" contains multi-byte characters, lean.next (lean.next 0) is not equal to abc.next (abc.next 0).

def

String.endPos (s : String) : String.Pos

A String.Pos pointing at the end of this string.

def

String.next (s : String) (p : String.Pos)
    : String.Pos

Returns the next position in a string after position p. If p is not a valid position or p = s.endPos, the result is unspecified.

Examples: Given def abc := "abc" and def lean := "L∃∀N",

abc.get (0 |> abc.next) = 'b'
lean.get (0 |> lean.next |> lean.next) = '∀'

Cases where the result is unspecified:

"abc".next ⟨3⟩, since 3 = s.endPos
"L∃∀N".next ⟨2⟩, since 2 points into the middle of a multi-byte UTF-8 character

def

String.next' (s : String) (p : String.Pos)
    (h : ¬s.atEnd p = true) : String.Pos

Returns the next position in a string after position p. If p is not a valid position, the result is unspecified.

Requires evidence, h, that p is within bounds instead of performing a runtime bounds check as in next.

Examples:

let abc := "abc"; abc.get (abc.next' 0 (by decide)) = 'b'

A typical pattern combines next' with a dependent if-else expression to avoid the overhead of an additional bounds check. For example:

def next? (s: String) (p : String.Pos) : Option Char :=
  if h : s.atEnd p then none else s.get (s.next' p h)

def

String.nextWhile (s : String) (p : Char → Bool)
    (i : String.Pos) : String.Pos

def

String.nextUntil (s : String) (p : Char → Bool)
    (i : String.Pos) : String.Pos

def

String.prev : String → String.Pos → String.Pos

Returns the position in a string before a specified position, p. If p = ⟨0⟩, returns 0. If p is not a valid position, the result is unspecified.

Examples: Given def abc := "abc" and def lean := "L∃∀N",

abc.get (abc.endPos |> abc.prev) = 'c'
lean.get (lean.endPos |> lean.prev |> lean.prev |> lean.prev) = '∃'
"L∃∀N".prev ⟨3⟩ is unspecified, since byte 3 occurs in the middle of the multi-byte character '∃'.

def

String.Pos.min (p₁ p₂ : String.Pos) : String.Pos

12.8.4.5. 検索と変更（Lookups and Modifications）🔗

def

String.get (s : String) (p : String.Pos) : Char

Returns the character at position p of a string. If p is not a valid position, returns (default : Char).

See utf8GetAux for the reference implementation.

Examples:

"abc".get ⟨1⟩ = 'b'
"abc".get ⟨3⟩ = (default : Char) = 'A'

Positions can also be invalid if a byte index points into the middle of a multi-byte UTF-8 character. For example,"L∃∀N".get ⟨2⟩ = (default : Char) = 'A'.

def

String.get? : String → String.Pos → Option Char

Returns the character at position p. If p is not a valid position, returns none.

Examples:

"abc".get? ⟨1⟩ = some 'b'
"abc".get? ⟨3⟩ = none

Positions can also be invalid if a byte index points into the middle of a multi-byte UTF-8 character. For example, "L∃∀N".get? ⟨2⟩ = none

def

String.get! (s : String) (p : String.Pos) : Char

Returns the character at position p of a string. If p is not a valid position, returns (default : Char) and produces a panic error message.

Examples:

"abc".get! ⟨1⟩ = 'b'
"abc".get! ⟨3⟩ panics

Positions can also be invalid if a byte index points into the middle of a multi-byte UTF-8 character. For example, "L∃∀N".get! ⟨2⟩ panics.

def

String.get' (s : String) (p : String.Pos)
    (h : ¬s.atEnd p = true) : Char

Returns the character at position p of a string. If p is not a valid position, returns (default : Char).

Requires evidence, h, that p is within bounds instead of performing a runtime bounds check as in get.

Examples:

"abc".get' 0 (by decide) = 'a'
let lean := "L∃∀N"; lean.get' (0 |> lean.next |> lean.next) (by decide) = '∀'

A typical pattern combines get' with a dependent if-else expression to avoid the overhead of an additional bounds check. For example:

def getInBounds? (s : String) (p : String.Pos) : Option Char :=
  if h : s.atEnd p then none else some (s.get' p h)

Even with evidence of ¬ s.atEnd p, p may be invalid if a byte index points into the middle of a multi-byte UTF-8 character. For example, "L∃∀N".get' ⟨2⟩ (by decide) = (default : Char).

def

String.extract
    : String → String.Pos → String.Pos → String

def

String.take (s : String) (n : Nat) : String

def

String.takeWhile (s : String) (p : Char → Bool)
    : String

def

String.takeRight (s : String) (n : Nat) : String

def

String.takeRightWhile (s : String)
    (p : Char → Bool) : String

def

String.drop (s : String) (n : Nat) : String

def

String.dropWhile (s : String) (p : Char → Bool)
    : String

def

String.dropRight (s : String) (n : Nat) : String

def

String.dropRightWhile (s : String)
    (p : Char → Bool) : String

def

String.trim (s : String) : String

def

String.trimLeft (s : String) : String

def

String.trimRight (s : String) : String

def

String.removeLeadingSpaces (s : String) : String

def

String.set : String → String.Pos → Char → String

Replaces the character at a specified position in a string with a new character. If the position is invalid, the string is returned unchanged.

If both the replacement character and the replaced character are ASCII characters and the string is not shared, destructive updates are used.

Examples:

"abc".set ⟨1⟩ 'B' = "aBc"
"abc".set ⟨3⟩ 'D' = "abc"
"L∃∀N".set ⟨4⟩ 'X' = "L∃XN"

Because '∃' is a multi-byte character, the byte index 2 in L∃∀N is an invalid position, so "L∃∀N".set ⟨2⟩ 'X' = "L∃∀N".

def

String.modify (s : String) (i : String.Pos)
    (f : Char → Char) : String

Replaces the character at position p in the string s with the result of applying f to that character. If p is an invalid position, the string is returned unchanged.

Examples:

abc.modify ⟨1⟩ Char.toUpper = "aBc"
abc.modify ⟨3⟩ Char.toUpper = "abc"

def

String.front (s : String) : Char

Returns the first character in s. If s = "", returns (default : Char).

Examples:

"abc".front = 'a'
"".front = (default : Char)

def

String.back (s : String) : Char

Returns the last character in s. If s = "", returns (default : Char).

Examples:

"abc".back = 'c'
"".back = (default : Char)

def

String.posOf (s : String) (c : Char)
    : String.Pos

Returns the position of the first occurrence of a character, c, in s. If s does not contain c, returns s.endPos.

Examples:

"abba".posOf 'a' = ⟨0⟩
"abba".posOf 'z' = ⟨4⟩
"L∃∀N".posOf '∀' = ⟨4⟩

def

String.revPosOf (s : String) (c : Char)
    : Option String.Pos

Returns the position of the last occurrence of a character, c, in s. If s does not contain c, returns none.

Examples:

"abba".posOf 'a' = some ⟨3⟩
"abba".posOf 'z' = none
"L∃∀N".posOf '∀' = some ⟨4⟩

def

String.contains (s : String) (c : Char) : Bool

def

String.offsetOfPos (s : String)
    (pos : String.Pos) : Nat

def

String.replace (s pattern replacement : String)
    : String

Replace all occurrences of pattern in s with replacement.

def

String.findLineStart (s : String)
    (pos : String.Pos) : String.Pos

Return the beginning of the line that contains character pos.

def

String.find (s : String) (p : Char → Bool)
    : String.Pos

def

String.revFind (s : String) (p : Char → Bool)
    : Option String.Pos

12.8.4.6. 畳み込みと集約（Folds and Aggregation）🔗

def

String.map (f : Char → Char) (s : String)
    : String

def

String.foldl.{u} {α : Type u} (f : α → Char → α)
    (init : α) (s : String) : α

def

String.foldr.{u} {α : Type u} (f : Char → α → α)
    (init : α) (s : String) : α

def

String.all (s : String) (p : Char → Bool) : Bool

def

String.any (s : String) (p : Char → Bool) : Bool

12.8.4.7. 比較（Comparisons）🔗

def

String.le (a b : String) : Prop

def

String.firstDiffPos (a b : String) : String.Pos

Returns the first position where the two strings differ.

def

String.substrEq (s1 : String)
    (off1 : String.Pos) (s2 : String)
    (off2 : String.Pos) (sz : Nat) : Bool

Return true iff the substring of byte size sz starting at position off1 in s1 is equal to that starting at off2 in s2.. False if either substring of that byte size does not exist.

def

String.isPrefixOf (p s : String) : Bool

Return true iff p is a prefix of s

def

String.startsWith (s pre : String) : Bool

def

String.endsWith (s post : String) : Bool

def

String.decEq (s₁ s₂ : String)
    : Decidable (s₁ = s₂)

Decides equality on String. This function is overridden with a native implementation.

opaque

String.hash (s : String) : UInt64

An opaque string hash function.

12.8.4.8. 操作（Manipulation）🔗

def

String.split (s : String) (p : Char → Bool)
    : List String

def

String.splitOn (s : String)
  (sep : String := " ") : List String

Splits a string s on occurrences of the separator sep. When sep is empty, it returns [s]; when sep occurs in overlapping patterns, the first match is taken. There will always be exactly n+1 elements in the returned list if there were n nonoverlapping matches of sep in the string. The default separator is " ". The separators are not included in the returned substrings.

"here is some text ".splitOn = ["here", "is", "some", "text", ""]
"here is some text ".splitOn "some" = ["here is ", " text "]
"here is some text ".splitOn "" = ["here is some text "]
"ababacabac".splitOn "aba" = ["", "bac", "c"]

def

String.push : String → Char → String

Pushes a character onto the end of a string.

The internal implementation uses dynamic arrays and will perform destructive updates if the string is not shared.

Example: "abc".push 'd' = "abcd"

def

String.pushn (s : String) (c : Char) (n : Nat)
    : String

def

String.capitalize (s : String) : String

def

String.decapitalize (s : String) : String

def

String.toUpper (s : String) : String

def

String.toLower (s : String) : String

12.8.4.9. イテレータ（Iterators）🔗

基本的に、 String.Iterator は文字列と文字列内の有効な位置のペアです。イテレータは現在の文字を取得する関数（ curr ）・現在の文字を置き換える関数（ setCurr ）・イテレータが左または右に移動できるかどうかチェックする関数（それぞれ hasPrev と hasNext ）・イテレータの移動（それぞれ prev と next ）です。クライアントは文字列の先頭または末尾に到達したかどうかをチェックする責任があります；それ以外の場合、イテレータはその位置が常に文字を指すようにします。

structure

String.Iterator : Type

Iterator over the characters (Char) of a String.

Typically created by s.iter, where s is a String.

An iterator is valid if the position i is valid for the string s, meaning 0 ≤ i ≤ s.endPos and i lies on a UTF8 byte boundary. If i = s.endPos, the iterator is at the end of the string.

Most operations on iterators return arbitrary values if the iterator is not valid. The functions in the String.Iterator API should rule out the creation of invalid iterators, with two exceptions:

Iterator.next iter is invalid if iter is already at the end of the string (iter.atEnd is true), and
Iterator.forward iter n/Iterator.nextn iter n is invalid if n is strictly greater than the number of remaining characters.

Constructor

String.Iterator.mk

Fields

s : String

The string the iterator is for.

i : String.Pos

The current position.

This position is not necessarily valid for the string, for instance if one keeps calling Iterator.next when Iterator.atEnd is true. If the position is not valid, then the current character is (default : Char), similar to String.get on an invalid position.

def

String.iter (s : String) : String.Iterator

Creates an iterator at the beginning of a string.

def

String.mkIterator (s : String) : String.Iterator

Creates an iterator at the beginning of a string.

def

String.Iterator.curr : String.Iterator → Char

The character at the current position.

On an invalid position, returns (default : Char).

def

String.Iterator.hasNext : String.Iterator → Bool

True if the iterator is not past the string's last character.

def

String.Iterator.next
    : String.Iterator → String.Iterator

Moves the iterator's position forward by one character, unconditionally.

It is only valid to call this function if the iterator is not at the end of the string, i.e. Iterator.atEnd is false; otherwise, the resulting iterator will be invalid.

def

String.Iterator.forward
    : String.Iterator → Nat → String.Iterator

Moves the iterator's position several characters forward.

The resulting iterator is only valid if the number of characters to skip is less than or equal to the number of characters left in the iterator.

def

String.Iterator.nextn
    : String.Iterator → Nat → String.Iterator

Moves the iterator's position several characters forward.

The resulting iterator is only valid if the number of characters to skip is less than or equal to the number of characters left in the iterator.

def

String.Iterator.hasPrev : String.Iterator → Bool

True if the position is not zero.

def

String.Iterator.prev
    : String.Iterator → String.Iterator

Decreases the iterator's position.

If the position is zero, this function is the identity.

def

String.Iterator.prevn
    : String.Iterator → Nat → String.Iterator

Moves the iterator's position several characters back.

If asked to go back more characters than available, stops at the beginning of the string.

def

String.Iterator.atEnd : String.Iterator → Bool

True if the iterator is past the string's last character.

def

String.Iterator.toEnd
    : String.Iterator → String.Iterator

Moves the iterator's position to the end of the string.

Note that i.toEnd.atEnd is always true.

def

String.Iterator.setCurr
    : String.Iterator → Char → String.Iterator

Replaces the current character in the string.

Does nothing if the iterator is at the end of the string. If the iterator contains the only reference to its string, this function will mutate the string in-place instead of allocating a new one.

def

String.Iterator.extract
    : String.Iterator → String.Iterator → String

Extracts the substring between the positions of two iterators.

Returns the empty string if the iterators are for different strings, or if the position of the first iterator is past the position of the second iterator.

def

String.Iterator.remainingToString
    : String.Iterator → String

The remaining characters in an iterator, as a string.

def

String.Iterator.remainingBytes
    : String.Iterator → Nat

Number of bytes remaining in the iterator.

def

String.Iterator.pos (self : String.Iterator)
    : String.Pos

The current position.

12.8.4.10. 部分文字列（Substrings）🔗

def

String.toSubstring (s : String) : Substring

Convert a String into a Substring denoting the entire string.

def

String.toSubstring' (s : String) : Substring

String.toSubstring without [inline] annotation.

12.8.4.11. 証明自動化（Proof Automation）🔗

def

String.reduceGT : Lean.Meta.Simp.Simproc

def

String.reduceGE : Lean.Meta.Simp.Simproc

def

String.reduceBinPred (declName : Lean.Name)
    (arity : Nat) (op : String → String → Bool)
    (e : Lean.Expr)
    : Lean.Meta.SimpM Lean.Meta.Simp.Step

def

String.reduceLT : Lean.Meta.Simp.Simproc

def

String.reduceLE : Lean.Meta.Simp.Simproc

def

String.reduceBEq : Lean.Meta.Simp.DSimproc

def

String.reduceEq : Lean.Meta.Simp.Simproc

def

String.reduceAppend : Lean.Meta.Simp.DSimproc

def

String.reduceMk : Lean.Meta.Simp.DSimproc

def

String.reduceBoolPred (declName : Lean.Name)
    (arity : Nat) (op : String → String → Bool)
    (e : Lean.Expr)
    : Lean.Meta.SimpM Lean.Meta.Simp.DStep

def

String.reduceBNe : Lean.Meta.Simp.DSimproc

def

String.reduceNe : Lean.Meta.Simp.Simproc

12.8.4.12. メタプログラミング（Metaprogramming）🔗

def

String.toName (s : String) : Lean.Name

Converts a String to a hierarchical Name after splitting it at the dots.

"a.b".toName is the name a.b, not «a.b». For the latter, use Name.mkSimple.

def

String.toFileMap (s : String) : Lean.FileMap

def

String.quote (s : String) : String

def

String.fromExpr? (e : Lean.Expr)
    : Lean.Meta.SimpM (Option String)

12.8.4.13. エンコード（Encodings）🔗

def

String.utf16PosToCodepointPos (s : String)
    (pos : Nat) : Nat

def

String.utf16PosToCodepointPosFrom (s : String)
    (utf16pos : Nat) (off : String.Pos) : Nat

Computes the position of the Unicode codepoint at UTF-16 offset utf16pos in the substring of s starting at UTF-8 offset off.

def

String.codepointPosToUtf16Pos (s : String)
    (pos : Nat) : Nat

def

String.codepointPosToUtf16PosFrom (s : String)
    (n : Nat) (off : String.Pos) : Nat

Computes the UTF-16 offset of the n-th Unicode codepoint in the substring of s starting at UTF-8 offset off. Yes, this is actually useful.

def

String.getUtf8Byte (s : String) (n : Nat)
    (h : n < s.utf8ByteSize) : UInt8

Accesses a byte in the UTF-8 encoding of the String. O(1)

def

String.utf8ByteSize : String → Nat

The UTF-8 byte length of this string. This is overridden by the compiler to be cached and O(1).

def

String.utf8EncodeChar (c : Char) : List UInt8

def

String.utf8DecodeChar? (a : ByteArray) (i : Nat)
    : Option Char

def

String.fromUTF8 (a : ByteArray)
    (h : String.validateUTF8 a = true) : String

Converts a UTF-8 encoded ByteArray string to String.

def

String.fromUTF8? (a : ByteArray) : Option String

Converts a UTF-8 encoded ByteArray string to String, or returns none if a is not properly UTF-8 encoded.

def

String.fromUTF8! (a : ByteArray) : String

Converts a UTF-8 encoded ByteArray string to String, or panics if a is not properly UTF-8 encoded.

def

String.toUTF8 (a : String) : ByteArray

Converts the given String to a UTF-8 encoded byte array.

def

String.validateUTF8 (a : ByteArray) : Bool

Returns true if the given byte array consists of valid UTF-8.

def

String.utf16Length (s : String) : Nat

def

String.codepointPosToUtf8PosFrom (s : String)
    : String.Pos → Nat → String.Pos

Starting at utf8pos, finds the UTF-8 offset of the p-th codepoint.

def

String.crlfToLf (text : String) : String

Replaces each \r\n with \n to normalize line endings, but does not validate that there are no isolated \r characters. It is an optimized version of String.replace text "\r\n" "\n".

12.8.5. FFI🔗

def

String.csize (c : Char) : Nat

FFI type

typedef struct {
    lean_object m_header;
    /* byte length including '\0' terminator */
    size_t      m_size;
    size_t      m_capacity;
    /* UTF8 length */
    size_t      m_length;
    char        m_data[0];
} lean_string_object;

C での文字列の表現。詳細は String のランタイムについての説明を参照してください。

FFI function

bool lean_is_string(lean_object * o)

o が文字列であれば true を、そうでなければ false を返します。

FFI function

lean_string_object * lean_to_string(lean_object * o)

o が本当に文字列であることのランタイムチェックの実行。o が文字列でない場合は失敗を主張します。

Planned Content

Complete C API for String

Tracked at issue #158