Coverage Report

Created: 2025-06-23 13:53

/build/cargo-vendor-dir/regex-automata-0.4.9/src/util/primitives.rs
Line
Count
Source
1
/*!
2
Lower level primitive types that are useful in a variety of circumstances.
3
4
# Overview
5
6
This list represents the principal types in this module and briefly describes
7
when you might want to use them.
8
9
* [`PatternID`] - A type that represents the identifier of a regex pattern.
10
This is probably the most widely used type in this module (which is why it's
11
also re-exported in the crate root).
12
* [`StateID`] - A type that represents the identifier of a finite automaton
13
state. This is used for both NFAs and DFAs, with the notable exception of
14
the hybrid NFA/DFA. (The hybrid NFA/DFA uses a special purpose "lazy" state
15
identifier.)
16
* [`SmallIndex`] - The internal representation of both a `PatternID` and a
17
`StateID`. Its purpose is to serve as a type that can index memory without
18
being as big as a `usize` on 64-bit targets. The main idea behind this type
19
is that there are many things in regex engines that will, in practice, never
20
overflow a 32-bit integer. (For example, the number of patterns in a regex
21
or the number of states in an NFA.) Thus, a `SmallIndex` can be used to index
22
memory without peppering `as` casts everywhere. Moreover, it forces callers
23
to handle errors in the case where, somehow, the value would otherwise overflow
24
either a 32-bit integer or a `usize` (e.g., on 16-bit targets).
25
* [`NonMaxUsize`] - Represents a `usize` that cannot be `usize::MAX`. As a
26
result, `Option<NonMaxUsize>` has the same size in memory as a `usize`. This
27
is useful, for example, when representing the offsets of submatches since it
28
reduces memory usage by a factor of 2. It is a legal optimization since Rust
29
guarantees that slices never have a length that exceeds `isize::MAX`.
30
*/
31
32
use core::num::NonZeroUsize;
33
34
#[cfg(feature = "alloc")]
35
use alloc::vec::Vec;
36
37
use crate::util::int::{Usize, U16, U32, U64};
38
39
/// A `usize` that can never be `usize::MAX`.
40
///
41
/// This is similar to `core::num::NonZeroUsize`, but instead of not permitting
42
/// a zero value, this does not permit a max value.
43
///
44
/// This is useful in certain contexts where one wants to optimize the memory
45
/// usage of things that contain match offsets. Namely, since Rust slices
46
/// are guaranteed to never have a length exceeding `isize::MAX`, we can use
47
/// `usize::MAX` as a sentinel to indicate that no match was found. Indeed,
48
/// types like `Option<NonMaxUsize>` have exactly the same size in memory as a
49
/// `usize`.
50
///
51
/// This type is defined to be `repr(transparent)` for
52
/// `core::num::NonZeroUsize`, which is in turn defined to be
53
/// `repr(transparent)` for `usize`.
54
#[derive(Clone, Copy, Eq, Hash, PartialEq, PartialOrd, Ord)]
55
#[repr(transparent)]
56
pub struct NonMaxUsize(NonZeroUsize);
57
58
impl NonMaxUsize {
59
    /// Create a new `NonMaxUsize` from the given value.
60
    ///
61
    /// This returns `None` only when the given value is equal to `usize::MAX`.
62
    #[inline]
63
0
    pub fn new(value: usize) -> Option<NonMaxUsize> {
64
0
        NonZeroUsize::new(value.wrapping_add(1)).map(NonMaxUsize)
65
0
    }
66
67
    /// Return the underlying `usize` value. The returned value is guaranteed
68
    /// to not equal `usize::MAX`.
69
    #[inline]
70
0
    pub fn get(self) -> usize {
71
0
        self.0.get().wrapping_sub(1)
72
0
    }
73
}
74
75
// We provide our own Debug impl because seeing the internal repr can be quite
76
// surprising if you aren't expecting it. e.g., 'NonMaxUsize(5)' vs just '5'.
77
impl core::fmt::Debug for NonMaxUsize {
78
0
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
79
0
        write!(f, "{:?}", self.get())
80
0
    }
81
}
82
83
/// A type that represents a "small" index.
84
///
85
/// The main idea of this type is to provide something that can index memory,
86
/// but uses less memory than `usize` on 64-bit systems. Specifically, its
87
/// representation is always a `u32` and has `repr(transparent)` enabled. (So
88
/// it is safe to transmute between a `u32` and a `SmallIndex`.)
89
///
90
/// A small index is typically useful in cases where there is no practical way
91
/// that the index will overflow a 32-bit integer. A good example of this is
92
/// an NFA state. If you could somehow build an NFA with `2^30` states, its
93
/// memory usage would be exorbitant and its runtime execution would be so
94
/// slow as to be completely worthless. Therefore, this crate generally deems
95
/// it acceptable to return an error if it would otherwise build an NFA that
96
/// requires a slice longer than what a 32-bit integer can index. In exchange,
97
/// we can use 32-bit indices instead of 64-bit indices in various places.
98
///
99
/// This type ensures this by providing a constructor that will return an error
100
/// if its argument cannot fit into the type. This makes it much easier to
101
/// handle these sorts of boundary cases that are otherwise extremely subtle.
102
///
103
/// On all targets, this type guarantees that its value will fit in a `u32`,
104
/// `i32`, `usize` and an `isize`. This means that on 16-bit targets, for
105
/// example, this type's maximum value will never overflow an `isize`,
106
/// which means it will never overflow an `i16` even though its internal
107
/// representation is still a `u32`.
108
///
109
/// The purpose for making the type fit into even signed integer types like
110
/// `isize` is to guarantee that the difference between any two small indices
111
/// is itself also a small index. This is useful in certain contexts, e.g.,
112
/// for delta encoding.
113
///
114
/// # Other types
115
///
116
/// The following types wrap `SmallIndex` to provide a more focused use case:
117
///
118
/// * [`PatternID`] is for representing the identifiers of patterns.
119
/// * [`StateID`] is for representing the identifiers of states in finite
120
/// automata. It is used for both NFAs and DFAs.
121
///
122
/// # Representation
123
///
124
/// This type is always represented internally by a `u32` and is marked as
125
/// `repr(transparent)`. Thus, this type always has the same representation as
126
/// a `u32`. It is thus safe to transmute between a `u32` and a `SmallIndex`.
127
///
128
/// # Indexing
129
///
130
/// For convenience, callers may use a `SmallIndex` to index slices.
131
///
132
/// # Safety
133
///
134
/// While a `SmallIndex` is meant to guarantee that its value fits into `usize`
135
/// without using as much space as a `usize` on all targets, callers must
136
/// not rely on this property for safety. Callers may choose to rely on this
/// property for correctness, however. For example, creating a `SmallIndex` with
137
138
/// an invalid value can be done in entirely safe code. This may in turn result
139
/// in panics or silent logical errors.
140
#[derive(
141
    Clone, Copy, Debug, Default, Eq, Hash, PartialEq, PartialOrd, Ord,
142
)]
143
#[repr(transparent)]
144
pub struct SmallIndex(u32);
145
146
impl SmallIndex {
147
    /// The maximum index value.
148
    #[cfg(any(target_pointer_width = "32", target_pointer_width = "64"))]
149
    pub const MAX: SmallIndex =
150
        // FIXME: Use as_usize() once const functions in traits are stable.
151
        SmallIndex::new_unchecked(core::i32::MAX as usize - 1);
152
153
    /// The maximum index value.
154
    #[cfg(target_pointer_width = "16")]
155
    pub const MAX: SmallIndex =
156
        SmallIndex::new_unchecked(core::isize::MAX - 1);
157
158
    /// The total number of values that can be represented as a small index.
159
    pub const LIMIT: usize = SmallIndex::MAX.as_usize() + 1;
160
161
    /// The zero index value.
162
    pub const ZERO: SmallIndex = SmallIndex::new_unchecked(0);
163
164
    /// The number of bytes that a single small index uses in memory.
165
    pub const SIZE: usize = core::mem::size_of::<SmallIndex>();
166
167
    /// Create a new small index.
168
    ///
169
    /// If the given index exceeds [`SmallIndex::MAX`], then this returns
170
    /// an error.
171
    #[inline]
172
326
    pub fn new(index: usize) -> Result<SmallIndex, SmallIndexError> {
173
326
        SmallIndex::try_from(index)
174
326
    }
175
176
    /// Create a new small index without checking whether the given value
177
    /// exceeds [`SmallIndex::MAX`].
178
    ///
179
    /// Using this routine with an invalid index value will result in
180
    /// unspecified behavior, but *not* undefined behavior. In particular, an
181
    /// invalid index value is likely to cause panics or possibly even silent
182
    /// logical errors.
183
    ///
184
    /// Callers must never rely on a `SmallIndex` to be within a certain range
185
    /// for memory safety.
186
    #[inline]
187
1.82k
    pub const fn new_unchecked(index: usize) -> SmallIndex {
188
1.82k
        // FIXME: Use as_u32() once const functions in traits are stable.
189
1.82k
        SmallIndex(index as u32)
190
1.82k
    }
191
192
    /// Like [`SmallIndex::new`], but panics if the given index is not valid.
193
    #[inline]
194
0
    pub fn must(index: usize) -> SmallIndex {
195
0
        SmallIndex::new(index).expect("invalid small index")
196
0
    }
197
198
    /// Return this small index as a `usize`. This is guaranteed to never
199
    /// overflow `usize`.
200
    #[inline]
201
3.77k
    pub const fn as_usize(&self) -> usize {
202
3.77k
        // FIXME: Use as_usize() once const functions in traits are stable.
203
3.77k
        self.0 as usize
204
3.77k
    }
205
206
    /// Return this small index as a `u64`. This is guaranteed to never
207
    /// overflow.
208
    #[inline]
209
952
    pub const fn as_u64(&self) -> u64 {
210
952
        // FIXME: Use u64::from() once const functions in traits are stable.
211
952
        self.0 as u64
212
952
    }
213
214
    /// Return the internal `u32` of this small index. This is guaranteed to
215
    /// never overflow `u32`.
216
    #[inline]
217
6
    pub const fn as_u32(&self) -> u32 {
218
6
        self.0
219
6
    }
220
221
    /// Return the internal `u32` of this small index represented as an `i32`.
222
    /// This is guaranteed to never overflow an `i32`.
223
    #[inline]
224
158
    pub const fn as_i32(&self) -> i32 {
225
158
        // This is OK because we guarantee that our max value is <= i32::MAX.
226
158
        self.0 as i32
227
158
    }
228
229
    /// Returns one more than this small index as a usize.
230
    ///
231
    /// Since a small index has constraints on its maximum value, adding `1` to
232
    /// it will always fit in a `usize`, `u32` and an `i32`.
233
    #[inline]
234
3
    pub fn one_more(&self) -> usize {
235
3
        self.as_usize() + 1
236
3
    }
237
238
    /// Decode this small index from the bytes given using the native endian
239
    /// byte order for the current target.
240
    ///
241
    /// If the decoded integer is not representable as a small index for the
242
    /// current target, then this returns an error.
243
    #[inline]
244
0
    pub fn from_ne_bytes(
245
0
        bytes: [u8; 4],
246
0
    ) -> Result<SmallIndex, SmallIndexError> {
247
0
        let id = u32::from_ne_bytes(bytes);
248
0
        if id > SmallIndex::MAX.as_u32() {
249
0
            return Err(SmallIndexError { attempted: u64::from(id) });
250
0
        }
251
0
        Ok(SmallIndex::new_unchecked(id.as_usize()))
252
0
    }
253
254
    /// Decode this small index from the bytes given using the native endian
255
    /// byte order for the current target.
256
    ///
257
    /// This is analogous to [`SmallIndex::new_unchecked`] in that it does not
258
    /// check whether the decoded integer is representable as a small index.
259
    #[inline]
260
0
    pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> SmallIndex {
261
0
        SmallIndex::new_unchecked(u32::from_ne_bytes(bytes).as_usize())
262
0
    }
263
264
    /// Return the underlying small index integer as raw bytes in native endian
265
    /// format.
266
    #[inline]
267
0
    pub fn to_ne_bytes(&self) -> [u8; 4] {
268
0
        self.0.to_ne_bytes()
269
0
    }
270
}
271
272
impl<T> core::ops::Index<SmallIndex> for [T] {
273
    type Output = T;
274
275
    #[inline]
276
0
    fn index(&self, index: SmallIndex) -> &T {
277
0
        &self[index.as_usize()]
278
0
    }
279
}
280
281
impl<T> core::ops::IndexMut<SmallIndex> for [T] {
282
    #[inline]
283
0
    fn index_mut(&mut self, index: SmallIndex) -> &mut T {
284
0
        &mut self[index.as_usize()]
285
0
    }
286
}
287
288
#[cfg(feature = "alloc")]
289
impl<T> core::ops::Index<SmallIndex> for Vec<T> {
290
    type Output = T;
291
292
    #[inline]
293
0
    fn index(&self, index: SmallIndex) -> &T {
294
0
        &self[index.as_usize()]
295
0
    }
296
}
297
298
#[cfg(feature = "alloc")]
299
impl<T> core::ops::IndexMut<SmallIndex> for Vec<T> {
300
    #[inline]
301
0
    fn index_mut(&mut self, index: SmallIndex) -> &mut T {
302
0
        &mut self[index.as_usize()]
303
0
    }
304
}
305
306
impl From<u8> for SmallIndex {
307
0
    fn from(index: u8) -> SmallIndex {
308
0
        SmallIndex::new_unchecked(usize::from(index))
309
0
    }
310
}
311
312
impl TryFrom<u16> for SmallIndex {
313
    type Error = SmallIndexError;
314
315
0
    fn try_from(index: u16) -> Result<SmallIndex, SmallIndexError> {
316
0
        if u32::from(index) > SmallIndex::MAX.as_u32() {
317
0
            return Err(SmallIndexError { attempted: u64::from(index) });
318
0
        }
319
0
        Ok(SmallIndex::new_unchecked(index.as_usize()))
320
0
    }
321
}
322
323
impl TryFrom<u32> for SmallIndex {
324
    type Error = SmallIndexError;
325
326
6
    fn try_from(index: u32) -> Result<SmallIndex, SmallIndexError> {
327
6
        if index > SmallIndex::MAX.as_u32() {
328
0
            return Err(SmallIndexError { attempted: u64::from(index) });
329
6
        }
330
6
        Ok(SmallIndex::new_unchecked(index.as_usize()))
331
6
    }
332
}
333
334
impl TryFrom<u64> for SmallIndex {
335
    type Error = SmallIndexError;
336
337
0
    fn try_from(index: u64) -> Result<SmallIndex, SmallIndexError> {
338
0
        if index > SmallIndex::MAX.as_u64() {
339
0
            return Err(SmallIndexError { attempted: index });
340
0
        }
341
0
        Ok(SmallIndex::new_unchecked(index.as_usize()))
342
0
    }
343
}
344
345
impl TryFrom<usize> for SmallIndex {
346
    type Error = SmallIndexError;
347
348
356
    fn try_from(index: usize) -> Result<SmallIndex, SmallIndexError> {
349
356
        if index > SmallIndex::MAX.as_usize() {
350
0
            return Err(SmallIndexError { attempted: index.as_u64() });
351
356
        }
352
356
        Ok(SmallIndex::new_unchecked(index))
353
356
    }
354
}
355
356
#[cfg(test)]
357
impl quickcheck::Arbitrary for SmallIndex {
358
    fn arbitrary(gen: &mut quickcheck::Gen) -> SmallIndex {
359
        use core::cmp::max;
360
361
        let id = max(i32::MIN + 1, i32::arbitrary(gen)).abs();
362
        if id > SmallIndex::MAX.as_i32() {
363
            SmallIndex::MAX
364
        } else {
365
            SmallIndex::new(usize::try_from(id).unwrap()).unwrap()
366
        }
367
    }
368
}
369
370
/// This error occurs when a small index could not be constructed.
371
///
372
/// This occurs when given an integer exceeding the maximum small index value.
373
///
374
/// When the `std` feature is enabled, this implements the `Error` trait.
375
#[derive(Clone, Debug, Eq, PartialEq)]
376
pub struct SmallIndexError {
377
    attempted: u64,
378
}
379
380
impl SmallIndexError {
381
    /// Returns the value that could not be converted to a small index.
382
0
    pub fn attempted(&self) -> u64 {
383
0
        self.attempted
384
0
    }
385
}
386
387
#[cfg(feature = "std")]
388
impl std::error::Error for SmallIndexError {}
389
390
impl core::fmt::Display for SmallIndexError {
391
0
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
392
0
        write!(
393
0
            f,
394
0
            "failed to create small index from {:?}, which exceeds {:?}",
395
0
            self.attempted(),
396
0
            SmallIndex::MAX,
397
0
        )
398
0
    }
399
}
400
401
#[derive(Clone, Debug)]
402
pub(crate) struct SmallIndexIter {
403
    rng: core::ops::Range<usize>,
404
}
405
406
impl Iterator for SmallIndexIter {
407
    type Item = SmallIndex;
408
409
144
    fn next(&mut self) -> Option<SmallIndex> {
410
144
        if self.rng.start >= self.rng.end {
411
1
            return None;
412
143
        }
413
143
        let next_id = self.rng.start + 1;
414
143
        let id = core::mem::replace(&mut self.rng.start, next_id);
415
143
        // new_unchecked is OK since we asserted that the number of
416
143
        // elements in this iterator will fit in an ID at construction.
417
143
        Some(SmallIndex::new_unchecked(id))
418
144
    }
419
}
420
421
macro_rules! index_type_impls {
422
    ($name:ident, $err:ident, $iter:ident, $withiter:ident) => {
423
        impl $name {
424
            /// The maximum value.
425
            pub const MAX: $name = $name(SmallIndex::MAX);
426
427
            /// The total number of values that can be represented.
428
            pub const LIMIT: usize = SmallIndex::LIMIT;
429
430
            /// The zero value.
431
            pub const ZERO: $name = $name(SmallIndex::ZERO);
432
433
            /// The number of bytes that a single value uses in memory.
434
            pub const SIZE: usize = SmallIndex::SIZE;
435
436
            /// Create a new value that is represented by a "small index."
437
            ///
438
            /// If the given index exceeds the maximum allowed value, then this
439
            /// returns an error.
440
            #[inline]
441
314
            pub fn new(value: usize) -> Result<$name, $err> {
442
314
                SmallIndex::new(value).map($name).map_err($err)
443
314
            }
444
445
            /// Create a new value without checking whether the given argument
446
            /// exceeds the maximum.
447
            ///
448
            /// Using this routine with an invalid value will result in
449
            /// unspecified behavior, but *not* undefined behavior. In
450
            /// particular, an invalid ID value is likely to cause panics or
451
            /// possibly even silent logical errors.
452
            ///
453
            /// Callers must never rely on this type to be within a certain
454
            /// range for memory safety.
455
            #[inline]
456
1.31k
            pub const fn new_unchecked(value: usize) -> $name {
457
1.31k
                $name(SmallIndex::new_unchecked(value))
458
1.31k
            }
459
460
            /// Like `new`, but panics if the given value is not valid.
461
            #[inline]
462
23
            pub fn must(value: usize) -> $name {
463
23
                $name::new(value).expect(concat!(
464
23
                    "invalid ",
465
23
                    stringify!($name),
466
23
                    " value"
467
23
                ))
468
23
            }
469
470
            /// Return the internal value as a `usize`. This is guaranteed to
471
            /// never overflow `usize`.
472
            #[inline]
473
3.36k
            pub const fn as_usize(&self) -> usize {
474
3.36k
                self.0.as_usize()
475
3.36k
            }
476
477
            /// Return the internal value as a `u64`. This is guaranteed to
478
            /// never overflow.
479
            #[inline]
480
952
            pub const fn as_u64(&self) -> u64 {
481
952
                self.0.as_u64()
482
952
            }
483
484
            /// Return the internal value as a `u32`. This is guaranteed to
485
            /// never overflow `u32`.
486
            #[inline]
487
0
            pub const fn as_u32(&self) -> u32 {
488
0
                self.0.as_u32()
489
0
            }
490
491
            /// Return the internal value as an `i32`. This is guaranteed to
492
            /// never overflow an `i32`.
493
            #[inline]
494
158
            pub const fn as_i32(&self) -> i32 {
495
158
                self.0.as_i32()
496
158
            }
497
498
            /// Returns one more than this value as a usize.
499
            ///
500
            /// Since values represented by a "small index" have constraints
501
            /// on their maximum value, adding `1` to it will always fit in a
502
            /// `usize`, `u32` and an `i32`.
503
            #[inline]
504
1
            pub fn one_more(&self) -> usize {
505
1
                self.0.one_more()
506
1
            }
507
508
            /// Decode this value from the bytes given using the native endian
509
            /// byte order for the current target.
510
            ///
511
            /// If the decoded integer is not representable as a small index
512
            /// for the current target, then this returns an error.
513
            #[inline]
514
0
            pub fn from_ne_bytes(bytes: [u8; 4]) -> Result<$name, $err> {
515
0
                SmallIndex::from_ne_bytes(bytes).map($name).map_err($err)
516
0
            }
517
518
            /// Decode this value from the bytes given using the native endian
519
            /// byte order for the current target.
520
            ///
521
            /// This is analogous to `new_unchecked` in that it does not check
522
            /// whether the decoded integer is representable as a small index.
523
            #[inline]
524
0
            pub fn from_ne_bytes_unchecked(bytes: [u8; 4]) -> $name {
525
0
                $name(SmallIndex::from_ne_bytes_unchecked(bytes))
526
0
            }
527
528
            /// Return the underlying integer as raw bytes in native endian
529
            /// format.
530
            #[inline]
531
0
            pub fn to_ne_bytes(&self) -> [u8; 4] {
532
0
                self.0.to_ne_bytes()
533
0
            }
534
535
            /// Returns an iterator over all values from 0 up to and not
536
            /// including the given length.
537
            ///
538
            /// If the given length exceeds this type's limit, then this
539
            /// panics.
540
13
            pub(crate) fn iter(len: usize) -> $iter {
541
13
                $iter::new(len)
542
13
            }
543
        }
544
545
        // We write our own Debug impl so that we get things like PatternID(5)
546
        // instead of PatternID(SmallIndex(5)).
547
        impl core::fmt::Debug for $name {
548
0
            fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
549
0
                f.debug_tuple(stringify!($name)).field(&self.as_u32()).finish()
550
0
            }
551
        }
552
553
        impl<T> core::ops::Index<$name> for [T] {
554
            type Output = T;
555
556
            #[inline]
557
589
            fn index(&self, index: $name) -> &T {
558
589
                &self[index.as_usize()]
559
589
            }
560
        }
561
562
        impl<T> core::ops::IndexMut<$name> for [T] {
563
            #[inline]
564
0
            fn index_mut(&mut self, index: $name) -> &mut T {
565
0
                &mut self[index.as_usize()]
566
0
            }
567
        }
568
569
        #[cfg(feature = "alloc")]
570
        impl<T> core::ops::Index<$name> for Vec<T> {
571
            type Output = T;
572
573
            #[inline]
574
544
            fn index(&self, index: $name) -> &T {
575
544
                &self[index.as_usize()]
576
544
            }
577
        }
578
579
        #[cfg(feature = "alloc")]
580
        impl<T> core::ops::IndexMut<$name> for Vec<T> {
581
            #[inline]
582
953
            fn index_mut(&mut self, index: $name) -> &mut T {
583
953
                &mut self[index.as_usize()]
584
953
            }
585
        }
586
587
        impl From<u8> for $name {
588
0
            fn from(value: u8) -> $name {
589
0
                $name(SmallIndex::from(value))
590
0
            }
591
        }
592
593
        impl TryFrom<u16> for $name {
594
            type Error = $err;
595
596
0
            fn try_from(value: u16) -> Result<$name, $err> {
597
0
                SmallIndex::try_from(value).map($name).map_err($err)
598
0
            }
599
        }
600
601
        impl TryFrom<u32> for $name {
602
            type Error = $err;
603
604
0
            fn try_from(value: u32) -> Result<$name, $err> {
605
0
                SmallIndex::try_from(value).map($name).map_err($err)
606
0
            }
607
        }
608
609
        impl TryFrom<u64> for $name {
610
            type Error = $err;
611
612
0
            fn try_from(value: u64) -> Result<$name, $err> {
613
0
                SmallIndex::try_from(value).map($name).map_err($err)
614
0
            }
615
        }
616
617
        impl TryFrom<usize> for $name {
618
            type Error = $err;
619
620
30
            fn try_from(value: usize) -> Result<$name, $err> {
621
30
                SmallIndex::try_from(value).map($name).map_err($err)
622
30
            }
623
        }
624
625
        #[cfg(test)]
626
        impl quickcheck::Arbitrary for $name {
627
            fn arbitrary(gen: &mut quickcheck::Gen) -> $name {
628
                $name(SmallIndex::arbitrary(gen))
629
            }
630
        }
631
632
        /// This error occurs when a value could not be constructed.
633
        ///
634
        /// This occurs when given an integer exceeding the maximum allowed
635
        /// value.
636
        ///
637
        /// When the `std` feature is enabled, this implements the `Error`
638
        /// trait.
639
        #[derive(Clone, Debug, Eq, PartialEq)]
640
        pub struct $err(SmallIndexError);
641
642
        impl $err {
643
            /// Returns the value that could not be converted to an ID.
644
0
            pub fn attempted(&self) -> u64 {
645
0
                self.0.attempted()
646
0
            }
647
        }
648
649
        #[cfg(feature = "std")]
650
        impl std::error::Error for $err {}
651
652
        impl core::fmt::Display for $err {
653
0
            fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
654
0
                write!(
655
0
                    f,
656
0
                    "failed to create {} from {:?}, which exceeds {:?}",
657
0
                    stringify!($name),
658
0
                    self.attempted(),
659
0
                    $name::MAX,
660
0
                )
661
0
            }
662
        }
663
664
        #[derive(Clone, Debug)]
665
        pub(crate) struct $iter(SmallIndexIter);
666
667
        impl $iter {
668
13
            fn new(len: usize) -> $iter {
669
13
                assert!(
670
13
                    len <= $name::LIMIT,
671
0
                    "cannot create iterator for {} when number of \
672
0
                     elements exceeds {:?}",
673
                    stringify!($name),
674
                    $name::LIMIT,
675
                );
676
13
                $iter(SmallIndexIter { rng: 0..len })
677
13
            }
678
        }
679
680
        impl Iterator for $iter {
681
            type Item = $name;
682
683
144
            fn next(&mut self) -> Option<$name> {
684
144
                self.0.next().map($name)
685
144
            }
686
        }
687
688
        /// An iterator adapter that is like std::iter::Enumerate, but attaches
689
        /// small index values instead. It requires `ExactSizeIterator`. At
690
        /// construction, it ensures that the index of each element in the
691
        /// iterator is representable in the corresponding small index type.
692
        #[derive(Clone, Debug)]
693
        pub(crate) struct $withiter<I> {
694
            it: I,
695
            ids: $iter,
696
        }
697
698
        impl<I: Iterator + ExactSizeIterator> $withiter<I> {
699
12
            fn new(it: I) -> $withiter<I> {
700
12
                let ids = $name::iter(it.len());
701
12
                $withiter { it, ids }
702
12
            }
703
        }
704
705
        impl<I: Iterator + ExactSizeIterator> Iterator for $withiter<I> {
706
            type Item = ($name, I::Item);
707
708
154
            fn next(&mut self) -> Option<($name, I::Item)> {
709
154
                let item = self.it.next()?;
710
                // Number of elements in this iterator must match, according
711
                // to contract of ExactSizeIterator.
712
142
                let id = self.ids.next().unwrap();
713
142
                Some((id, item))
714
154
            }
715
        }
716
    };
717
}
718
719
/// The identifier of a regex pattern, represented by a [`SmallIndex`].
720
///
721
/// The identifier for a pattern corresponds to its relative position among
722
/// other patterns in a single finite state machine. Namely, when building
723
/// a multi-pattern regex engine, one must supply a sequence of patterns to
724
/// match. The position (starting at 0) of each pattern in that sequence
725
/// represents its identifier. This identifier is in turn used to identify and
726
/// report matches of that pattern in various APIs.
727
///
728
/// See the [`SmallIndex`] type for more information about what it means for
729
/// a pattern ID to be a "small index."
730
///
731
/// Note that this type is defined in the
732
/// [`util::primitives`](crate::util::primitives) module, but it is also
733
/// re-exported at the crate root due to how common it is.
734
#[derive(Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)]
735
#[repr(transparent)]
736
pub struct PatternID(SmallIndex);
737
738
/// The identifier of a finite automaton state, represented by a
739
/// [`SmallIndex`].
740
///
741
/// Most regex engines in this crate are built on top of finite automata. Each
742
/// state in a finite automaton defines transitions from its state to another.
743
/// Those transitions point to other states via their identifiers, i.e., a
744
/// `StateID`. Since finite automata tend to contain many transitions, it is
745
/// much more memory efficient to define state IDs as small indices.
746
///
747
/// See the [`SmallIndex`] type for more information about what it means for
748
/// a state ID to be a "small index."
749
#[derive(Clone, Copy, Default, Eq, Hash, PartialEq, PartialOrd, Ord)]
750
#[repr(transparent)]
751
pub struct StateID(SmallIndex);
752
753
index_type_impls!(PatternID, PatternIDError, PatternIDIter, WithPatternIDIter);
754
index_type_impls!(StateID, StateIDError, StateIDIter, WithStateIDIter);
755
756
/// A utility trait that defines a couple of adapters for making it convenient
757
/// to access indices as "small index" types. We require ExactSizeIterator so
758
/// that iterator construction can do a single check to make sure the index of
759
/// each element is representable by its small index type.
760
pub(crate) trait IteratorIndexExt: Iterator {
761
8
    fn with_pattern_ids(self) -> WithPatternIDIter<Self>
762
8
    where
763
8
        Self: Sized + ExactSizeIterator,
764
8
    {
765
8
        WithPatternIDIter::new(self)
766
8
    }
767
768
4
    fn with_state_ids(self) -> WithStateIDIter<Self>
769
4
    where
770
4
        Self: Sized + ExactSizeIterator,
771
4
    {
772
4
        WithStateIDIter::new(self)
773
4
    }
774
}
775
776
impl<I: Iterator> IteratorIndexExt for I {}
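
The two core ideas documented in this file can be sketched in isolation. The block below is a minimal illustration, not the crate's code: `NonMax` and `Small` are hypothetical stand-ins for `NonMaxUsize` and `SmallIndex`, the error type is simplified to the rejected `usize`, and the bound assumes a 32-bit or 64-bit target (the listing uses a different definition on 16-bit targets).

```rust
use core::num::NonZeroUsize;

/// Hypothetical stand-in for `NonMaxUsize`: the value is stored as
/// `value + 1` inside a `NonZeroUsize`, so `usize::MAX` (which wraps to 0)
/// is the single value that cannot be represented.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(transparent)]
pub struct NonMax(NonZeroUsize);

impl NonMax {
    /// Returns `None` only when `value` is `usize::MAX`.
    pub fn new(value: usize) -> Option<NonMax> {
        // `usize::MAX.wrapping_add(1)` is 0, which `NonZeroUsize::new` rejects.
        NonZeroUsize::new(value.wrapping_add(1)).map(NonMax)
    }

    /// Undoes the offset applied in `new`.
    pub fn get(self) -> usize {
        // The stored value is at least 1, so this subtraction never wraps.
        self.0.get().wrapping_sub(1)
    }
}

/// Hypothetical stand-in for `SmallIndex`: a `u32` whose constructor
/// rejects anything above the maximum.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
#[repr(transparent)]
pub struct Small(u32);

impl Small {
    /// Mirrors the 32/64-bit bound shown in the listing: `i32::MAX - 1`.
    /// Assumes `usize` is at least 32 bits wide.
    pub const MAX: usize = i32::MAX as usize - 1;

    /// Checked constructor. The error is simplified to the rejected value.
    pub fn new(index: usize) -> Result<Small, usize> {
        if index > Small::MAX {
            return Err(index);
        }
        // The cast is lossless because `index <= Small::MAX < u32::MAX`.
        Ok(Small(index as u32))
    }

    /// Widens back to `usize` for indexing.
    pub fn as_usize(self) -> usize {
        self.0 as usize
    }
}
```

With these definitions, `NonMax::new(usize::MAX)` is `None` while every other input round-trips through `get`, and `Small::new` accepts `Small::MAX` but returns an error for `Small::MAX + 1`. Keeping the maximum strictly below `i32::MAX` is what lets one more than any valid index still fit in an `i32`, as the doc comments on `one_more` describe.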