Skip to content

Selections

hgp_lib.selections.tournament_selection.TournamentSelection

Bases: BaseSelection

Selection strategy that holds tournaments among randomly sampled subsets of the population.

The population is effectively sorted by fitness. For each selection event, tournament_size candidates are randomly sampled from the population. The candidate with the best rank in the sample is selected with probability p. If not selected, the second best is considered with probability p, and so on.

The selection probability for the i-th ranked individual in a tournament is: P(i) = selection_p * (1 - selection_p)^i

This creates selection pressure that favors fitter individuals while maintaining diversity. Higher tournament_size or selection_p increases selection pressure.

Parameters:

Name Type Description Default
tournament_size int

The number of candidates to include in each tournament. Must be greater than 0. Default: 10.

10
selection_p float

The probability of selecting the best candidate in the tournament. Must be between 0.0 and 1.0. Lower values reduce selection pressure. Default: 0.4.

0.4

Raises:

Type Description
TypeError

If tournament_size is not an int or selection_p is not a float.

ValueError

If selection_p is not in [0, 1] or tournament_size <= 0.

Examples:

>>> import numpy as np
>>> from hgp_lib.selections import TournamentSelection
>>> from hgp_lib.rules import Literal
>>> np.random.seed(42)
>>> selection = TournamentSelection(tournament_size=3, selection_p=0.5)
>>> rules = [Literal(value=i) for i in range(5)]
>>> scores = [0.1, 0.9, 0.5, 0.3, 0.7]
>>> selected_rules, selected_scores = selection.select(rules, scores, 3)
>>> len(selected_rules)
3
Source code in hgp_lib\selections\tournament_selection.py
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
class TournamentSelection(BaseSelection):
    """
    Selection strategy that holds tournaments among randomly sampled subsets of the population.

    The population is effectively sorted by fitness. For each selection event,
    `tournament_size` candidates are randomly sampled from the population.
    The candidate with the best rank in the sample is selected with probability `p`.
    If not selected, the second best is considered with probability `p`, and so on.

    The selection probability for the `i`-th ranked individual in a tournament is:
        `P(i) = selection_p * (1 - selection_p)^i`

    This creates selection pressure that favors fitter individuals while maintaining
    diversity. Higher `tournament_size` or `selection_p` increases selection pressure.

    Args:
        tournament_size (int):
            The number of candidates to include in each tournament. Must be greater
            than `0`. Default: `10`.
        selection_p (float):
            The probability of selecting the best candidate in the tournament.
            Must be between `0.0` and `1.0`. Lower values reduce selection pressure.
            Default: `0.4`.

    Raises:
        TypeError: If `tournament_size` is not an int or `selection_p` is not a float.
        ValueError: If `selection_p` is not in [0, 1] or `tournament_size` <= 0.

    Examples:
        >>> import numpy as np
        >>> from hgp_lib.selections import TournamentSelection
        >>> from hgp_lib.rules import Literal
        >>> np.random.seed(42)
        >>> selection = TournamentSelection(tournament_size=3, selection_p=0.5)
        >>> rules = [Literal(value=i) for i in range(5)]
        >>> scores = [0.1, 0.9, 0.5, 0.3, 0.7]
        >>> selected_rules, selected_scores = selection.select(rules, scores, 3)
        >>> len(selected_rules)
        3
    """

    def __init__(self, tournament_size: int = 10, selection_p: float = 0.4):
        check_isinstance(tournament_size, int)
        check_isinstance(selection_p, float)
        if selection_p < 0 or selection_p > 1:
            raise ValueError(
                f"selection_p must be a float between 0 and 1, is {selection_p}"
            )
        if tournament_size <= 0:
            raise ValueError(
                f"tournament_size must be greater than 0, is {tournament_size}"
            )

        self.tournament_size = tournament_size
        self.selection_p = selection_p

        ranks = np.arange(tournament_size)
        probs = selection_p * ((1 - selection_p) ** ranks)
        # Normalize the tail: ensure the sum is exactly 1.0 by adding the remaining probability mass to the last element.
        probs[-1] += 1.0 - np.sum(probs)
        self.probs: np.ndarray = probs

    def select(
        self,
        rules: Sequence[Rule],
        scores: np.ndarray | Sequence[float],
        n_select: int,
    ) -> Tuple[List[Rule], ndarray]:
        """
        Selects `n_select` rules using tournament selection.

        The method performs `n_select` independent tournaments. In each tournament:
        1. `tournament_size` indices are sampled randomly from the population (without replacement).
        2. These indices are sorted by the fitness of the corresponding rules (best to worst).
        3. A winner is chosen based on the pre-calculated geometric probabilities.

        The caller must ensure that `len(rules) == `len(scores)` and `n_select >= self.tournament_size`.

        Args:
            rules (Sequence[Rule]):
                The collection of candidate rules to select from. Length must be
                greater than or equal to `tournament_size`.
            scores (np.ndarray | Sequence[float]):
                Fitness scores corresponding to each rule. Higher scores indicate better
                fitness. Must have the same length as `rules`.
            n_select (int):
                Number of rules to select.

        Returns:
            Tuple[List[Rule], ndarray]: A tuple containing:
                - List[Rule]: Copies of the selected rules. The same rule may appear multiple times.
                - ndarray: The fitness scores of the selected rules.

        Examples:
            >>> import random
            >>> import numpy as np
            >>> from hgp_lib.selections import TournamentSelection
            >>> from hgp_lib.rules import Literal
            >>> random.seed(42); np.random.seed(42)
            >>> selection = TournamentSelection(tournament_size=3, selection_p=0.5)
            >>> rules = [
            ...     Literal(value=0),
            ...     Literal(value=1),
            ...     Literal(value=2),
            ... ]
            >>> scores = [0.2, 0.8, 0.5]
            >>> selected_rules, selected_scores = selection.select(rules, scores, 2)
            >>> len(selected_rules)
            2
            >>> all(isinstance(rule, Rule) for rule in selected_rules)
            True
        """
        n = len(rules)

        scores_array = np.asarray(scores)
        sorted_order = np.argsort(-scores_array)

        winning_seats = np.random.choice(
            self.tournament_size, size=n_select, p=self.probs
        )

        # Vectorized tournament sampling: for each of n_select tournaments,
        # pick tournament_size unique indices from [0, n) by selecting the
        # tournament_size smallest random keys per row.
        random_keys = np.random.random((n_select, n))
        tournament_matrix = np.argpartition(
            random_keys, self.tournament_size - 1, axis=1
        )[:, : self.tournament_size]
        tournament_matrix.sort(axis=1)

        # Select the winning rank from each tournament, then map to the
        # original population index via sorted_order.
        winner_ranks = tournament_matrix[np.arange(n_select), winning_seats]
        winner_indices = sorted_order[winner_ranks]

        selected_rules = [rules[idx].copy() for idx in winner_indices]
        selected_scores = scores_array[winner_indices]
        return selected_rules, selected_scores

select(rules, scores, n_select)

Selects n_select rules using tournament selection.

The method performs n_select independent tournaments. In each tournament: 1. tournament_size indices are sampled randomly from the population (without replacement). 2. These indices are sorted by the fitness of the corresponding rules (best to worst). 3. A winner is chosen based on the pre-calculated geometric probabilities.

The caller must ensure that len(rules) ==len(scores)andn_select >= self.tournament_size`.

Parameters:

Name Type Description Default
rules Sequence[Rule]

The collection of candidate rules to select from. Length must be greater than or equal to tournament_size.

required
scores ndarray | Sequence[float]

Fitness scores corresponding to each rule. Higher scores indicate better fitness. Must have the same length as rules.

required
n_select int

Number of rules to select.

required

Returns:

Type Description
Tuple[List[Rule], ndarray]

Tuple[List[Rule], ndarray]: A tuple containing: - List[Rule]: Copies of the selected rules. The same rule may appear multiple times. - ndarray: The fitness scores of the selected rules.

Examples:

>>> import random
>>> import numpy as np
>>> from hgp_lib.selections import TournamentSelection
>>> from hgp_lib.rules import Literal
>>> random.seed(42); np.random.seed(42)
>>> selection = TournamentSelection(tournament_size=3, selection_p=0.5)
>>> rules = [
...     Literal(value=0),
...     Literal(value=1),
...     Literal(value=2),
... ]
>>> scores = [0.2, 0.8, 0.5]
>>> selected_rules, selected_scores = selection.select(rules, scores, 2)
>>> len(selected_rules)
2
>>> all(isinstance(rule, Rule) for rule in selected_rules)
True
Source code in hgp_lib\selections\tournament_selection.py
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
def select(
    self,
    rules: Sequence[Rule],
    scores: np.ndarray | Sequence[float],
    n_select: int,
) -> Tuple[List[Rule], ndarray]:
    """
    Selects `n_select` rules using tournament selection.

    The method performs `n_select` independent tournaments. In each tournament:
    1. `tournament_size` indices are sampled randomly from the population (without replacement).
    2. These indices are sorted by the fitness of the corresponding rules (best to worst).
    3. A winner is chosen based on the pre-calculated geometric probabilities.

    The caller must ensure that `len(rules) == `len(scores)` and `n_select >= self.tournament_size`.

    Args:
        rules (Sequence[Rule]):
            The collection of candidate rules to select from. Length must be
            greater than or equal to `tournament_size`.
        scores (np.ndarray | Sequence[float]):
            Fitness scores corresponding to each rule. Higher scores indicate better
            fitness. Must have the same length as `rules`.
        n_select (int):
            Number of rules to select.

    Returns:
        Tuple[List[Rule], ndarray]: A tuple containing:
            - List[Rule]: Copies of the selected rules. The same rule may appear multiple times.
            - ndarray: The fitness scores of the selected rules.

    Examples:
        >>> import random
        >>> import numpy as np
        >>> from hgp_lib.selections import TournamentSelection
        >>> from hgp_lib.rules import Literal
        >>> random.seed(42); np.random.seed(42)
        >>> selection = TournamentSelection(tournament_size=3, selection_p=0.5)
        >>> rules = [
        ...     Literal(value=0),
        ...     Literal(value=1),
        ...     Literal(value=2),
        ... ]
        >>> scores = [0.2, 0.8, 0.5]
        >>> selected_rules, selected_scores = selection.select(rules, scores, 2)
        >>> len(selected_rules)
        2
        >>> all(isinstance(rule, Rule) for rule in selected_rules)
        True
    """
    n = len(rules)

    scores_array = np.asarray(scores)
    sorted_order = np.argsort(-scores_array)

    winning_seats = np.random.choice(
        self.tournament_size, size=n_select, p=self.probs
    )

    # Vectorized tournament sampling: for each of n_select tournaments,
    # pick tournament_size unique indices from [0, n) by selecting the
    # tournament_size smallest random keys per row.
    random_keys = np.random.random((n_select, n))
    tournament_matrix = np.argpartition(
        random_keys, self.tournament_size - 1, axis=1
    )[:, : self.tournament_size]
    tournament_matrix.sort(axis=1)

    # Select the winning rank from each tournament, then map to the
    # original population index via sorted_order.
    winner_ranks = tournament_matrix[np.arange(n_select), winning_seats]
    winner_indices = sorted_order[winner_ranks]

    selected_rules = [rules[idx].copy() for idx in winner_indices]
    selected_scores = scores_array[winner_indices]
    return selected_rules, selected_scores

hgp_lib.selections.roulette_selection.RouletteSelection

Bases: BaseSelection

Fitness-proportionate selection using roulette wheel sampling.

Each rule's probability of being selected is proportional to its fitness score. Higher scores receive proportionally higher selection probabilities. Negative scores are handled by shifting all scores to be non-negative before computing probabilities. Selection is performed with replacement, so the same rule can appear multiple times in the result.

Examples:

>>> import random
>>> import numpy as np
>>> from hgp_lib.selections import RouletteSelection
>>> from hgp_lib.rules import Literal
>>> random.seed(42); np.random.seed(42)
>>> selection = RouletteSelection()
>>> rules = [
...     Literal(value=0),
...     Literal(value=1),
...     Literal(value=2),
... ]
>>> scores = [0.1, 0.5, 0.4]
>>> selected_rules, selected_scores = selection.select(rules, scores, 2)
>>> len(selected_rules)
2
Source code in hgp_lib\selections\roulette_selection.py
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
class RouletteSelection(BaseSelection):
    """
    Fitness-proportionate selection using roulette wheel sampling.

    Each rule's probability of being selected is proportional to its fitness score.
    Higher scores receive proportionally higher selection probabilities. Negative scores
    are handled by shifting all scores to be non-negative before computing probabilities.
    Selection is performed with replacement, so the same rule can appear multiple times
    in the result.

    Examples:
        >>> import random
        >>> import numpy as np
        >>> from hgp_lib.selections import RouletteSelection
        >>> from hgp_lib.rules import Literal
        >>> random.seed(42); np.random.seed(42)
        >>> selection = RouletteSelection()
        >>> rules = [
        ...     Literal(value=0),
        ...     Literal(value=1),
        ...     Literal(value=2),
        ... ]
        >>> scores = [0.1, 0.5, 0.4]
        >>> selected_rules, selected_scores = selection.select(rules, scores, 2)
        >>> len(selected_rules)
        2
    """

    def select(
        self,
        rules: Sequence[Rule],
        scores: np.ndarray | Sequence[float],
        n_select: int,
    ) -> Tuple[List[Rule], ndarray]:
        """
        Selects `n_select` rules using roulette wheel (fitness-proportionate) selection.

        The probability of selecting each rule is proportional to its fitness score.
        If any scores are negative, all scores are shifted to be non-negative before
        computing probabilities. When all scores are equal, selection is uniform.
        Selection is performed with replacement, so the same rule may appear multiple times.

        Args:
            rules (Sequence[Rule]):
                The collection of candidate rules to select from.
            scores (np.ndarray | Sequence[float]):
                Fitness scores corresponding to each rule. Higher scores indicate better
                fitness. Must have the same length as `rules`.
            n_select (int):
                Number of rules to select.

        Returns:
            Tuple[List[Rule], ndarray]: A tuple containing:
                - List[Rule]: Copies of the selected rules. The same rule may appear multiple times.
                - ndarray: The fitness scores of the selected rules.

        Examples:
            >>> import random
            >>> import numpy as np
            >>> from hgp_lib.selections import RouletteSelection
            >>> from hgp_lib.rules import Literal
            >>> random.seed(42); np.random.seed(42)
            >>> selection = RouletteSelection()
            >>> rules = [
            ...     Literal(value=0),
            ...     Literal(value=1),
            ...     Literal(value=2),
            ... ]
            >>> scores = [0.1, 0.5, 0.4]
            >>> selected_rules, selected_scores = selection.select(rules, scores, 2)
            >>> len(selected_rules)
            2
            >>> all(isinstance(rule, Rule) for rule in selected_rules)
            True
        """
        if len(rules) == 0:
            return [], np.array([])

        scores_array = np.asarray(scores)
        original_scores = scores_array.copy()
        min_score = np.min(scores_array)
        if min_score < 0:
            scores_array = scores_array - min_score

        total = np.sum(scores_array)
        if total == 0 or np.all(scores_array == scores_array[0]):
            probabilities = np.ones(len(scores_array)) / len(scores_array)
        else:
            probabilities = scores_array / total

        selected_indices = np.random.choice(
            len(rules), size=n_select, p=probabilities, replace=True
        )

        selected_rules = [rules[i].copy() for i in selected_indices]
        selected_scores = original_scores[selected_indices]
        return selected_rules, selected_scores

select(rules, scores, n_select)

Selects n_select rules using roulette wheel (fitness-proportionate) selection.

The probability of selecting each rule is proportional to its fitness score. If any scores are negative, all scores are shifted to be non-negative before computing probabilities. When all scores are equal, selection is uniform. Selection is performed with replacement, so the same rule may appear multiple times.

Parameters:

Name Type Description Default
rules Sequence[Rule]

The collection of candidate rules to select from.

required
scores ndarray | Sequence[float]

Fitness scores corresponding to each rule. Higher scores indicate better fitness. Must have the same length as rules.

required
n_select int

Number of rules to select.

required

Returns:

Type Description
Tuple[List[Rule], ndarray]

Tuple[List[Rule], ndarray]: A tuple containing: - List[Rule]: Copies of the selected rules. The same rule may appear multiple times. - ndarray: The fitness scores of the selected rules.

Examples:

>>> import random
>>> import numpy as np
>>> from hgp_lib.selections import RouletteSelection
>>> from hgp_lib.rules import Literal
>>> random.seed(42); np.random.seed(42)
>>> selection = RouletteSelection()
>>> rules = [
...     Literal(value=0),
...     Literal(value=1),
...     Literal(value=2),
... ]
>>> scores = [0.1, 0.5, 0.4]
>>> selected_rules, selected_scores = selection.select(rules, scores, 2)
>>> len(selected_rules)
2
>>> all(isinstance(rule, Rule) for rule in selected_rules)
True
Source code in hgp_lib\selections\roulette_selection.py
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
def select(
    self,
    rules: Sequence[Rule],
    scores: np.ndarray | Sequence[float],
    n_select: int,
) -> Tuple[List[Rule], ndarray]:
    """
    Selects `n_select` rules using roulette wheel (fitness-proportionate) selection.

    The probability of selecting each rule is proportional to its fitness score.
    If any scores are negative, all scores are shifted to be non-negative before
    computing probabilities. When all scores are equal, selection is uniform.
    Selection is performed with replacement, so the same rule may appear multiple times.

    Args:
        rules (Sequence[Rule]):
            The collection of candidate rules to select from.
        scores (np.ndarray | Sequence[float]):
            Fitness scores corresponding to each rule. Higher scores indicate better
            fitness. Must have the same length as `rules`.
        n_select (int):
            Number of rules to select.

    Returns:
        Tuple[List[Rule], ndarray]: A tuple containing:
            - List[Rule]: Copies of the selected rules. The same rule may appear multiple times.
            - ndarray: The fitness scores of the selected rules.

    Examples:
        >>> import random
        >>> import numpy as np
        >>> from hgp_lib.selections import RouletteSelection
        >>> from hgp_lib.rules import Literal
        >>> random.seed(42); np.random.seed(42)
        >>> selection = RouletteSelection()
        >>> rules = [
        ...     Literal(value=0),
        ...     Literal(value=1),
        ...     Literal(value=2),
        ... ]
        >>> scores = [0.1, 0.5, 0.4]
        >>> selected_rules, selected_scores = selection.select(rules, scores, 2)
        >>> len(selected_rules)
        2
        >>> all(isinstance(rule, Rule) for rule in selected_rules)
        True
    """
    if len(rules) == 0:
        return [], np.array([])

    scores_array = np.asarray(scores)
    original_scores = scores_array.copy()
    min_score = np.min(scores_array)
    if min_score < 0:
        scores_array = scores_array - min_score

    total = np.sum(scores_array)
    if total == 0 or np.all(scores_array == scores_array[0]):
        probabilities = np.ones(len(scores_array)) / len(scores_array)
    else:
        probabilities = scores_array / total

    selected_indices = np.random.choice(
        len(rules), size=n_select, p=probabilities, replace=True
    )

    selected_rules = [rules[i].copy() for i in selected_indices]
    selected_scores = original_scores[selected_indices]
    return selected_rules, selected_scores