prompting.validators.reward.config

Module Contents

Classes

RewardModelType

Enumeration of the named reward models, filters, and penalties.

DefaultRewardFrameworkConfig

Reward framework default configuration.

class prompting.validators.reward.config.RewardModelType(*args, **kwds)

Bases: enum.Enum

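Enumeration of the named reward models, filters, and penalties. The general description below is the docstring inherited from enum.Enum: an enumeration is a collection of name/value pairs.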

Example enumeration:

>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3

Access them by:

  • attribute access:

    >>> Color.RED
    <Color.RED: 1>
    
  • value lookup:

    >>> Color(1)
    <Color.RED: 1>
    
  • name lookup:

    >>> Color['RED']
    <Color.RED: 1>
    

Enumerations can be iterated over, and know how many members they have:

>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.

dpo = 'dpo_reward_model'
rlhf = 'rlhf_reward_model'
reciprocate = 'reciprocate_reward_model'
dahoas = 'dahoas_reward_model'
diversity = 'diversity_reward_model'
prompt = 'prompt_reward_model'
blacklist = 'blacklist_filter'
nsfw = 'nsfw_filter'
relevance = 'relevance_filter'
relevance_bert = 'relevance_bert'
relevance_mpnet = 'relevance_mpnet'
task_validator = 'task_validator_filter'
keyword_match = 'keyword_match_penalty'
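
The members listed above can be looked up by name or by string value, like any enum.Enum. The following is a minimal sketch that mirrors a few of the members locally so it runs without the prompting package installed. The local class is a stand-in rather than the real import, and grouping members by the "_filter" suffix is an assumption drawn only from the value names shown here.

```python
from enum import Enum


# Local stand-in that copies a few members from the listing above.
# In real code you would import RewardModelType from
# prompting.validators.reward.config instead.
class RewardModelType(Enum):
    dpo = "dpo_reward_model"
    reciprocate = "reciprocate_reward_model"
    blacklist = "blacklist_filter"
    nsfw = "nsfw_filter"
    keyword_match = "keyword_match_penalty"


# Name lookup and value lookup return the same singleton member.
by_name = RewardModelType["dpo"]
by_value = RewardModelType("dpo_reward_model")

# Assumption: members whose value ends in "_filter" act as filters.
filter_members = [m for m in RewardModelType if m.value.endswith("_filter")]

print(by_name is by_value)
print([m.name for m in filter_members])
```

An unknown value such as RewardModelType("missing") raises ValueError, which makes the enum a convenient guard when model names arrive as plain strings.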

class prompting.validators.reward.config.DefaultRewardFrameworkConfig

Default configuration for the reward framework. Note: the model weights must sum to 1.0.

dpo_model_weight: float = 0.6
rlhf_model_weight: float = 0
reciprocate_model_weight: float = 0.4
dahoas_model_weight: float = 0
prompt_model_weight: float = 0
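
The constraint that the weights sum to 1.0 can be checked before a configuration is used. The sketch below is illustrative only: it redefines the class locally as a dataclass (whether the real class is a dataclass is an assumption), and total_weight and weights_are_normalized are hypothetical helpers, not part of the module.

```python
import math
from dataclasses import dataclass, fields


# Local stand-in with the default values listed above.
# Assumption: modelled as a dataclass purely for this illustration.
@dataclass
class DefaultRewardFrameworkConfig:
    dpo_model_weight: float = 0.6
    rlhf_model_weight: float = 0
    reciprocate_model_weight: float = 0.4
    dahoas_model_weight: float = 0
    prompt_model_weight: float = 0


def total_weight(config) -> float:
    """Sum every field whose name ends in '_model_weight' (hypothetical helper)."""
    return sum(
        getattr(config, f.name)
        for f in fields(config)
        if f.name.endswith("_model_weight")
    )


def weights_are_normalized(config, tol: float = 1e-9) -> bool:
    """Return True if the weights sum to 1.0 within a small tolerance."""
    return math.isclose(total_weight(config), 1.0, abs_tol=tol)


defaults = DefaultRewardFrameworkConfig()
print(weights_are_normalized(defaults))
```

Comparing with a tolerance rather than `== 1.0` avoids false failures from floating-point rounding when the weights are changed to values such as 0.1, 0.2, and 0.7.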