DisjunctiveGraphJspEnv¶

class graph_jsp_env.disjunctive_graph_jsp_env.DisjunctiveGraphJspEnv(jps_instance: ndarray[Any, dtype[ScalarType]] = None, *, reward_function='trivial', custom_reward_function: Callable = None, reward_function_parameters: Dict = None, normalize_observation_space: bool = True, flat_observation_space: bool = True, dtype: str = 'float32', action_mode: str = 'task', env_transform: str = None, perform_left_shift_if_possible: bool = True, c_map: str = 'rainbow', dummy_task_color='tab:gray', default_visualisations: List[str] = None, visualizer_kwargs: Dict = None, verbose: int = 0)[source]¶

Bases: Env

Custom environment for the Job Shop Problem (JSP) that follows the gymnasium interface.

This environment is inspired by

The disjunctive graph machine representation of the job shop scheduling problem

by Jacek Błażewicz 2000

https://www.sciencedirect.com/science/article/pii/S0377221799004865

and

Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning

by Zhang, Cong, et al. 2020

https://proceedings.neurips.cc/paper/2020/file/11958dfee29b6709f48a9ba0387a2431-Paper.pdf

https://github.com/zcaicaros/L2D

This environment does not explicitly include disjunctive edges as specified by Jacek Błażewicz, only conjunctive edges. Additional information is saved in the edges and nodes such that one could construct the disjunctive edges, so there is no loss of information. Moreover, this environment does not implement the graph matrix data structure by Jacek Błażewicz, since it provides no benefit in the chosen reinforcement learning setting (for more details, have a look at the master thesis).
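As a rough illustration of why no information is lost, the sketch below derives disjunctive edges from per-task job and machine assignments alone. The task tuples and the function name `disjunctive_edges` are hypothetical and not part of this package; how the environment actually stores this information on its networkx nodes and edges is not shown here.

```python
from itertools import combinations

# Hypothetical task records: (task_id, job_id, machine_id).
# These names are illustrative; the environment's real node attributes may differ.
tasks = [
    (1, 0, 0), (2, 0, 1),  # job 0: task 1 on machine 0, then task 2 on machine 1
    (3, 1, 1), (4, 1, 0),  # job 1: task 3 on machine 1, then task 4 on machine 0
]


def disjunctive_edges(tasks):
    """Return unordered pairs of tasks from different jobs that share a machine."""
    by_machine = {}
    for task_id, job_id, machine_id in tasks:
        by_machine.setdefault(machine_id, []).append((task_id, job_id))
    edges = set()
    for members in by_machine.values():
        for (t1, j1), (t2, j2) in combinations(members, 2):
            if j1 != j2:  # tasks of the same job are linked by conjunctive edges
                edges.add((min(t1, t2), max(t1, t2)))
    return edges


edges = disjunctive_edges(tasks)
print(sorted(edges))  # [(1, 4), (2, 3)]
```

Each resulting pair marks two tasks that compete for the same machine and therefore cannot be processed at the same time.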

This environment is more similar to the implementation of Zhang, Cong, et al., who seem to store time information exclusively inside the nodes (see Figure 2: Example of state transition) and no additional information inside the edges (like the weights in the representation of Jacek Błażewicz). However, I had a hard time understanding the code of Zhang, Cong, et al. 2020, so I might be wrong about that.

The DisjunctiveGraphJspEnv uses the networkx library for the graph structure and graph visualization. It is highly configurable and offers many rendering options.

get_action_history() → List[int][source]¶

Returns the action history of the current episode.

Returns:

list of actions

get_makespan() → float[source]¶

Returns the makespan in the terminal state.
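For readers unfamiliar with the term, the makespan is the completion time of the last task in a schedule. The sketch below computes it for a hand-written schedule; the `schedule` dictionary and the `makespan` helper are hypothetical and unrelated to the environment's internal state.

```python
# Hypothetical finished schedule: task_id -> (start_time, duration).
schedule = {
    1: (0, 11),
    2: (11, 3),
    3: (0, 5),
    4: (11, 16),
}


def makespan(schedule):
    """Return the latest finish time over all scheduled tasks."""
    return max(start + duration for start, duration in schedule.values())


print(makespan(schedule))  # 27
```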

get_reward(state: ndarray[Any, dtype[ScalarType]], done: bool, info: Dict, makespan_this_step: float)[source]¶

get_state() → ndarray[Any, dtype[ScalarType]][source]¶

Returns the state of the environment as a numpy array.

Returns:

the state of the environment as a numpy array.

greedy_machine_utilization_rollout() -> (<class 'int'>, dict[str, list])[source]¶

is_terminal() → bool[source]¶

Checks whether the current state is terminal.

Returns:

bool flag: False means not terminal, True means terminal.

load_instance(jsp_instance: ndarray[Any, dtype[ScalarType]], *, reward_function_parameters: Dict = None) → None[source]¶

This loads a jsp instance, sets up the corresponding graph and sets the attributes accordingly.

Parameters:

jsp_instance – a jsp instance as a numpy array
reward_function_parameters – if specified, the reward function parameters will be updated.

Returns:

None

metadata: dict[str, Any] = {'render.modes': ['human', 'rgb_array', 'console']}¶

network_as_dataframe() → DataFrame[source]¶

Returns the current state of the environment in a format that is supported by Plotly Gantt charts (https://plotly.com/python/gantt/).

Returns:

the current state as a pandas DataFrame

random_rollout() -> (<class 'float'>, dict[str, list])[source]¶

Performs a random rollout from the current state until a terminal state is reached.

render(mode='human', show: List[str] = None, **render_kwargs) → None | ndarray[Any, dtype[ScalarType]][source]¶

Renders the environment.

Parameters:

mode –
valid options: "human", "rgb_array", "console"

"human" (default)

renders the visualisations specified in show. If show is None, DisjunctiveGraphJspEnv.default_visualisations will be used.

"rgb_array"

returns rgb-arrays of the 'window' visualisations specified in DisjunctiveGraphJspEnv.default_visualisations

"console"

prints the 'console' visualisations specified in DisjunctiveGraphJspEnv.default_visualisations to the console

show – subset of the available visualisations ["gantt_window", "gantt_console", "graph_window", "graph_console"] as a list of strings.
render_kwargs – additional keyword arguments for the jss_graph_env.DisjunctiveGraphJspVisualizer.render_rgb_array method.

Returns:

numpy array if mode="rgb_array" else None

reset(**kwargs) → tuple[numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]], typing.Dict[str, typing.Any]][source]¶

Resets the environment and returns the initial state.

Parameters:

**kwargs – additional keyword arguments for the generic gymnasium reset method.

step(action: int) → tuple[numpy.ndarray[typing.Any, numpy.dtype[+ScalarType]], typing.SupportsFloat, bool, bool, typing.Dict[str, typing.Any]][source]¶

Performs an action on the environment. Invalid actions have no effect.

Parameters:

action – an action

Returns:

state, reward, terminated flag, truncated flag, info dict

valid_action_list() → list[int][source]¶

Returns a list of valid actions that can be taken in the current state of the environment.

valid_action_mask(action_mode: str = None) → List[bool][source]¶

Returns a list of booleans that indicates which actions in the action space are valid (or will have an effect on the environment) and which are not.

Parameters:

action_mode – specifies whether the action argument of the DisjunctiveGraphJspEnv.step method corresponds to a job or a task (or node in the graph representation)

Returns:

list of booleans in the same shape as the action space.
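A common use of such a mask is to restrict an agent's choice to valid actions. The sketch below is pure Python and does not call the environment; `scores` stands in for whatever per-action values a policy produces, and `masked_argmax` is a hypothetical helper, not part of this package.

```python
from typing import List, Optional, Sequence


def masked_argmax(scores: Sequence[float], mask: Sequence[bool]) -> Optional[int]:
    """Return the index of the highest score among valid actions, or None."""
    if len(scores) != len(mask):
        raise ValueError("scores and mask must have the same length")
    valid = [i for i, ok in enumerate(mask) if ok]
    if not valid:
        return None  # no valid action left
    return max(valid, key=lambda i: scores[i])


# Example mask with one flag per action, as valid_action_mask() is documented to return.
mask: List[bool] = [True, False, True, False]
scores = [0.1, 0.9, 0.4, 0.7]
print(masked_argmax(scores, mask))  # 2
```

Action 1 has the highest score but is masked out, so the best valid action (index 2) is chosen instead.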

valid_actions() → Set[int][source]¶

Returns the set of valid actions that can be taken in the current state of the environment. The set contains the values one can pass to the step-function. The values depend on the action_mode and the current state of the environment. The set is empty if there are no valid actions.

Returns:

set of valid actions
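The following sketch shows how valid_actions, step and is_terminal might be combined into a simple interaction loop. To keep it runnable without the package installed, it uses `ToyEnv`, a hypothetical stand-in that only mimics the documented method names and the five-element return of step; it is not the real environment and does not model any scheduling logic.

```python
import random


class ToyEnv:
    """Hypothetical stand-in exposing the documented method names.

    It is NOT the real environment: each action can simply be taken once,
    and the episode ends when no action is left.
    """

    def __init__(self, n_actions=4):
        self.remaining = set(range(n_actions))

    def valid_actions(self):
        return set(self.remaining)

    def is_terminal(self):
        return not self.remaining

    def step(self, action):
        # Mirrors the documented behaviour: invalid actions have no effect.
        self.remaining.discard(action)
        terminated = self.is_terminal()
        # state, reward, terminated, truncated, info
        return None, 0.0, terminated, False, {}


def rollout(env, rng):
    """Pick random valid actions until the episode ends; return the actions."""
    actions = []
    while not env.is_terminal():
        action = rng.choice(sorted(env.valid_actions()))
        _, _, terminated, truncated, _ = env.step(action)
        actions.append(action)
        if terminated or truncated:
            break
    return actions


history = rollout(ToyEnv(), random.Random(0))
print(history)
```

With the real environment, the same loop should in principle work after calling reset(), with a DisjunctiveGraphJspEnv instance passed in place of `ToyEnv`; the built-in random_rollout method covers the same use case.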