DTDN Algorithm Integration for Manufacturing Scheduling


By Dr Silpaja Chandrasekar, PhD
Reviewed by Susha Cheriyedath, M.Sc.
May 1 2024

In a recently published paper in the journal Scientific Reports, researchers tackled the flexible double shop scheduling problem (FDSSP) by proposing a solution that integrated a reinforcement learning (RL) algorithm with a deep temporal difference network (DTDN) to minimize makespan.

*Study: DTDN Algorithm Integration for Manufacturing Scheduling. Image Credit: Panchenko Vladimir/Shutterstock*

They formulated FDSSP as a mathematical model, translated it into a Markov decision process (MDP), and utilized a deep neural network (DNN) to approximate the state value function. Extensive comparative experiments demonstrated the effectiveness of their approach in addressing real-world production challenges in the manufacturing industry.

Related Work

Past work has explored intelligent scheduling in manufacturing systems, including the FDSSP and the application of deep reinforcement learning (DRL) to it. FDSSP combines job shop scheduling and assembly shop scheduling, and it has received limited attention in previous research. Existing approaches include exact methods (EM), such as mathematical programming and branch and bound, and approximation methods (AM), such as genetic algorithms and ant colony optimization. However, these methods are limited in their ability to use real-time data.

On the other hand, DRL offers real-time flexibility but has been explored comparatively little in scheduling. Recent studies have applied DRL to various scheduling problems, improving efficiency and adaptability.

Integrated RL Methodology

The proposed method for addressing the FDSSP integrates RL, specifically DTDN, with an MDP formulation. RL is a branch of machine learning (ML) in which an agent learns to make decisions by receiving rewards for appropriate actions within an environment, without requiring extensive knowledge of states and state transition probabilities. MDPs model sequential decision problems, where states, actions, state transition probabilities, rewards, and a discount factor define the environment. The objective of RL is to enable the agent to find an optimal policy that maximizes the expected cumulative reward, learned through continuous interaction with the environment.
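The cumulative reward that the agent maximizes is usually written as a discounted sum of per-step rewards. The following is a minimal sketch of that quantity, not code from the paper; the function name `discounted_return` and the default `gamma` are assumptions chosen for illustration.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each weighted by gamma raised to its time step.

    `rewards` is the sequence r_0, r_1, ... collected along one episode.
    `gamma` is the MDP discount factor (hypothetical default).
    """
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


if __name__ == "__main__":
    # Three unit rewards with gamma = 0.5 give 1 + 0.5 + 0.25.
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

A discount factor close to 1 makes late rewards count almost as much as early ones, which suits objectives such as makespan that are only fully known at the end of a schedule.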

The temporal difference (TD) algorithm blends Monte Carlo and dynamic programming methods and updates the value function until convergence. Deep learning models, particularly DNNs, play a pivotal role in RL, offering a vast representation space for learning complex features with a reduced neuron count. The activation function introduces non-linearity to the neural network, which is crucial for learning and adapting to diverse patterns. Among common activation functions such as ReLU, leaky ReLU, and softplus, ReLU was selected for implementation in this paper.
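To make the TD idea concrete, here is a sketch of the textbook tabular TD(0) update, in which the value of a state is nudged toward the observed reward plus the discounted value of the next state. This is the generic rule, not the paper's network-based variant; the names `td0_update`, `alpha`, and `gamma` and the dictionary-based value table are assumptions.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one tabular TD(0) update in place and return the TD error.

    `values` maps states to value estimates; unseen states default to 0.
    `alpha` is the learning rate and `gamma` the discount factor
    (both hypothetical defaults).
    """
    td_target = reward + gamma * values.get(next_state, 0.0)
    td_error = td_target - values.get(state, 0.0)
    values[state] = values.get(state, 0.0) + alpha * td_error
    return td_error


if __name__ == "__main__":
    v = {"s0": 0.0, "s1": 1.0}
    td0_update(v, "s0", reward=1.0, next_state="s1", alpha=0.5, gamma=0.9)
    print(v["s0"])
```

Like Monte Carlo methods, the update learns from sampled transitions; like dynamic programming, it bootstraps from the current estimate of the next state instead of waiting for the episode to end.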

Optimization algorithms, such as stochastic gradient descent and Adam, expedite training and mitigate hyperparameter sensitivity in neural network training. The proposed DTDN model comprises seven layers: an input layer, five hidden layers, and an output layer. Exploration and exploitation strategies balance the agent's interaction with the environment, ensuring a trade-off between trying new actions (exploration) and exploiting known actions (exploitation) to optimize decision-making.
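The seven-layer layout (input, five hidden layers, output) can be sketched as a small fully connected network with ReLU activations that maps state features to a single state value. This is an illustrative forward pass in plain Python, not the authors' TensorFlow model; the hidden width of 32, the weight range, and all function names are assumptions, and training is omitted.

```python
import random


def relu(vector):
    """Element-wise ReLU: negative entries become zero."""
    return [max(0.0, v) for v in vector]


def init_layer(n_in, n_out, rng):
    """Random small weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(vector, layer):
    """Affine map: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def build_value_network(n_features, hidden_width=32, n_hidden=5, seed=0):
    """Input layer, n_hidden hidden layers, and a single-unit output layer."""
    rng = random.Random(seed)
    sizes = [n_features] + [hidden_width] * n_hidden + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def state_value(network, features):
    """Forward pass: ReLU after each hidden layer, linear output."""
    x = list(features)
    for layer in network[:-1]:
        x = relu(dense(x, layer))
    return dense(x, network[-1])[0]


if __name__ == "__main__":
    net = build_value_network(n_features=4)
    print(state_value(net, [0.1, 0.2, 0.3, 0.4]))
```

Seven layers of units correspond to six weight matrices, which is why `build_value_network` returns a list of six layers for the default settings.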

The DTDN model extends RL to multidimensional action spaces encountered in FDSSP. It employs a greedy strategy for behavior selection and computes rewards based on the time intervals between states. DTDN overcomes the overestimation issue encountered with Q-learning (QL) in high-dimensional action spaces by indirectly calculating action values from state values. This methodology is applied to FDSSP, which aims to optimize manufacturing system scheduling decisions.
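One way to read "indirectly calculating action values from state values" is a one-step lookahead: each candidate action is scored by its immediate reward plus the discounted value of the state it leads to, and the best-scoring action is chosen, with occasional random exploration. The sketch below assumes a deterministic `step` function supplied by the caller and an epsilon-greedy rule; both are illustrative assumptions and not details confirmed by the article.

```python
import random


def select_action(state, actions, step, value_fn,
                  gamma=0.95, epsilon=0.1, rng=random):
    """Pick an action using state values only (no stored Q-table).

    `step(state, action)` is a hypothetical simulator returning
    (next_state, reward); `value_fn(state)` estimates the state value.
    With probability `epsilon` a random action is explored instead.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)

    def lookahead(action):
        next_state, reward = step(state, action)
        return reward + gamma * value_fn(next_state)

    return max(actions, key=lookahead)


if __name__ == "__main__":
    # Toy problem: states are integers, the goal is 10, and each step
    # costs one time unit (reward = -elapsed time, an assumption).
    def toy_step(s, a):
        return s + a, -1.0

    def toy_value(s):
        return -abs(10 - s)

    print(select_action(0, [1, 2, 3], toy_step, toy_value, epsilon=0.0))
```

In a scheduling setting the actions would typically be dispatching rules and the reward would depend on the time elapsed between consecutive decision points, in line with the description above.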

Experiment Overview: FDSSP Evaluation

The experimental study comprises four parts that assess the proposed algorithm's efficacy on the FDSSP. First, performance is evaluated on standard test cases, including the small-scale instances outlined by Kacem, and compared against other algorithms. Second, the DTDN algorithm is compared with existing dispatching rules and with the QL and deep deterministic policy gradient (DDPG) algorithms.

Additionally, large-scale instances of FDSSP are examined using a set of 30 problems of varying complexity, and the DTDN algorithm's performance is compared against traditional optimization methods. Lastly, a case study involving hydraulic cylinder production scheduling is presented, comparing the DTDN algorithm against particle swarm optimization (PSO), constraint programming (CP), and distributed ant colony system (DACS) algorithms proposed by previous studies. The algorithm is developed within the OpenAI Gym framework, leveraging DNN models implemented with TensorFlow. Experimental data and benchmarks are provided to facilitate evaluation.

Parameter selection is crucial for solution quality, with principles such as setting the discount factor close to 1 and utilizing an exploration strategy. The DTDN model architecture comprises seven layers, including input, hidden, and output layers, with parameters initialized randomly. Solution quality is measured against optimal results using relative percentage deviation (RPD) and average relative percentage deviation (ARPD), and computational efficiency is assessed as well.
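The article does not spell out the formulas for RPD and ARPD. Under the commonly used definition, which is assumed here, RPD is the percentage by which an obtained makespan exceeds the best known (or optimal) makespan, and ARPD is the mean RPD over a set of instances. The function names below are hypothetical.

```python
def rpd(makespan, best_known):
    """Relative percentage deviation from the best known makespan."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (makespan - best_known) / best_known


def arpd(makespans, best_knowns):
    """Average RPD over paired lists of obtained and best known makespans."""
    if len(makespans) != len(best_knowns) or not makespans:
        raise ValueError("inputs must be non-empty and of equal length")
    deviations = [rpd(m, b) for m, b in zip(makespans, best_knowns)]
    return sum(deviations) / len(deviations)


if __name__ == "__main__":
    print(rpd(110, 100))                 # 10% above the best known value
    print(arpd([110, 105], [100, 100]))  # mean of 10% and 5%
```

An RPD of 0 means the algorithm matched the reference solution, so lower values indicate better schedules.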

The algorithm's performance is evaluated across small-scale FDSSP instances, demonstrating its ability to generate optimal production schedules and outperform existing algorithms in terms of both solution quality and CPU time efficiency.

Further comparisons are made with dispatching rules, showcasing the DTDN algorithm's superior performance in obtaining better solutions across various data cases. Analysis of action space utilization highlights the effectiveness of certain heuristic behaviors and suggests potential enhancements to streamline actions.

Large-scale instances of FDSSP are investigated, confirming the DTDN algorithm's ability to yield better computational results than traditional optimization methods across different problem sizes. Finally, a case study involving hydraulic cylinder production scheduling demonstrates the algorithm's validity, with results falling within acceptable bounds and showcasing competitive performance against alternative algorithms.

Conclusion

To sum up, the team introduced the DTDN method for FDSSP, leveraging deep temporal difference RL to minimize makespan in flexible shop production environments. In the experiments, it outperformed heuristic and population intelligence algorithms, offering advantages such as real-time learning and flexibility in model adjustment. However, limitations remain, including the need for further refinement of the scheduling models and algorithm procedures and for validation in broader application domains. Future work could enhance state representation, optimize algorithm procedures, and extend the method's applicability to more complex scheduling problems.

Journal reference:

Wang, X., et al. (2024). A Novel Method-Based Reinforcement Learning with Deep Temporal Difference Network for Flexible Double Shop Scheduling Problem. Scientific Reports, 14:1, 9047. https://doi.org/10.1038/s41598-024-59414-8, https://www.nature.com/articles/s41598-024-59414-8



Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.


Citations

Please use one of the following formats to cite this article in your essay, paper or report:

APA
Chandrasekar, Silpaja. (2024, May 01). DTDN Algorithm Integration for Manufacturing Scheduling. AZoAi. Retrieved on July 09, 2025 from https://www.azoai.com/news/20240501/DTDN-Algorithm-Integration-for-Manufacturing-Scheduling.aspx.
MLA
Chandrasekar, Silpaja. "DTDN Algorithm Integration for Manufacturing Scheduling". AZoAi. 09 July 2025. <https://www.azoai.com/news/20240501/DTDN-Algorithm-Integration-for-Manufacturing-Scheduling.aspx>.
Chicago
Chandrasekar, Silpaja. "DTDN Algorithm Integration for Manufacturing Scheduling". AZoAi. https://www.azoai.com/news/20240501/DTDN-Algorithm-Integration-for-Manufacturing-Scheduling.aspx. (accessed July 09, 2025).
Harvard
Chandrasekar, Silpaja. 2024. DTDN Algorithm Integration for Manufacturing Scheduling. AZoAi, viewed 09 July 2025, https://www.azoai.com/news/20240501/DTDN-Algorithm-Integration-for-Manufacturing-Scheduling.aspx.