Hybrid CNN-LSTM-GNN Model Advances A-Share Stock Prediction: Insights from Dong and Liang

Exploring the Intersection of Deep Learning and Emerging Market Finance



Advancing Stock Market Forecasting in China's Dynamic A-Shares

The rapid evolution of artificial intelligence has transformed how researchers and investors approach the notoriously complex task of predicting stock movements. A notable contribution in this field comes from a 2025 study published in the journal Entropy, where Junhao Dong and Shi Liang from the University of Hong Kong introduce a hybrid model combining convolutional neural networks (CNN), long short-term memory (LSTM) networks, and graph neural networks (GNN). This CLGNN framework targets China's A-share market, offering a fresh perspective on multivariate time-series analysis for better stock selection strategies.

A-shares are shares of mainland Chinese companies listed on the Shanghai and Shenzhen exchanges, denominated in renminbi and primarily accessible to domestic investors. These markets exhibit high volatility driven by retail trading, policy shifts, and economic indicators unique to an emerging economy. Traditional models often struggle here because of the non-linear relationships and inter-stock dependencies that characterize the market.

Understanding the Hybrid Architecture: CNN, LSTM, and GNN Explained

The CLGNN model integrates three powerful deep learning components to address different aspects of stock data. CNNs excel at extracting local patterns, such as short-term fluctuations in trading volume or price movements within a defined window. LSTM networks capture long-range temporal dependencies, making them ideal for sequential data like historical price series where trends unfold over extended periods.
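To make the idea of local pattern extraction concrete, here is a minimal sketch of the sliding-window operation at the heart of a one-dimensional convolution. It is an illustration only and not the authors' implementation: the price series, the function name conv1d, and the hand-picked kernel are all assumptions, and in a trained CNN the kernel weights would be learned from data.

```python
def conv1d(series, kernel):
    """Slide `kernel` over `series` with no padding and no kernel flip."""
    k = len(kernel)
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]

# Hypothetical closing prices for one stock over seven trading days.
prices = [10.0, 10.2, 10.1, 10.6, 11.0, 10.8, 10.9]

# Hand-picked difference kernel (an assumption for illustration):
# each output is the price change across a three-day window.
momentum_kernel = [-1.0, 0.0, 1.0]

local_features = conv1d(prices, momentum_kernel)
print(local_features)
```

Each output value depends only on a short window of neighboring days, which is why convolutional layers are suited to short-term patterns, while a recurrent component such as an LSTM is needed to carry information across longer horizons.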

Graph neural networks add a critical relational dimension. By modeling stocks as nodes in a graph and their interactions as edges, GNNs uncover hidden correlations between different equities that simpler models overlook. This multivariate approach allows the system to process not just individual stock data but the broader network of influences across the market.
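The node-and-edge idea can be sketched in a few lines. The example below is a simplified assumption rather than the paper's graph construction: it links two stocks when the Pearson correlation of their returns exceeds a threshold, then performs one mean-aggregation step in which each stock's feature is averaged with those of its neighbors. The stock codes, return values, threshold, and function names are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical daily returns for three placeholder stocks.
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00, 0.02],
    "BBB": [0.02, -0.01, 0.02, 0.01, 0.03],
    "CCC": [-0.01, 0.02, -0.03, 0.01, -0.02],
}

def build_graph(returns, threshold=0.8):
    """Add an undirected edge when return correlation exceeds `threshold`."""
    codes = list(returns)
    neighbors = {c: [] for c in codes}
    for i, a in enumerate(codes):
        for b in codes[i + 1:]:
            if pearson(returns[a], returns[b]) > threshold:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return neighbors

def propagate(features, neighbors):
    """One mean-aggregation step over each node and its neighbors."""
    out = {}
    for node, value in features.items():
        group = [value] + [features[n] for n in neighbors[node]]
        out[node] = sum(group) / len(group)
    return out

graph = build_graph(returns)
latest = {code: series[-1] for code, series in returns.items()}
smoothed = propagate(latest, graph)
print(graph)
print(smoothed)
```

In this toy data only AAA and BBB move together closely enough to be linked, so their features are blended while CCC keeps its own value. A real graph layer would apply learned weights and a non-linearity on top of this aggregation.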

The authors combine these with a novel feature selection technique called Pearson and IG weighted selection, where IG stands for information gain. This method evaluates 15 common metrics and identifies the five most predictive: daily return, turnover rate, relative strength index (RSI), trading volume, and forward-adjusted closing price. The hybrid filter balances statistical correlation with information gain so that the inputs are both relevant and non-redundant.
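One way such a combined filter could work is sketched below. This is an assumption-laden illustration, not the authors' formula: the exact weighting scheme is not described here, so the blend weight alpha, the two-bin discretization, the use of next-day return direction as the label, and all data values are hypothetical.

```python
from collections import Counter
from math import log2, sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def entropy(labels):
    """Shannon entropy in bits of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels, bins=2):
    """Entropy reduction in `labels` after equal-width binning of `feature`."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    groups = {}
    for value, label in zip(feature, labels):
        idx = min(int((value - lo) / width), bins - 1)
        groups.setdefault(idx, []).append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def weighted_score(feature, target, alpha=0.5):
    """Blend |Pearson r| with IG; alpha is an assumed weight."""
    labels = [1 if t > 0 else 0 for t in target]
    ig = information_gain(feature, labels)
    return alpha * abs(pearson(feature, target)) + (1 - alpha) * ig

# Hypothetical next-day returns and two candidate features.
next_day_return = [0.02, -0.01, 0.03, -0.02, 0.01, -0.03]
features = {
    "turnover_rate": [3.0, 1.0, 3.5, 0.8, 2.5, 0.5],
    "noise_feature": [1.0, 2.0, 1.0, 2.0, 2.0, 1.0],
}

scores = {name: weighted_score(vals, next_day_return) for name, vals in features.items()}
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Because the label is binary, information gain stays between 0 and 1 bit and can be blended with the absolute correlation without rescaling. Keeping the top five scores from a pool of 15 candidates would mirror the selection step described above, although a full treatment of redundancy would also compare features against each other.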

Methodology and Data: Rigorous Testing on CSI All Share Index

The study uses real daily trading data from the CSI All Share Index, which covers a comprehensive set of A-shares. The researchers processed the latest available records from established platforms so that the dataset reflects recent market conditions. The model does not simply predict prices or trends in isolation; it classifies stocks based on predicted returns and directly outputs both return estimates and stock identifiers to support practical selection decisions.
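The selection step that such output enables can be sketched as a simple ranking. The stock codes and predicted returns below are placeholders invented for illustration, and select_top_k is a hypothetical helper, not part of the published model.

```python
def select_top_k(predictions, k):
    """Return the k (stock code, predicted return) pairs with the highest predictions."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Hypothetical model output: placeholder stock code -> predicted return.
predicted = {
    "STOCK_A": 0.012,
    "STOCK_B": -0.004,
    "STOCK_C": 0.021,
    "STOCK_D": 0.008,
}

picks = select_top_k(predicted, k=2)
print(picks)
```

Pairing each estimate with its identifier means the ranked list can be passed directly to a portfolio construction step without a separate lookup.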

Training involved careful partitioning to avoid look-ahead bias, with performance compared against benchmarks including temporal convolutional networks (TCN) and Transformer models. The hybrid design mitigates individual weaknesses: CNN handles feature extraction efficiently, LSTM manages sequential memory, and GNN incorporates inter-stock dynamics through graph convolutional and temporal convolutional layers.
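Avoiding look-ahead bias usually comes down to splitting by time rather than at random, so that nothing in the training set is dated after the evaluation data. The sketch below shows one common way to do this. The cutoff dates, row layout, and function name are assumptions, and the paper's actual partition may differ.

```python
from datetime import date, timedelta

def split_by_date(rows, train_end, val_end):
    """Train: date <= train_end; validation: up to val_end; test: after val_end."""
    train = [r for r in rows if r["date"] <= train_end]
    val = [r for r in rows if train_end < r["date"] <= val_end]
    test = [r for r in rows if r["date"] > val_end]
    return train, val, test

# Hypothetical daily records for ten consecutive calendar days.
start = date(2024, 1, 1)
rows = [{"date": start + timedelta(days=i), "ret": 0.001 * i} for i in range(10)]

train, val, test = split_by_date(rows, date(2024, 1, 6), date(2024, 1, 8))
print(len(train), len(val), len(test))
```

Any scaling statistics or feature selection scores should likewise be computed on the training window only and then applied unchanged to the later windows.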

Key Results: Superior Performance in Return Generation

Experiments demonstrated that the CLGNN model consistently outperformed alternatives when using the selected five features. It achieved higher cumulative returns compared to standalone models and other hybrids, highlighting the value of integrating relational graph learning. Feature importance analysis confirmed the selected metrics provided robust signals across varying market regimes.
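Cumulative return, the headline metric here, compounds per-period returns rather than adding them. The short sketch below shows the calculation on made-up numbers. The return series are hypothetical and do not reproduce any figure from the paper.

```python
def cumulative_return(period_returns):
    """Compound simple per-period returns into one cumulative return."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0

# Hypothetical per-period returns for a strategy and a benchmark.
strategy = [0.02, -0.01, 0.03]
benchmark = [0.01, 0.00, 0.01]

strategy_total = cumulative_return(strategy)
benchmark_total = cumulative_return(benchmark)
print(strategy_total, benchmark_total)
```

Comparing two models on this metric therefore rewards consistent gains across periods, since a single large loss reduces the base on which all later returns compound.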

The results underscore the potential for hybrid architectures in markets like China's A-shares, where both temporal patterns and cross-sectional relationships play significant roles. By directly supporting stock selection rather than single-index forecasting, the model aligns closely with investor needs for actionable insights.


Broader Implications for Finance and Emerging Markets

This research contributes to the growing body of work on deep learning in quantitative finance. It addresses gaps in prior studies, such as insufficient attention to feature engineering transparency and limited focus on developing markets. For China's A-shares specifically, the findings suggest tailored models can navigate unique characteristics like high retail participation and policy sensitivity more effectively than those optimized for mature Western markets.

Academics and practitioners alike may find value in adapting similar hybrid approaches to other emerging exchanges. The emphasis on multivariate GNN elements opens doors for incorporating additional relational data, such as supply chain links or sector correlations.

Challenges and Limitations in AI-Driven Stock Prediction

Despite promising outcomes, stock prediction remains inherently uncertain due to unpredictable events, market sentiment swings, and regulatory changes. The model, like others, operates under assumptions about data stationarity and does not account for transaction costs or liquidity constraints in live trading. Overfitting to historical patterns poses a perpetual risk, necessitating ongoing validation.
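The effect of ignoring transaction costs can be illustrated with a rough adjustment. The sketch below assumes a proportional cost charged on the fraction of the portfolio traded each period. The cost rate, turnover figures, and returns are hypothetical, and real costs also include slippage and market impact that this does not model.

```python
def net_cumulative_return(gross_returns, turnover, cost_rate):
    """Compound returns after deducting cost_rate times turnover each period."""
    growth = 1.0
    for r, t in zip(gross_returns, turnover):
        growth *= (1.0 + r) * (1.0 - cost_rate * t)
    return growth - 1.0

# Hypothetical gross returns and per-period portfolio turnover (1.0 = fully traded).
gross = [0.02, -0.01, 0.03]
turnover = [1.0, 0.5, 0.5]

before_costs = net_cumulative_return(gross, turnover, cost_rate=0.0)
after_costs = net_cumulative_return(gross, turnover, cost_rate=0.001)
print(before_costs, after_costs)
```

A strategy that rebalances heavily every period can lose a meaningful share of its backtested edge this way, which is why live deployment needs cost-aware evaluation.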

Ethical considerations also arise around the use of such tools, including potential impacts on market stability if widely adopted by algorithmic traders.

Future Directions: Refinements and Expansions

Looking ahead, researchers could enhance the CLGNN framework by incorporating sentiment analysis from news or social media, or by experimenting with attention mechanisms for dynamic graph updates. Extensions to other asset classes or international markets represent natural next steps. Integration with reinforcement learning for portfolio optimization could further bridge the gap between prediction and practical application.

As computational resources improve, scaling the model to higher-frequency data or larger universes of stocks becomes feasible, potentially unlocking even greater precision.

Impact on Academic Research and Industry Practice

Publications like this one in Entropy illustrate the vibrant intersection of engineering, computer science, and finance. They encourage interdisciplinary collaboration and provide open-access resources for the community. Industry professionals may leverage the insights to refine algorithmic trading systems, while educators can use the case to teach advanced neural network applications.

The work reinforces that thoughtful feature engineering combined with hybrid architectures can yield meaningful advantages in challenging prediction tasks.


Stakeholder Perspectives: Researchers, Investors, and Regulators

From the researchers' viewpoint, the study validates the CLGNN approach through empirical rigor and contributes novel feature selection methodology. Investors gain a tool for systematic stock screening, though real-world deployment requires backtesting and risk management overlays. Regulators may monitor the proliferation of such models for systemic implications in increasingly digitized markets.

Overall, the paper fosters constructive dialogue on balancing innovation with market integrity.

Conclusion: A Step Forward in Intelligent Forecasting

The hybrid CNN-LSTM-GNN model proposed by Junhao Dong and Shi Liang marks a meaningful advance in A-share stock prediction. By thoughtfully combining local feature extraction, sequential modeling, and relational graph analysis, it delivers improved performance on comprehensive Chinese market data. As deep learning continues to mature, contributions like this pave the way for more sophisticated, context-aware financial tools that serve both academic inquiry and practical decision-making.

Readers interested in exploring the full study can find it in Entropy 2025, 27(8), 881, available through the MDPI platform.


Frequently Asked Questions

🧠What is the CLGNN model proposed in the study?

The CLGNN model integrates convolutional neural networks (CNN) for local pattern detection, long short-term memory (LSTM) networks for capturing sequential dependencies, and graph neural networks (GNN) for modeling inter-stock relationships. It processes multivariate time-series data from A-shares to predict returns and support stock selection.

📈Why focus on China's A-share market?

A-shares are characterized by high volatility, significant retail investor influence, and unique policy dynamics. The model addresses gaps in research tailored to developing markets, where standard approaches optimized for mature economies often fall short.

🔍How does the feature selection method work?

The Pearson and IG weighted selection evaluates multiple metrics using correlation and information gain. It identifies the optimal five inputs: daily return, turnover rate, RSI, volume, and forward-adjusted closing price, balancing relevance and reducing redundancy.

✅What were the main experimental findings?

On the CSI All Share Index data, CLGNN outperformed benchmarks such as TCN and Transformer models by generating higher cumulative returns. The hybrid design combined local, temporal, and relational signals to improve stock classification and return prediction.

💼What are the practical applications for investors?

The model outputs predicted returns alongside stock codes, enabling direct use in selection strategies. It offers a data-driven approach to navigating the complexities of A-shares beyond simple price forecasting.

⚠️Are there limitations to the CLGNN approach?

Like all predictive models, it faces challenges from market unpredictability, potential overfitting, and the exclusion of real-world factors such as transaction costs. Continuous validation and adaptation remain essential.

📚How does this research contribute to academic fields?

It advances interdisciplinary work at the intersection of deep learning, graph theory, and financial econometrics. The open-access publication encourages further experimentation and collaboration across institutions.

🚀What future enhancements are suggested?

Potential extensions include sentiment integration from news sources, attention-based graph updates, reinforcement learning for portfolio management, and application to higher-frequency data or other emerging markets.

🔗How does GNN improve upon traditional models?

Graph neural networks explicitly model relationships between stocks as a network, revealing dependencies that isolated time-series models miss. This relational perspective enhances accuracy in interconnected markets like A-shares.

🔗Where can readers access the full paper?

The study appears in Entropy 2025, 27(8), 881, and is available via the MDPI platform for those seeking detailed methodology, results, and code-related insights.