Abstract: Click-through rate (CTR) prediction estimates the probability that a user will click an item, based on features of the user, the item, and the context. CTR prediction has become a common and indispensable task in e-commerce: more accurate predictions allow recommendation systems and search engines to present more relevant and personalized results, increasing the actual CTR of items and bringing greater economic benefits. In recent years, against the background of big data technology, many researchers have applied deep neural networks (DNN) to the CTR prediction problem. However, few existing models can process sequential behavior data and fully exploit the context information of user history both effectively and efficiently. DNN-based CTR prediction models learn user interests from user history, but most of them treat user interest as a whole and ignore the differences between long-term and short-term interests. This paper proposes a CTR prediction model named the Long- and Short-Term Interest Network (LSTIN), which makes full use of the context information and order information in user history records to improve both the accuracy and the training efficiency of CTR prediction. An attention-based Transformer and an activation-unit structure are used to model long-term and short-term user interests; the short-term interest is further processed with a recurrent neural network (RNN) and a convolutional neural network (CNN), and a fully connected neural network is finally applied for prediction. In experiments on a public Amazon dataset, LSTIN is compared with CTR prediction models such as DeepFM and the Deep Interest Network (DIN). The results show that LSTIN models user history records with both context and order information, achieving an area under the receiver operating characteristic curve (AUC) of 85.831%, which is 1.154% higher than that of BaseModel and 0.476% higher than that of DIN. Moreover, by distinguishing long-term from short-term interests, LSTIN improves prediction accuracy while maintaining training efficiency.
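This excerpt does not include implementation details of LSTIN, but the architecture described above (Transformer plus activation-unit attention for long-term interest, an RNN over recent behaviors for short-term interest, and a fully connected network for prediction) can be outlined in a minimal sketch. The PyTorch framework, all layer sizes, the fixed split between long- and short-term history, and the use of a GRU for the RNN branch are assumptions for illustration only; the c/r suffixes in Table 2 presumably denote CNN- and RNN-based variants of the short-term branch.

```python
# Minimal sketch of an LSTIN-style model, assuming PyTorch as the framework.
# All layer sizes, the history split point, and the GRU for the RNN branch are
# illustrative assumptions, not the authors' reported configuration.
import torch
import torch.nn as nn


class LSTINSketch(nn.Module):
    def __init__(self, n_items, n_cates, emb_dim=32, short_len=5):
        super().__init__()
        self.short_len = short_len                   # most recent behaviors form the short-term window
        self.item_emb = nn.Embedding(n_items, emb_dim)
        self.cate_emb = nn.Embedding(n_cates, emb_dim)
        d = 2 * emb_dim                              # item + category embedding per behavior

        # Long-term branch: a Transformer encoder layer injects context/order
        # information; a DIN-style activation unit then weights each encoded
        # behavior by its relevance to the candidate item.
        self.encoder = nn.TransformerEncoderLayer(d_model=d, nhead=4,
                                                  dim_feedforward=4 * d,
                                                  batch_first=True)
        self.act_unit = nn.Sequential(nn.Linear(4 * d, 36), nn.PReLU(),
                                      nn.Linear(36, 1))

        # Short-term branch: an RNN (GRU here) over the most recent behaviors.
        # A CNN variant would replace this with a 1-D convolution plus pooling.
        self.rnn = nn.GRU(d, d, batch_first=True)

        # Fully connected layers for the final CTR prediction.
        self.mlp = nn.Sequential(nn.Linear(3 * d, 80), nn.PReLU(),
                                 nn.Linear(80, 40), nn.PReLU(),
                                 nn.Linear(40, 1))

    def _embed(self, items, cates):
        return torch.cat([self.item_emb(items), self.cate_emb(cates)], dim=-1)

    def forward(self, hist_items, hist_cates, cand_item, cand_cate):
        # hist_*: (batch, seq_len) behavior sequences; cand_*: (batch,) candidate item.
        hist = self._embed(hist_items, hist_cates)               # (B, T, d)
        cand = self._embed(cand_item, cand_cate)                 # (B, d)

        long_part = hist[:, :-self.short_len, :]                 # older behaviors (assumes T > short_len)
        short_part = hist[:, -self.short_len:, :]                # recent behaviors

        # Long-term interest: Transformer encoding + candidate-aware attention pooling.
        enc = self.encoder(long_part)                            # (B, Tl, d)
        cand_exp = cand.unsqueeze(1).expand_as(enc)
        att_in = torch.cat([enc, cand_exp, enc - cand_exp, enc * cand_exp], dim=-1)
        weights = torch.softmax(self.act_unit(att_in), dim=1)    # (B, Tl, 1)
        long_interest = (weights * enc).sum(dim=1)               # (B, d)

        # Short-term interest: last hidden state of the GRU.
        _, h_n = self.rnn(short_part)
        short_interest = h_n.squeeze(0)                          # (B, d)

        logit = self.mlp(torch.cat([long_interest, short_interest, cand], dim=-1))
        return torch.sigmoid(logit).squeeze(-1)                  # predicted click probability
```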
Key words:
- CTR prediction
- long- and short-term interest network
- deep neural network
- attention mechanism
- RNN
- CNN
Table 1. Statistical information of the dataset

Dataset                Number of users   Number of commodities   Number of categories   Number of samples
Amazon (Electronics)   192403            63001                   801                    1689188

Table 2. Algorithm performance comparison

Category               Name        AUC/%    Best AUC/%   RP (BaseModel)/%   RP (DIN)/%
Existing models        BaseModel   84.852   84.946        0                 −0.670
                       DeepFM      85.012   85.095        0.189             −0.482
                       DIN         85.424   85.468        0.674              0
Models of this paper   LIN         85.581   85.669        0.859              0.184
                       LINc        84.803   84.911       −0.058             −0.727
                       LINr        85.796   85.841        1.113              0.435
                       LSTINc      85.781   85.834        1.095              0.418
                       LSTINr      85.831   85.943        1.154              0.476
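The RP columns in Table 2 are consistent with a relative AUC improvement over the named reference model, RP(ref) = (AUC − AUC_ref) / AUC_ref × 100%; this interpretation is inferred from the reported numbers rather than stated in this excerpt. A quick check under that assumption:

```python
# Sketch: verify the RP values in Table 2, assuming RP is the relative AUC
# improvement (in %) over a reference model. The formula is an assumption
# inferred from the reported figures.
def rp(auc: float, auc_ref: float) -> float:
    """Relative AUC improvement (in %) over a reference model."""
    return (auc - auc_ref) / auc_ref * 100.0

print(round(rp(85.831, 84.852), 3))  # 1.154 -> LSTINr vs. BaseModel
print(round(rp(85.831, 85.424), 3))  # 0.476 -> LSTINr vs. DIN
```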