Intelligence and Security Informatics Data Sets

Phishing and Other Fake Websites

The following papers address phishing, fake websites, and related topics. See the website at for PDF links to these and other papers:

  • Zahedi, F. M., Abbasi, A., and Chen, Y. “Fake-Website Detection Tools: Identifying Design Elements that Promote Individuals’ Use and Enhance their Performance,” Journal of the Association for Information Systems, forthcoming.
  • Abbasi, A., Zahedi, F. M., Zeng, D., Chen, Y., Chen, H., and Nunamaker Jr., J. F. “Enhancing Predictive Analytics for Anti-Phishing by Exploiting Website Genre Information,” Journal of Management Information Systems, forthcoming.
  • Abbasi, A., Albrecht, C. C., Vance, A., and Hansen, J. V. “MetaFraud: A Meta-learning Framework for Detecting Financial Fraud,” MIS Quarterly, 36(4), 2012, pp. 1293-1327.
  • Abbasi, A., Zahedi, F. M., and Kaza, S. “Detecting Fake Medical Websites using Recursive Trust Labeling,” ACM Transactions on Information Systems, 30(4), 2012, no. 22.
  • Abbasi, A., Zhang, Z., Zimbra, D., Chen, H., and Nunamaker Jr., J. F. “Detecting Fake Websites: The Contribution of Statistical Learning Theory,” MIS Quarterly, 34(3), 2010, pp. 435-461 (MISQ Best Paper Award for 2010).
  • Abbasi, A. and Chen, H. “A Comparison of Tools for Detecting Fake Websites,” IEEE Computer, 42(10), 2009, pp. 78-86.
  • Abbasi, A. and Chen, H. “A Comparison of Fraud Cues and Classification Methods for Fake Escrow Website Detection,” Information Technology and Management, 10(2), 2009, pp. 83-101.



The following are a few of the recent papers published using data sets collected for the Dark Web, Hacker Web, and other AI Lab projects (prior to the construction of the AZSecure-data demonstration site).

Have you written a paper using a data set provided here?  Please send us the citation for your published work and we will include it on the website.

Dark Web Forums

  • W. Li and H. Chen. Identifying Top Sellers In Underground Economy Using Deep Learning-based Sentiment Analysis. IEEE International Conference on Intelligence and Security Informatics, 2015. [Hacker Web, NSF SES-1314631 and DUE-1303362]
  • V. Benjamin, D. Zimbra, and H. Chen, “Bridging the Virtual and Real: The Relationships between Web Content, Linkage, and Geographical Proximity of Social Movements,” Journal of the American Society for Information Science and Technology, forthcoming, 2014.  [Dark Web and GeoWeb, DTRA HDTRA-09-0058]
  • Y. Zhang, Y. Dang, and H. Chen, Research Note: Examining Gender Emotional Differences in Web Forum Communication. Decision Support Systems, 55(3), 2013. [Dark Web, NSF CNS-0709338]
  • T. Fu, A. Abbasi, D. Zeng, and H. Chen, Sentimental Spidering: Leveraging Opinion Information in Focused Crawlers. ACM Transactions on Information Systems, 30(4), 2012. [Dark Web project, DTRA HDTRA-09-0058, and NSF CNS-0709338, CBET-0730908, IIS-1236970]
  • A. Abbasi and H. Chen, “CyberGate: A System and Design for Text Analysis of Computer Mediated Communications,” MIS Quarterly, 32(4), pp. 811-837, December 2008.
  • A. Abbasi, H. Chen, S. Thoms, and T. J. Fu, “Affect Analysis of Web Forums and Blogs using Correlation Ensembles,” IEEE Transactions on Knowledge and Data Engineering, 20(8), pp. 1168-1180, September 2008.
  • J. Qin, Y. Zhou, E. Reid, G. Lai, and H. Chen, “Analyzing Terror Campaign on the Internet: Technical Sophistication, Content Richness, and Web Interactivity,” International Journal of Human-Computer Studies, special issue on Information Security in the Knowledge Economy, v. 65, pp. 71-84, 2007.

Other Work and Papers by the Artificial Intelligence Lab

Citations to all papers by the Artificial Intelligence Lab can be found on the AI Lab website at

Here are links to other related research and projects by the Artificial Intelligence Lab:

Papers reusing and citing data hosted by the portal (by publication date)

  • Baig, Shahbaz S., and Kishor P. Wagh. "User dominance measure in online community Forum." Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), 2016 2nd International Conference on. IEEE, 2016.

  • Alsadhan, N., and David B. Skillicorn. "Discovering structure in Islamist postings using systemic nets." Intelligence and Security Informatics (ISI), 2016 IEEE Conference on. IEEE, 2016.

  • A. J. Park, B. Beck, D. Fletche, P. Lam and H. H. Tsang, "Temporal analysis of radical dark web forum users," 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, 2016, pp. 880-883.

  • Park, Andrew J., Ruhi Naaz Quadari, and Herbert H. Tsang. "Phishing website detection framework through web scraping and data mining." Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2017 8th IEEE Annual. IEEE, 2017.

  • Hansen, Joachim. The study of keyword search in open source search engines and digital forensics tools with respect to the needs of cyber crime investigations. MS thesis. NTNU, 2017.

  • Deliu, Isuf. Extracting Cyber Threat Intelligence From Hacker Forums. MS thesis. NTNU, 2017.

  • I. Deliu, C. Leichter and K. Franke, "Extracting cyber threat intelligence from hacker forums: Support vector machines versus convolutional neural networks," 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, 2017, pp. 3648-3656.

  • Etudo, Ugochukwu. Automatically Detecting the Resonance of Terrorist Movement Frames on the Web. Virginia Commonwealth University, 2017.

  • Bhattacharjee, S. D., Talukder, A., Al-Shaer, E., & Doshi, P. (2017, July). Prioritized active learning for malicious URL detection using weighted text-based features. In Intelligence and Security Informatics (ISI), 2017 IEEE International Conference on (pp. 107-112). IEEE.

  • Biswas, Baidyanath, Arunabha Mukhopadhyay, and Gaurav Gupta. "'Leadership in Action: How Top Hackers Behave' A Big-Data Approach with Text-Mining and Sentiment Analysis." Proceedings of the 51st Hawaii International Conference on System Sciences. 2018.

  • Roy, Abhishek, et al. "Game Theoretic Characterization of Collusive Behavior among Attackers." IEEE INFOCOM, 2018.

  • Pastrana, Sergio, et al. "CrimeBB: Enabling Cybercrime Research on Underground Forums at Scale." (2018).

  • Thakur, Kutub, Juan Shan, and Al-Sakib Khan Pathan. "Innovations of Phishing Defense: The Mechanism, Measurement and Defense Strategies." International Journal of Communication Networks and Information Security (IJCNIS) 10.1 (2018).

Data Infrastructure Building Blocks for ISI. A Project of the University of Arizona (NSF #ACI-1443019), Drexel University, University of Virginia, University of Texas at Dallas, and University of Utah