An Agent-Based Approach to Healthcare Modelling with Big Data and AI

Supervisor: Charles Rahal, Associate Professor in Data Science and Informatics

Unit: Demographic Science Unit

Background

Agent-Based Models (ABMs) have emerged as a powerful computational tool for representing the diverse behaviours of individuals to explore complex social phenomena. Their popularity has surged in recent years, largely due to their ability to simulate distinct, flexible, and autonomous agents interacting with one another, often in pursuit of maximizing their objectives [1]. Parallel to this growth in ABM use is the rise of Large Language Models (LLMs), which are increasingly used to enhance research design. LLMs, when integrated into ABMs, provide agents with more sophisticated artificial intelligence, enabling them to make realistic decisions (see [2] for a recent review of LLMs in ABMs). LLM-powered agents have exhibited advanced capabilities in reasoning, planning, and decision-making.

ABMs are also adept at modelling real-world economic systems in general equilibrium with multi-period market dynamics. They capture these dynamics both through rule-based decision-making (e.g., [3]) and through more complex LLM-driven approaches (e.g., [4]; see also [5] for a broad review of ABM applications in economics). In healthcare, innovative works like AgentHospital [6] have integrated LLMs into a hospital simulation, where AI-powered doctor agents diagnose and treat thousands of patients over just a few days. Additionally, significant research has explored the incorporation of 'Big Data' into ABMs for enhanced learning and decision-making (see [7] for a recent general discussion).

This project aims to integrate open-source LLMs (such as Llama 3.1 70B) and 'Big Data' sources (including electronic health records and NHS spending data [8]) into an ABM designed to analyse healthcare demand and supply dynamics in a highly modular way. While the project's structure remains flexible, with modules subject to the interests of prospective candidates, the core objective is to create a framework capable of analysing any one of a large number of potential policy changes. For instance, reallocating public healthcare funding or adjusting resident doctor compensation during periods of high inflation could be dynamically explored.
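As a purely illustrative sketch of what such a modular structure could look like (not the project's actual design), the Python example below sets up a toy demand and supply loop and compares a baseline budget allocation with a reallocated one. Every name and number in it (Patient, Provider, simulate, run_scenario, the budgets, costs, and care-seeking probability) is a hypothetical assumption, and the rule-based seeks_care method simply stands in for a decision module that an LLM could drive.

```python
import random
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical patient agent; `need` is the per-period chance of seeking care."""
    need: float

    def seeks_care(self, rng: random.Random) -> bool:
        # Rule-based placeholder: an LLM-driven module could replace this decision.
        return rng.random() < self.need


@dataclass
class Provider:
    """Hypothetical provider agent whose capacity depends on its budget."""
    budget: float
    cost_per_treatment: float

    def capacity(self) -> int:
        # Number of treatments the budget can fund in one period.
        return int(self.budget // self.cost_per_treatment)


def simulate(patients, providers, periods, seed=0):
    """Return (treated, unmet) totals summed over all periods."""
    rng = random.Random(seed)
    treated = unmet = 0
    for _ in range(periods):
        demand = sum(p.seeks_care(rng) for p in patients)
        supply = sum(pr.capacity() for pr in providers)  # pooled supply (assumption)
        treated += min(demand, supply)
        unmet += max(demand - supply, 0)
    return treated, unmet


def run_scenario(budgets, costs=(100.0, 200.0), seed=0):
    """Run 12 periods with 1,000 patients and one provider per budget entry."""
    patients = [Patient(need=0.3) for _ in range(1000)]
    providers = [Provider(b, c) for b, c in zip(budgets, costs)]
    return simulate(patients, providers, periods=12, seed=seed)


# Same total budget in both scenarios; only the allocation differs.
baseline = run_scenario([10_000.0, 20_000.0])
reallocated = run_scenario([20_000.0, 10_000.0])
print("baseline (treated, unmet):", baseline)
print("reallocated (treated, unmet):", reallocated)
```

Because both scenarios share a seed, they face the same sequence of demand, so any difference in unmet need comes from the allocation alone. A fuller model would replace each toy component with a data-driven or LLM-driven module.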

[1] Macal, C. M., & North, M. J. (2005, December). Tutorial on agent-based modeling and simulation. In Proceedings of the Winter Simulation Conference, 2005 (14 pp.). IEEE.

[2] Gao, C., Lan, X., Li, N., Yuan, Y., Ding, J., Zhou, Z., ... & Li, Y. (2024). Large language models empowered agent-based modeling and simulation: A survey and perspectives. Humanities and Social Sciences Communications, 11(1), 1-24.

[3] Lengnick, M. (2013). Agent-based macroeconomics: A baseline model. Journal of Economic Behavior & Organization, 86, 102-120.

[4] Li, N., Gao, C., Li, M., Li, Y., & Liao, Q. (2024, August). Econagent: large language model-empowered agents for simulating macroeconomic activities. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 15523-15536).

[5] Axtell, R. L., & Farmer, J. D. (2022). Agent-based modeling in economics and finance: Past, present, and future. Journal of Economic Literature, 1-101.

[6] Li, J., Wang, S., Zhang, M., Li, W., Lai, Y., Kang, X., ... & Liu, Y. (2024). Agent hospital: A simulacrum of hospital with evolvable medical agents. arXiv preprint arXiv:2405.02957.

[7] Farmer, J. D. (2024). Making Sense of Chaos: A Better Economics for a Better World. Yale University Press.

[8] Rahal, C., & Mohan, J. (2024). The role of the third sector in public health service provision: evidence from 25,338 heterogeneous procurement datasets. Journal of the Royal Statistical Society Series A: Statistics in Society, qnae092.

Methods and Training

The successful candidate will undertake the mandatory first-year sequence in the Healthcare Data Science Centre for Doctoral Training (HDS CDT). They should already have some familiarity with, or interest in, agent-based modelling before enrolling, and will be encouraged to pursue advanced training in this area during the programme. Auditing modules from auxiliary departments in the second through fourth years of the DPhil is optional but encouraged.

Candidate Background

We are looking for a candidate with a background in a computational (or at least heavily mathematical) subject such as engineering, computer science, economics, mathematics, statistics, data science, or informatics. They should be a competent Python programmer.

Application and Interview

Applicants should read the eight references above and may optionally contact the PI of the project (Charles Rahal) before applying. The PI will hold brief, informal 20-minute meetings with all interested and qualified applicants to discuss their proposals. As part of all applications to the HDS CDT (and relevant to this project), applicants should cite the LCDS202425 code in their statement of purpose. Formal online interviews will be held for admission into this position; the date of the interviews is to be confirmed, but is likely to be late January or early February 2025.