Dr. Long's research combines novel statistical and ML/AI research with impactful biomedical research, each of which reinforces the other. Its thrust is to develop robust statistical and machine learning methods for advancing precision medicine. Specifically, he has developed methods for analysis of big health data (-omics, EHRs, and mHealth data), missing data, causal inference, data privacy, algorithmic fairness, Bayesian methods, and clinical trials. Dr. Long's methodological research has been supported by the National Institutes of Health (NIH), the Patient-Centered Outcomes Research Institute (PCORI), the National Science Foundation (NSF), and the Advanced Research Projects Agency for Health (ARPA-H).
Dr. Long has directed the Statistical and Data Coordinating Center for national research networks and large-scale multi-site clinical studies, supervising a team of database administrators and programmers, application developers and statistical analysts. He currently co-directs (with Dr. Nicola Mason at Penn Vet) the Coordinating Center for the Premedical Cancer Immunotherapy Network for Canine Trials (PRECINCT), part of NCI's Cancer Moonshot Initiative.
Dr. Long is the founding Director of the Center for Cancer Data Science, and Associate Director for Cancer Informatics of the Penn Institute for Biomedical Informatics. He also directs the Biostatistics and Bioinformatics Core in the Abramson Cancer Center at the University of Pennsylvania.
Dr. Long is an elected fellow of the American Association for the Advancement of Science (AAAS), elected fellow of the American Statistical Association (ASA), and elected member of the International Statistical Institute (ISI).
Selected Publications
Orcutt, X., Chen, K., Mamtani, R., Long, Q.# and Parikh, R.B.#: Evaluating Generalizability of Results from Landmark Randomized Controlled Trials in Oncology To Real-World Patients using Machine Learning-Based Emulated Trials. Nature Medicine, https://doi.org/10.1038/s41591-024-03352-5, 2025. Notes: #joint senior/corresponding authors.
Li, X., Ruan, F., Wang, H., Long, Q.# and Su, W.J.#: A Statistical Framework of Watermarks for Large Language Models: Pivot, Detection Efficiency and Optimal Rules. Annals of Statistics, in press, 2025. Notes: #joint senior/corresponding authors.
Chang C, Jang A, Manatunga A, Taylor A.T., Long, Q.: A Bayesian Latent Class Model to Predict Kidney Obstruction Based on Renography and Expert Ratings in the Absence of Gold Standard. Journal of the American Statistical Association 115(532): 1645-1663, 2020.
Wu, Y., Keoliya, M., Chen, K., Velingker, N., Li, Z., Getzen, E., Long, Q., Naik, M., Parikh, R. and Wong, E.: DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation. The Forty-First International Conference on Machine Learning (ICML 2024), Spotlight (3.5% acceptance rate), 2024.
Zhou, Z., Ataee Tarzanagh, D., Hou, B., Tong, B., Xu, J., Feng, Y., Long, Q.# and Shen, L.#: Fair Canonical Correlation Analysis. 2023 Conference on Neural Information Processing Systems (NeurIPS 2023), 2023. Notes: #joint senior/corresponding authors.
Getzen, E.J., Ungar, L., Mowery, D., Jiang, X., and Long, Q.: Mining for Equitable Health: Assessing the Impact of Missing Data in Electronic Health Records. Journal of Biomedical Informatics 139: 104269, 2023.
Zhang Y., Long, Q.: Assessing Fairness in the Presence of Missing Data. 2021 Conference on Neural Information Processing Systems (NeurIPS 2021) 34: 16007-16019, 2021.
Fang, C., He, H., Long, Q., Su, W.: Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training. Proceedings of the National Academy of Sciences (PNAS) 118(43): e2103091118, 2021.
Chang, C., Deng, Y., Jiang, X., Long, Q.: Multiple Imputation for Analysis of Incomplete Data in Distributed Health Data Networks. Nature Communications 11(1): 5467, 2020.
Bu, Z., Dong, J., Long, Q., Su, W.: Deep Learning with Gaussian Differential Privacy. Harvard Data Science Review 2(3): 1-48, 2020.
Zheng, Q., Dong, J., Long, Q., Su, W.: Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion. Proceedings of the 37th International Conference on Machine Learning (ICML 2020) 119: 11420-11435, 2020.
Zhao, Y., Chang, C., and Long, Q.: Knowledge-guided statistical learning methods for analysis of high-dimensional -omics data in precision oncology. JCO Precision Oncology 3: 1-9, 2019.
Min EJ, Safo SE, Long Q: Penalized co-inertia analysis with applications to -omics data. Bioinformatics 35(6): 1018-1025, 2019. Notes: doi: 10.1093/bioinformatics/bty726.
Li Z, Roberts K, Jiang X, Long Q: Distributed Learning from Multiple EHR Databases: Contextual Embedding Models for Medical Events. Journal of Biomedical Informatics 92: 103138, 2019. Notes: doi: 10.1016/j.jbi.2019.103138. Epub 2019 Feb 27.
Zhao, Y.*, Chung, M., Johnson, B.A., Moreno, C.S., and Long, Q.: Hierarchical feature selection incorporating known and novel biological information: Identifying genomic features related to prostate cancer recurrence. Journal of the American Statistical Association 111(516): 1427-1439, 2016. Notes: *An earlier version won Yize Zhao the David P. Byar Travel Award from American Statistical Association's Biometrics Section 2014.
Chang C, Kundu S, Long Q: Scalable Bayesian variable selection for structured high-dimensional data. Biometrics 74(4): 1372-1382, 2018. Notes: doi: 10.1111/biom.12882. Epub 2018 May 8.
Safo, S.E., Li, S., and Long, Q.: Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information. Biometrics 74(1): 300-312, 2018.
Long, Q., Xu, J., Osunkoya, A.O., Sannigrahi, S., Johnson, B.A., Zhou, W., Gillespie, T., Park, J.Y., Nam, R.K., Sugar, L., Stanimirovic, A., Seth, A.K., Petros, J.A., and Moreno, C.S.: Global transcriptome analysis of formalin-fixed prostate cancer specimens identifies biomarkers of disease recurrence. Cancer Research 74(12): 3228-3237, 2014.
Long, Q., Little, R.J., and Lin, X.: Causal inference in hybrid intervention trials involving treatment choice. Journal of the American Statistical Association 103(482): 474-484, 2008.
Last updated: 01/23/2025
The Trustees of the University of Pennsylvania