Hi, I'm Shahriar 👋

I am a joint Machine Learning and Public Policy Management Ph.D. student at the Machine Learning Department of the School of Computer Science and Heinz College of Information Systems and Public Policy at Carnegie Mellon University. I am very fortunate to be advised by Prof. George Chen (Heinz and MLD) and Prof. Jeremy Weiss (National Library of Medicine at NIH). I also work with Prof. Zachary Lipton as my MLD Mentor.

Research:

My research focuses on interpretable representation learning for temporal data, with applications in healthcare. I am specifically interested in developing machine learning methods that uncover temporal representations which enhance our understanding of patients' evolving health status and shed light on the underlying mechanisms over time. I am also interested in the intersection of machine learning and information systems management, and in how we can develop and use machine learning tools for high-stakes decision-making scenarios such as those prominent in healthcare management.

Research Interests: Representation Learning, Machine Learning for Healthcare, Multimodal Machine Learning, Decision Making, Reinforcement Learning, ML for Temporal Data, Interpretability


Pre-historic:

I earned two master's degrees from CMU: a Master of Science in Biomedical Engineering (thesis in neuromodulation) in 2020 and a Master of Science in Machine Learning in 2022. Prior to joining CMU, I graduated with high honours from the University of British Columbia with a Bachelor of Applied Science in Engineering Physics (with Electrical Engineering and Computer Science specialization) in 2018. At UBC, I conducted research on automated pathology and worked on GPU-accelerated photoacoustic tomography at the Robotics and Control Laboratory under the supervision of Prof. Tim Salcudean.

Recent Research and Publications

T5-generated clinical-Language summaries for DeBERTa Report Analysis (TLDR) [Paper: SemEval-2024 at NAACL]

Abstract: This paper introduces novel methodologies for the Natural Language Inference for Clinical Trials (NLI4CT) task. We present TLDR (T5-generated clinical-Language summaries for DeBERTa Report Analysis) which incorporates T5-model generated premise summaries for improved entailment and contradiction analysis in clinical NLI tasks. This approach overcomes the challenges posed by small context windows and lengthy premises, leading to a substantial improvement in Macro F1 scores: a 0.184 increase over truncated premises. Our comprehensive experimental evaluation, including detailed error analysis and ablations, confirms the superiority of TLDR in achieving consistency and faithfulness in predictions against semantically altered inputs.

Temporal-Supervised Contrastive Learning: Modeling Patient Risk Progression [Paper: ML4H] [Paper: AAAI - R2HCAI Workshop]

Abstract: We consider the problem of predicting how the likelihood of an outcome of interest for a patient changes over time as we observe more of the patient’s data. To solve this problem, we propose a supervised contrastive learning framework that learns an embedding representation for each time step of a patient time series. Our framework learns the embedding space to have the following properties: (1) nearby points in the embedding space have similar predicted class probabilities, (2) adjacent time steps of the same time series map to nearby points in the embedding space, and (3) time steps with very different raw feature vectors map to far apart regions of the embedding space. To achieve property (3), we employ a nearest neighbor pairing mechanism in the raw feature space. This mechanism also serves as an alternative to “data augmentation”, a key ingredient of contrastive learning, which lacks a standard procedure that is adequately realistic for clinical tabular data, to our knowledge. We demonstrate that our approach outperforms state-of-the-art baselines in predicting mortality of septic patients (MIMIC-III dataset) and tracking progression of cognitive impairment (ADNI dataset). Our method also consistently recovers the correct synthetic dataset embedding structure across experiments, a feat not achieved by baselines. Our ablation experiments show the pivotal role of our nearest neighbor pairing.
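As a rough illustration of the nearest neighbor pairing idea mentioned in the abstract above, the sketch below pairs each time step with its closest other time step in the raw feature space. This is only a minimal sketch and not the paper's implementation: the function name `nearest_neighbor_pairs`, the choice of Euclidean distance, and the toy data are all assumptions made for illustration.

```python
import math


def nearest_neighbor_pairs(features):
    """Return, for each row, the index of its nearest *other* row.

    `features` is a list of equal-length numeric vectors, one per time step.
    Euclidean distance is an illustrative assumption, not taken from the paper.
    """
    pairs = []
    for i, x in enumerate(features):
        best_j, best_d = None, math.inf
        for j, y in enumerate(features):
            if i == j:
                continue  # a point is never paired with itself
            d = math.dist(x, y)
            if d < best_d:
                best_j, best_d = j, d
        pairs.append(best_j)
    return pairs


# Hypothetical toy data: two tight clusters in a 2-D raw feature space.
toy = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
pairs = nearest_neighbor_pairs(toy)
print(pairs)
```

In a contrastive setup, each (index, neighbor) pair could play the role that an augmented view plays in standard contrastive learning; how pairs are actually selected and used in training is described in the paper itself.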

Contrastive Learning Based Interpretable Hospital Discharge Delay Prediction

Abstract: We addressed the significant challenge of delays in patient discharge across hospitals. Over an 11-month period, more than 63% of discharges at four UPMC hospitals were delayed, leading to costs of an estimated $6.6 million in the sampled hospital units. These delays adversely affect patient experience and health outcomes, exacerbated by issues like the lack of post-discharge patient transportation and ineffective capacity management in the health system. Throughout the CMLH fellowship, we aimed to mitigate these issues by developing a discharge delay prediction module. This initiative was divided into two phases: (1) Length of Stay Prediction: Various regression models were benchmarked using prehospital data. Predicting longer lengths of stays posed challenges, mainly due to their infrequent occurrence in the dataset. (2) Predictability Analysis: Building on initial insights, the prediction task was refined based on length of stay percentiles, identifying patients with more predictable stays versus those harder to forecast. A key innovation in this study was the application of a contrastive learning approach. This methodology significantly outperformed traditional models, including Random Forest, XGBoost, Support Vector Machines, Logistic Regression, and Fully-Connected Neural Networks. By leveraging the contrastive learning paradigm, the study offers a robust solution to predict patient discharge times, providing valuable guidance for hospital management and optimizing patient flow.

Pre-trained CLIP Encoder for Embodied Instruction Following in ALFRED [Paper]

Abstract: We introduce a method employing pre-trained CLIP encoders to enhance model generalization in the ALFRED task. In contrast to previous literature where CLIP replaces the visual encoder, we suggest using CLIP as an additional module through an auxiliary object detection objective. We validate our method on the recently proposed Episodic Transformer architecture and demonstrate that incorporating CLIP improves task performance on the unseen validation set. Additionally, our analysis results support that CLIP especially helps with leveraging object descriptions, detecting small objects, and interpreting rare words.

Automatic Brain Pathology Analysis for Traumatic Brain Injury [Paper]

Abstract: Traumatic brain injury (TBI) is one of the leading causes of death and disability worldwide. Detailed studies of the microglial response after TBI require high throughput quantification of changes in microglial count and morphology in histological sections throughout the brain. In this paper, we present a fully automated end-to-end system that is capable of assessing microglial activation in white matter regions on whole slide images of Iba1 stained sections. Our approach involves the division of the full brain slides into smaller image patches that are subsequently automatically classified into white and grey matter sections. On the patches classified as white matter, we jointly apply functional minimization methods and deep learning classification to identify Iba1-immunopositive microglia. Detected cells are then automatically traced to preserve their complex branching structure after which fractal analysis is applied to determine the activation states of the cells. The resulting system detects white matter regions with 84% accuracy, detects microglia with a performance level of 0.70 (F1 score, the harmonic mean of precision and sensitivity) and performs binary microglia morphology classification with a 70% accuracy. This automated pipeline performs these analyses at a 20-fold increase in speed when compared to a human pathologist. Moreover, we have demonstrated robustness to variations in stain intensity common for Iba1 immunostaining. A preliminary analysis was conducted that indicated that this pipeline can identify differences in microglia response due to TBI. An automated solution to microglia cell analysis can greatly increase standardized analysis of brain slides, allowing pathologists and neuroscientists to focus on characterizing the associated underlying diseases and injuries.
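For readers unfamiliar with the metric named in the abstract above, the F1 score is the harmonic mean of precision and sensitivity. The short sketch below shows the arithmetic; the function name `f1_score` and the example inputs are hypothetical and are not values from the paper.

```python
def f1_score(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Hypothetical inputs, chosen only to show that F1 sits below the
# arithmetic mean (0.75) when the two quantities differ.
example = f1_score(0.5, 1.0)
print(round(example, 4))
```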

Publications

2024 TLDR at SemEval-2024 Task 2: T5-generated clinical-Language summaries for DeBERTa Report Analysis [Code], Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) at Association for Computational Linguistics (NAACL) - S. Das *, V. Samuel *, S. Noroozizadeh *
2023 Temporal Supervised Contrastive Learning for Modeling Patient Risk Progression [Code], Machine Learning for Health (ML4H) 2023 - S. Noroozizadeh, J. Weiss, G. Chen
2023 Temporal Supervised Contrastive Learning with Applications to Tabular Time Series Data, AAAI 2023 R2HCAI Workshop - S. Noroozizadeh, J. Weiss, G. Chen
2022 ET tu, CLIP? Addressing Common Object Errors for Unseen Environments, CVPR 2022 Embodied-AI Workshop - Y.W. Byon *, C. Jiao *, S. Noroozizadeh *, J. Sun *, R. Vitiello *
2019 An end-to-end system for automatic characterization of iba1 immunopositive microglia in whole slide imaging, Neuroinformatics Journal - A.D. Kyriazis *, S. Noroozizadeh *, A. Refaee *, W. Choi *, L.T. Chu *, A. Bashir, W.H. Cheng, R. Zhao, D.R. Namjoshi, S.E. Salcudean, C.L. Wellington, G. Nir

Honours and Awards

Teaching Experience

  • [2023] Machine Learning for Problem Solving, CMU
  • [2023, 2022] Unstructured Data Analytics, CMU
  • [2023, 2022] PhD Microeconomics, CMU
  • [2020] Neural Signal Processing, CMU
  • [2020] Fundamentals of Computational BME, CMU
  • [2017, 2016] Algorithms and Data Structures, UBC
  • [2014] Computer Science Fundamentals, UBC

Services

  • [2023] Reviewer, ICLR
  • [2023] Reviewer, NeurIPS
  • [2023] Reviewer, AAAI

Work Experience

Sanofi (Cambridge, MA)

A.I. Research Scientist Intern [2024]:
• Co-led the development of the mRNA-LM model, a language model built from scratch and pretrained on millions of full-length mRNA sequences, achieving state-of-the-art performance on various mRNA prediction tasks, including structure prediction, localization, and translation efficiency.
• Designed and implemented a contrastive learning-based multimodal joint representation inspired by CLIP, which enhanced the alignment of embeddings from different mRNA regions (5' UTR, CDS, and 3' UTR) and significantly improved the downstream predictive performance of the full-length mRNA language model.
• Spearheaded the submission of a journal paper (under review) and supported the filing of a patent for the mRNA-LM project, showcasing innovative methodologies and findings.
• Contributed to the project Many-Shot In-Context Learning for Molecular Inverse Design, developing a semi-supervised learning method utilizing Large Language Models (LLMs) to improve molecular design and lead optimization. Implemented a multi-modal LLM framework for interactive molecular structure modification using text instructions.
• Collaborated on integrating Large Language Models (LLMs) into the Bayesian Optimization framework to guide optimization directions for reaction yield in drug discovery, achieving superior performance compared to human experts in selecting optimal reactions and refining design pipelines.

Microsoft

Software Development Engineering Intern [2015]: Focus areas during this internship included the Windows 10 Universal Application Platform (UAP), the Windows 10 NFL application, development of a Key Performance Indicator (KPI) system, mocking framework development, coded user interface (UI) automation, and build machine automation.

Philips Healthcare Research

Research and Development Intern [2016]: Developed an electronic nose sensor capable of selectively and sensitively detecting biomarkers in exhaled breath to improve the emergency diagnosis of lung infections in patients with respiratory diseases, including Acute Respiratory Distress Syndrome (ARDS). I designed a standalone signal processing algorithm and application tailored for gas chromatography data, which effectively isolated the presence of octane, a critical biomarker of ARDS in exhaled breath. This application was instrumental in enhancing the accuracy and reliability of the electronic nose sensor, especially given the challenges of non-real-time data processing.

Past Research and Project Experiences

Many-Shot In-Context Learning for Molecular Inverse Design

“BERT, do you still love me?” A painful perspective from CRF

Model-Based Reinforcement Learning with Probabilistic Ensemble and Trajectory Sampling

Semi-Supervised Support Vector Machine (S3VM)

Transcranial Focused Ultrasound Stimulation (tFUS)

A GPU-Accelerated Inversion Algorithm for Photoacoustic Tomography

Pre-clustering RNA sequences Database for Long-read de Novo Transcriptome Error Correction

Rescue-Bot: BatBot Rescuing Pets from Fire

📣 News

  • [Sep 2024] Awarded Tata Consultancy Services (TCS) Presidential Fellowship
  • [May 2024] Spending Summer 2024 as an AI Research Scientist Intern at Sanofi Inc.
  • [Apr 2024] Paper accepted to SemEval-2024 at NAACL! [Paper] [Code]
  • [Nov 2023] Paper accepted to Machine Learning for Health (ML4H) Conference! [Paper] [Code]
  • [Sep 2023] Awarded Natural Sciences and Engineering Research Council of Canada (NSERC) CGS-D/PGS-D Fellowship
  • [May 2023] Awarded best first paper award at Heinz
  • [Feb 2023] Oral Presentation at AAAI'23 Representation Learning for Responsible Human-Centric AI [Paper] [Video]
  • [Sep 2022] Awarded Fellowship in Digital Health Innovation from Center for Machine Learning and Health (CMLH) at CMU
  • [Jun 2022] Poster at CVPR'22 Embodied AI Workshop [Paper]
  • [May 2022] Graduated from Machine Learning Master's at CMU!
  • [Sep 2021] Started a joint PhD at Heinz College and Machine Learning Department at CMU!
  • [Dec 2020] Graduated from Biomedical Engineering Master's at CMU! [Thesis]
  • [Jan 2019] Paper accepted to Neuroinformatics Journal! (AShLAW 🎉) [Paper]
  • [Sep 2018] Awarded CMU Presidential Fellowship from College of Engineering
  • [May 2018] Graduated from Engineering Physics with EECS Specialization at UBC!