Welcome!

    I am a fifth-year PhD student in the computer science department at Stanford University, working with Matei Zaharia and James Zou.
    I am broadly interested in machine learning and data systems. More recently, I have been particularly interested in understanding and leveraging foundation models in real-world applications, through the lens of data-driven insights.

    Email: lingjiao [at] [stanford] [dot] [edu]
    Twitter        GitHub        Google Scholar
    I am on the 2023-2024 job market now.

Technical Reports and Preprints
  • Are More LLM Calls All You Need? Towards Scaling Laws of Compound Inference Systems.
    Lingjiao Chen, Jared Quincy Davis, Boris Hanin, Peter Bailis, Ion Stoica, Matei Zaharia, James Zou.
    arXiv, 2024.
    [PDF] [Code and Data]
  • Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews.
    Weixin Liang, Zachary Izzo, Yaohui Zhang, Haley Lepp, Hancheng Cao, Xuandong Zhao, Lingjiao Chen, Haotian Ye, Sheng Liu, Zhi Huang, Daniel McFarland, James Zou.
    arXiv, 2024.
    [PDF] [Code and Data]
  • FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance.
    Lingjiao Chen, Matei Zaharia, James Zou.
    arXiv, 2023.
    [PDF] [Code and Data]
  • Data Acquisition: A New Frontier in Data-centric AI.
    Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou.
    arXiv, 2023.
    [PDF] [Code and Data]
  • Solon: Communication-efficient Byzantine-resilient Distributed Training via Redundant Gradients.
    Lingjiao Chen, Leshang Chen, Hongyi Wang, Susan Davidson, Edgar Dobriban.
    arXiv, 2021.
    [PDF]
Conference and Workshop Publications
  • How is ChatGPT's behavior changing over time?
    Lingjiao Chen, Matei Zaharia, James Zou.
    Harvard Data Science Review, 2024.
    [PDF] [Code and Data]
  • Analyzing ChatGPT’s Behavior Shifts Over Time.
    Lingjiao Chen, Matei Zaharia, James Zou.
    NeurIPS Conference on Neural Information Processing Systems R0-FoMo Workshop, 2023.
    [PDF]
  • DataPerf: Benchmarks for Data-centric AI Development.
    The DataPerf team.
    NeurIPS Conference on Neural Information Processing Systems, 2023.
    [PDF] [Website]
  • HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions.
    Lingjiao Chen, Zhihua Jin, Sabri Eyuboglu, Christopher Ré, Matei Zaharia, James Zou.
    NeurIPS Conference on Neural Information Processing Systems, 2022.
    [PDF] [Website]
  • Estimating and Explaining Model Performance When Both Covariates and Labels Shift.
    Lingjiao Chen, Matei Zaharia, James Zou.
    NeurIPS Conference on Neural Information Processing Systems, 2022.
    [PDF]
  • Efficient Online ML API Selection for Multi-Label Classification Tasks.
    Lingjiao Chen, Matei Zaharia, James Zou.
    ICML International Conference on Machine Learning, 2022.
    [PDF]
  • How Did the Model Change? Efficiently Assessing Machine Learning API Shifts.
    Lingjiao Chen, Matei Zaharia, James Zou.
    ICLR International Conference on Learning Representations, 2022.
    [PDF]
  • SEAL: Interactive Tool for Semantic Error Analysis and Labeling.
    Nazneen Rajani, Weixin Liang, Lingjiao Chen, Meg Mitchell, James Zou.
    EMNLP Conference on Empirical Methods in Natural Language Processing, 2022.
    [PDF]
  • ML API Shift Assessments: Change is Coming!
    Lingjiao Chen, Matei Zaharia, James Zou.
    ICML International Conference on Machine Learning SRML Workshop, 2021 (Oral).
    [PDF]
  • Have the Cake and Eat It Too? Higher Accuracy and Less Expense when Using Multi-label ML APIs Online.
    Lingjiao Chen, Matei Zaharia, James Zou.
    ICML International Conference on Machine Learning DMMLSYS Workshop, 2021.
    [PDF]
  • SOLON: Communication-efficient Byzantine-resilient Distributed Training via Redundant Gradients.
    Lingjiao Chen, Leshang Chen, Hongyi Wang, Susan Davidson, Edgar Dobriban.
    ISCA International Symposium on Computer Architecture SPSL Workshop, 2021.
    [PDF]
  • FrugalML: How to Use ML Prediction APIs More Accurately and Cheaply.
    Lingjiao Chen, Matei Zaharia, James Zou.
    NeurIPS Conference on Neural Information Processing Systems, 2020 (Oral).
    [PDF]
  • To Call or not to Call? Using ML Prediction APIs more Accurately and Economically.
    Lingjiao Chen, Matei Zaharia, James Zou.
    ICML International Conference on Machine Learning EcoPaDL Workshop, 2020.
    [PDF]
  • Towards Model-based Pricing for Machine Learning in a Data Marketplace.
    Lingjiao Chen, Paraschos Koutris, Arun Kumar.
    ACM SIGMOD International Conference on Management of Data, 2019.
    [PDF] [Technical Report]
  • Demonstration of Nimbus: Model-based Pricing for Machine Learning in a Data Marketplace.
    Lingjiao Chen, Hongyi Wang, Leshang Chen, Paraschos Koutris, Arun Kumar.
    ACM SIGMOD International Conference on Management of Data, 2019.
    [PDF] [Code and Data]
  • Enabling and Optimizing Non-linear Feature Interactions in Factorized Linear Algebra.
    Side Li, Lingjiao Chen, Arun Kumar.
    ACM SIGMOD International Conference on Management of Data, 2019.
    [PDF] [Code and Data]
  • Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent.
    Fengan Li, Lingjiao Chen, Arun Kumar, Jeffrey F. Naughton, Jignesh M. Patel, Xi Wu.
    ACM SIGMOD International Conference on Management of Data, 2019.
    [PDF] [Technical Report] [Code and Data]
  • The Effect of Network Width on the Performance of Large-batch Training.
    Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris.
    NIPS Conference on Neural Information Processing Systems, 2018.
    [PDF] [Technical Report]
  • DRACO: Byzantine-resilient Distributed Training via Redundant Gradients.
    Lingjiao Chen, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos.
    ICML International Conference on Machine Learning, 2018.
    [PDF] [Technical Report]
  • Reinforcing Adversarial Robustness using Model Confidence Induced by Adversarial Training.
    Xi Wu, Uyeong Jang, Jiefeng Chen, Lingjiao Chen, Somesh Jha.
    ICML International Conference on Machine Learning, 2018.
    [PDF] [Technical Report]
  • Draco: Robust Distributed Training against Adversaries.
    Lingjiao Chen, Hongyi Wang, Dimitris Papailiopoulos.
    SysML, 2018.
    [PDF]
  • Accelerating Linear Algebra over Normalized Data.
    Lingjiao Chen.
    ACM SIGMOD International Conference on Management of Data Student Research Competition, 2017.
    [PDF] Second Runner-up Award Winner
  • Model-based Pricing: Do Not Pay for More than What You Learn!
    Lingjiao Chen, Paraschos Koutris, Arun Kumar.
    ACM SIGMOD International Conference on Management of Data DEEM Workshop, 2017.
    [PDF]
Journal Publications
  • Towards Linear Algebra over Normalized Data.
    Lingjiao Chen, Arun Kumar, Jeffrey F. Naughton, Jignesh M. Patel.
    Proceedings of the VLDB Endowment Volume 10 Issue 11, 2017.
    [PDF] [Technical Report] [Code and Data]
  • Distributed User-centric Scheduling for Visible Light Communication Networks.
    Lingjiao Chen, Jiaheng Wang, Jiantao Zhou, Derrick Wing Kwan Ng, Robert Schober, Chunming Zhao.
    Optics Express Volume 24 Issue 14, 2016.
    [PDF]