Pingchuan Ma

I am a postdoctoral researcher in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology (HKUST), supervised by Prof. Shuai Wang, where I also received my Ph.D. Prior to this appointment, I was a visiting scholar at Sky Lab at UC Berkeley, hosted by Prof. Alvin Cheung. I have also worked as Chief AI Scientist at Myko, an AI startup. I received my B.Eng. from the Beijing Electronic Science and Technology Institute. My research interests lie in building intelligent, private and responsible data systems.

I am also the co-founder of CipherInsight Limited, a company I started with Prof. Shuai Wang, focusing on privacy-preserving computation with practical and developer-friendly cryptographic solutions.

I can be reached at pmaab at cse dot ust dot hk.

Publications

† denotes corresponding author.

Guardrail: Automated Integrity Constraint Synthesis From Noisy Data

Pingchuan Ma, Zhaoyu Wang, Zhenlan Ji, Zongjie Li, Ao Sun, and Shuai Wang

ACM SIGMOD International Conference on Management of Data (SIGMOD ’26) | Code

Bridging the Gap between Causal Inference and Software Engineering

Pingchuan Ma, Zhenlan Ji, Zongjie Li, and Shuai Wang

40th IEEE/ACM International Conference on Automated Software Engineering (ASE 2025 Tutorial)

Measuring and Augmenting Large Language Models for Solving Offensive Security Challenges

Zimo Ji, Daoyuan Wu, Wenyuan Jiang, Pingchuan Ma, Zongjie Li, and Shuai Wang

ACM Conference on Computer and Communications Security (CCS 2025)

Causality-Aided Evaluation and Explanation of Large Language Model-based Code Generation

Zhenlan Ji, Pingchuan Ma†, Zongjie Li, Zhaoyu Wang, and Shuai Wang†

ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2025) | arXiv

Reeq: Testing and Mitigating Ethically Inconsistent Suggestions of Large Language Models with Reflective Equilibrium

Pingchuan Ma, Zhaoyu Wang (equal contribution), Zongjie Li, Zhenlan Ji, Ao Sun, Juergen Rahmel, and Shuai Wang

ACM Transactions on Software Engineering and Methodology (TOSEM) | arXiv | Code

Algorithms, Applications, and Verification of Causal Structure Learning

Pingchuan Ma

PhD Thesis

SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner

Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, and Juergen Rahmel

The 34th USENIX Security Symposium (USENIX Security 2025) | arXiv | Code

Split and Merge: Aligning Position Biases in LLM-based Evaluators

Zongjie Li, Chaozheng Wang, Pingchuan Ma, Daoyuan Wu, Shuai Wang, Cuiyun Gao, and Yang Liu

The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 Main) | arXiv

Scalable Differentiable Causal Discovery in the Presence of Latent Confounders with Skeleton Posterior

Pingchuan Ma, Rui Ding, Qiang Fu, Jiaru Zhang, Shuai Wang, Shi Han, and Dongmei Zhang

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’24) | arXiv | Press (in Chinese) | Code

PP-CSA: Practical Privacy-Preserving Software Call Stack Analysis

Zhaoyu Wang, Pingchuan Ma†, Huaijin Wang, and Shuai Wang†

ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA ’24) | Code

Testing Graph Database Systems via Graph-Aware Metamorphic Relations

Zeyang Zhuang, Penghui Li, Pingchuan Ma, Wei Meng, and Shuai Wang

International Conference on Very Large Data Bases (VLDB ’24) | Code

Evaluating C/C++ Vulnerability Detectability of Query-Based Static Application Security Testing Tools

Zongjie Li, Zhibo Liu, Wai Kin Wong, Pingchuan Ma, and Shuai Wang

IEEE Transactions on Dependable and Secure Computing (TDSC)

On Extracting Specialized Code Abilities from Large Language Models: A Feasibility Study

Zongjie Li, Chaozheng Wang, Pingchuan Ma, Chaowei Liu, Shuai Wang, Daoyuan Wu, Cuiyun Gao, and Yang Liu

International Conference on Software Engineering (ICSE ’24) | arXiv | Press (in Chinese)

Enabling Runtime Verification of Causal Discovery Algorithms with Automated Conditional Independence Reasoning

Pingchuan Ma, Zhenlan Ji, Peisen Yao, Shuai Wang, and Kui Ren

International Conference on Software Engineering (ICSE ’24) | arXiv | Code

InsightPilot: An LLM-Empowered Automated Data Exploration System

Pingchuan Ma, Rui Ding, Shuai Wang, Shi Han, and Dongmei Zhang

The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP ’23 Demo Track) | arXiv

Explain Any Concept: Segment Anything Meets Concept-Based Explanation

Ao Sun, Pingchuan Ma, Yuanyuan Yuan, and Shuai Wang

Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS ’23) | arXiv | Press (in Chinese) | Code

Causality-Aided Trade-off Analysis for Machine Learning Fairness

Zhenlan Ji, Pingchuan Ma†, Shuai Wang†, and Yanhui Li

IEEE/ACM International Conference on Automated Software Engineering (ASE ’23) | arXiv | Code

PerfCE: Performance Debugging on Databases with Chaos Engineering-Enhanced Causality Analysis

Zhenlan Ji, Pingchuan Ma†, and Shuai Wang

IEEE/ACM International Conference on Automated Software Engineering (ASE ’23) | arXiv | Code

Towards Practical Federated Causal Structure Learning

Zhaoyu Wang, Pingchuan Ma†, and Shuai Wang

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD ’23) | arXiv | Code

XInsight: eXplainable Data Analysis Through The Lens of Causality

Pingchuan Ma, Rui Ding, Shuai Wang, Shi Han, and Dongmei Zhang

ACM SIGMOD International Conference on Management of Data (SIGMOD ’23) | arXiv | Press (in Chinese) | Code

CC: Causality-Aware Coverage Criterion for Deep Neural Networks

Zhenlan Ji, Pingchuan Ma†, Yuanyuan Yuan†, and Shuai Wang

International Conference on Software Engineering (ICSE ’23) | Code

sem2vec: Semantics-Aware Assembly Tracelet Embedding

Huaijin Wang, Pingchuan Ma, Shuai Wang, Qiyi Tang, Sen Nie, and Shi Wu

ACM Transactions on Software Engineering and Methodology (TOSEM) | Code

Deceiving Deep Neural Networks-Based Binary Code Matching with Adversarial Programs

Wai Kin Wong, Huaijin Wang, Pingchuan Ma, Shuai Wang, Mingyue Jiang, Tsong Yueh Chen, Qiyi Tang, Sen Nie, and Shi Wu

IEEE International Conference on Software Maintenance and Evolution (ICSME ’22) | Code

NoLeaks: Differentially Private Causal Discovery Under Functional Causal Model

Pingchuan Ma, Zhenlan Ji, Qi Pang, and Shuai Wang

IEEE Transactions on Information Forensics & Security (TIFS) | Code

ML4S: Learning Causal Skeleton from Vicinal Graphs

Pingchuan Ma, Rui Ding, Haoyue Dai, Yuanyuan Jiang, Shuai Wang, Shi Han, and Dongmei Zhang

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD ’22) | Code

Unlearnable Examples: Protecting Open-Source Software from Unauthorized Neural Code Learning

Zhenlan Ji, Pingchuan Ma, and Shuai Wang

International Conference on Software Engineering and Knowledge Engineering (SEKE ’22) | Code

NeuralD: Detecting Indistinguishability Violations of Oblivious RAM with Neural Distinguishers

Pingchuan Ma, Zhibo Liu, Yuanyuan Yuan, and Shuai Wang

IEEE Transactions on Information Forensics & Security (TIFS) | Code

Enhancing DNN-Based Binary Code Function Search With Low-Cost Equivalence Checking

Huaijin Wang, Pingchuan Ma, Yuanyuan Yuan, Zhibo Liu, Shuai Wang, Qiyi Tang, Sen Nie, and Shi Wu

IEEE Transactions on Software Engineering (TSE) | Code

Unleashing the Power of Compiler Intermediate Representation to Enhance Neural Program Embeddings

Zongjie Li, Pingchuan Ma, Huaijin Wang, Shuai Wang, Qiyi Tang, Sen Nie, and Shi Wu

International Conference on Software Engineering (ICSE ’22)

MT-Teql: Evaluating and Augmenting Neural NLIDB on Real-world Linguistic and Schema Variations

Pingchuan Ma and Shuai Wang

International Conference on Very Large Data Bases (VLDB ’22) | Code

MetaInsight: Automatic Discovery of Structured Knowledge for Exploratory Data Analysis

Pingchuan Ma, Rui Ding, Shi Han, and Dongmei Zhang

ACM SIGMOD International Conference on Management of Data (SIGMOD ’21) | Link | Extended Version | Reference Implementation

Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models

Pingchuan Ma, Shuai Wang, and Jin Liu

International Joint Conference on Artificial Intelligence (IJCAI ’20) | Code

Awards and Grants

Cybersecurity Project of the Year, Tech Fest Hong Kong Awards 2025
SENG PhD Research Excellence Award, HKUST, 2025
Hong Kong ITF TSSSU Grant (0.5M HKD), Hong Kong Innovation and Technology Commission, 2025
Hong Kong SciTech Pioneer Award Finalist (Future Innovation Scientist), 2024
Researcher Access Program, OpenAI, 2024
Overseas Research Award, HKUST Fok Ying Tung Graduate School, 2024
UGC Research Travel Grant, 2023-24 Academic Year
Future of Life Institute Travel Support, 2024
UGC Research Travel Grant, 2022-23 Academic Year
SIGMOD Student Travel Award, April 2023
AISTATS "Top Reviewer", February 2023
Microsoft Research Asia "Star of Tomorrow" Award, December 2022
Microsoft Research Asia "Star of Tomorrow" Award, March 2022
NVIDIA Academic Hardware Grant, March 2022

Experience

Visiting Scholar, Sky Lab, UC Berkeley, hosted by Prof. Alvin Cheung, 2024.4 - present
Research Intern, Microsoft Research Asia, mentored by Justin Ding, 2022.6 - 2022.11
Research Intern, Microsoft Research Asia, mentored by Justin Ding, 2021.6 - 2022.4
Research Intern, Microsoft Research Asia, mentored by Justin Ding, 2019.12 - 2020.6

Patents

Pingchuan Ma, Shuai Wang, Zhaoyu Wang. Method and System for Identifying Integrity Constraints from Noisy Data. US Provisional Patent. Filing Date: 9 Jun 2024
Pingchuan Ma, Zhaoyu Wang, Shuai Wang. Enhanced Distributed Zero-Knowledge Proof Generation. US Patent. Filing Date: 8 April 2024
Shuai Wang and Pingchuan Ma. 基于大语言模型的信息处理方法和装置. Chinese Patent. Filing Date: 6 March 2025
Pingchuan Ma, Zhaoyu Wang, Shuai Wang. New Approach for Distributed Zero-Knowledge Proof Generation for Data Analytics Workflow. US Provisional Patent. Filing Date: 1 Nov 2024
Pingchuan Ma, Zhaoyu Wang, Shuai Wang. Approach for Efficient Zero-Knowledge Causal Analysis. US Provisional Patent. Filing Date: 1 Nov 2024
Shuai Wang and Pingchuan Ma. New Approach to Detect and Fix Unethical Outputs of Large Language Models. US Provisional Patent. Filing Date: 28 April 2024

Talks

Algorithm, Application and Privacy Enhancement of Causal Structure Learning, Institute of Computing Technology of the Chinese Academy of Sciences, Dec 2024.
Privacy-enhancing Technologies for Causal Reasoning, HKUST (Guangzhou Campus), June 2024.
Elevating Exploratory Data Analysis in The Era of Large Language Model, Huawei, 15 Dec 2023.
Learning Causal Skeleton from Vicinal Graphs, Microsoft Research Asia, 5 Aug 2022.
Towards Dependable and Transparent Data Analytics Platforms, Microsoft Research Asia, 10 Mar 2022.
Automated Fairness Testing and Beyond, Microsoft Research Asia Causality Reading Group, 6 Sept 2021.
Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models (in Chinese), AI Time, 15 Jan 2021.

Academic Service

Conference Program Committee Member/Reviewer
- 2025: NeurIPS, ICML, LLM4Code, CHI, ICLR, AISTATS, ECML/PKDD, ACL Rolling Review
- 2024: NeurIPS, KDD, ECML/PKDD, ACL Rolling Review, SDM, NeurIPS Ethics Review, ACML
- 2023: KDD, AISTATS, NeurIPS Ethics Review, ECML/PKDD, FAccT, SIGMOD ARI, Queer in AI @ ACL, PETS Artifact Evaluation
- 2022: PETS Artifact Evaluation, ISSTA Artifact Evaluation, EuroSys Artifact Evaluation
Journal Reviewer
- Empirical Software Engineering, Harvard Data Science Review, Scientific Reports, Journal of Systems & Software (JSS), International Journal of Computer Vision (IJCV), IEEE Signal Processing Letters (SPL), Computational Statistics and Data Analysis

Teaching Experience

Teaching Assistant, COMP4901N: Competitive Programming in Cybersecurity (Fall 2021)
Teaching Assistant, COMP6613C: Topics in Computer Security and Privacy (Spring 2021)