Hi! I am Sihat Afnan, a PhD researcher in the CSE Department of the University of California, Irvine. I graduated from the Department of Computer Science and Engineering at Bangladesh University of Engineering and Technology (BUET).
I enjoy outdoor activities (e.g., jogging and travelling), reading non-fiction and op-eds, and listening to qawwali music.
My research investigates security and privacy issues in the perception modules of emerging systems, including AR/VR headsets, autonomous vehicles, and embodied AI such as humanoid robotics. I study how the cameras, depth sensors, microphones, and motion data these systems rely on can be manipulated by adversaries or leak sensitive information about their users, and I build measurement frameworks and defenses that expose these risks before deployment.
PhD in Computer Science (2024 - Present)
University of California, Irvine
Bachelor of Science in Computer Science (2018 - 2023)
Bangladesh University of Engineering and Technology
Conducting research on security and privacy of perception modules in emerging systems, including AR/VR, autonomous vehicles, and embodied AI.
Assisted in teaching CS 130 (Introduction to Computer Security), ICS 33 (Intermediate Programming with Python), and ICS H32 (Python Programming with Libraries - Accelerated).
Taught courses on Computer Architecture, Microprocessors, Operating Systems, and Discrete Mathematics.
Deployed mathematical models, built machine learning tools, and devised financial engineering solutions.
Showed that human-to-humanoid motion retargeting, which projects operator demonstrations onto a shared robot skeleton and discards body shape, fails to anonymize the operator: while it normalizes body proportions, it preserves movement dynamics shaped by the operator's physiology. Demonstrated that retargeted trajectories support accurate gender classification, operator reidentification, and age/height regression even for unseen operators, with signals that are task-invariant and consistent across retargeting implementations. Introduced UNVEIL, a skeleton-aware spatiotemporal graph network to measure and interpret this effect, raising a privacy concern for the robotics community: as teleoperation datasets are increasingly shared and scaled, retargeted trajectories can act as a biometric fingerprint exposing sensitive operator attributes.
Project Page: project-unveil.github.io
Status: Under review at NeurIPS 2026
Investigated the security of XR spatial understanding pipelines by designing the first on-device acoustic attack that uses only a headset's built-in speakers to subtly manipulate 3D scene reconstruction. Modeled how injected acoustic interference perturbs camera odometry, RGB imaging, and depth sensing, and developed an optimization framework that generates perturbations causing controlled geometric distortions in the spatial map. Demonstrated practical effects, including object addition and removal, surface misclassification, and degraded user task performance, across Meta Quest 3S, Apple iPad, and ARIA glasses, with real-world experiments on Quest 3S showing corruption in over 91% of spatial maps.
Status: Under review at MobiCom 2026
Developing a high-fidelity VR simulation framework using CARLA to study how human drivers perceive and react to autonomous vehicle misbehavior caused by sensor-level perception attacks such as stop sign manipulation, lane detection failures, and phantom obstacles. The project integrates a realistic AV stack, including sensor simulation, machine-learning-based perception, planning, and control, together with real-time eye tracking and behavioral logging. This work addresses key challenges in synchronizing multi-modal sensor data, attack injection, and human–autonomy interaction in VR, enabling systematic evaluation of AV safety under adversarial conditions.
Status: Under review at IEEE S&P 2027
LogShield is a framework that detects APT attack patterns using self-attention in transformers. We incorporate customized embedding layers to capture the context of event sequences derived from provenance graphs. Although training transformer networks incurs computational overhead, our framework outperforms existing LSTM and language models in APT detection. We adopted the model parameters and training procedure of the RoBERTa model and conducted extensive experiments on well-known APT datasets (DARPA OpTC and DARPA TC E3). Our framework achieved F1 scores of 98% and 95% on the two datasets respectively, surpassing the F1 scores of 96% and 94% obtained by LSTM models. Our findings suggest that LogShield's performance benefits from larger datasets and demonstrate its potential for generalization across diverse domains.
Status: arXiv