I am a Computer Vision Researcher and Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd, focusing on multimodal clustering, physics-guided image clustering, anomaly detection in satellite images, multimodal large language models, agentic applications, zero-shot learning, automated tool generation for LLM agents, and risk prediction in AI-generated code.
Previously, at Apurba Technologies, I specialized in Optical Character Recognition (OCR), developing solutions for text recognition, document layout analysis, and post-OCR corrections using LLMs. My work also included leveraging GANs for image enhancement and building multimodal AI for healthcare applications.
Recently, I’ve developed an interest in 3D view synthesis and object reconstruction. I am exploring techniques such as 3D reconstruction (SfM, MVS, SLAM), camera parameter estimation (PnP, bundle adjustment), novel view synthesis (NeRF, 3DGS), radiance fields, and 3D rasterization, with a focus on improving performance, accuracy, and detail in 3D object reconstruction.
I hold a Bachelor’s degree in Computer Science and Engineering from Daffodil International University and am driven by a passion for innovation and advancing AI and Computer Vision technologies.
- Our team has been selected as a semi-finalist in the CivX Data to Decision Challenge: Harnessing Digital Twin Data for Intelligent Decisions.
- Our work on the Advanced Tool Learning and Selection System (ATLASS) has been accepted as a full paper for presentation at CISOSE 2025.
- Starting a new position as a Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd.
- Our work on the enhancement of Bengali OCR was accepted at WACV 2024, Workshop on Vision-Based Understanding for Low-Resource Languages.
- I am serving as a Program Committee member for ICMLA 2024.
- Our work on OCR for versatile documents was accepted at IEEE BigData 2023.
- Our work on the biggest Bangla character- and word-level dataset was accepted at the EMNLP 2023 Industry Track.