I am a Computer Vision Researcher and Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd, focusing on multimodal clustering, physics-guided image clustering, anomaly detection in satellite images, multimodal large language models, agentic applications, zero-shot learning, automated tool generation for LLM agents, and risk prediction in AI-generated code.

Previously, at Apurba Technologies, I specialized in Optical Character Recognition (OCR), developing solutions for text recognition, document layout analysis, and post-OCR correction using LLMs. My work also included leveraging GANs for image enhancement and building multimodal AI for healthcare applications.

Recently, I’ve developed an interest in 3D view synthesis and object reconstruction. I am exploring techniques such as 3D reconstruction (SfM, MVS, SLAM), camera parameter estimation (PnP, bundle adjustment), novel view synthesis (NeRF, 3DGS), radiance fields, and 3D rasterization, with a focus on improving performance, accuracy, and detail in 3D object reconstruction.

I hold a Bachelor’s degree in Computer Science and Engineering from Daffodil International University and am driven by a passion for innovation and advancing AI and Computer Vision technologies.


News
  • Our team has been selected as a semi-finalist in the CivX Data to Decision Challenge: Harnessing Digital Twin Data for Intelligent Decisions.
  • Our work on the Advanced Tool Learning and Selection System (ATLASS) has been accepted as a full paper for presentation at CISOSE 2025.
  • Starting a new position as a Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd.
  • Our work on the Enhancement of Bengali OCR has been accepted at WACV 2024, Workshop on Vision-Based Understanding for Low-Resource Languages.
  • I am serving as a Program Committee member for ICMLA 2024.
  • Our work on OCR for Versatile Documents has been accepted at IEEE BigData 2023.
  • Our work on the biggest Bangla Character and Word Level Dataset has been accepted at the EMNLP 2023 Industry Track.

Publications

Gold Standard Bangla OCR Dataset: An In-Depth Look at Data Preprocessing and Annotation Processes

Hasmot Ali, AKM Shahariar Azad Rabby, Md Majedul Islam, A.k.m Mahamud, Nazmul Hasan, Fuad Rahman

EMNLP 2023 Industry Track [Paper] [Poster]  

Covid-19 Dataset: Worldwide spread log including countries first case and first death

Hasmot Ali, Md Fahad Hossain, Md Mehedi Hasan, Sheikh Abujar

Data in Brief, Volume 32, October 2020 [Paper]  

BanglaSenti: A Dataset of Bangla Words for Sentiment Analysis

Hasmot Ali, Md. Fahad Hossain, Shaon Bhatta Shuvo, Ahmed Al Marouf

ICCCNT 2020 [Paper]  

Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types

AKM Shahariar Azad Rabby, Hasmot Ali, Md. Majedul Islam, Sheikh Abujar, Fuad Rahman

WACV 2024, Workshop [Paper]  

Versatile Bengali OCR: Document Analysis Technique for Varied Document Styles and Content

AKM Shahariar Azad Rabby, Hasmot Ali, Md. Majedul Islam, Fuad Rahman

IEEE BigData 2023 [Paper]  

Advanced Tool Learning and Selection System (ATLASS): A Closed-Loop Framework Using LLM

Mohd Ariful Haque, Justin Williams, Sunzida Siddique, Md. Hujaifa Islam, Hasmot Ali, Kishor Datta Gupta, Roy George

IEEE CISOSE 2025 [Paper]  

A Deep Transfer Learning-Based Approach to Detect Potato Leaf Disease at an Earlier Stage

Md Rahmatul Kabir Rasel Sarker, Nasrin Akter Borsha, Md. Sefatullah, Azizur Rahman Khan, Somaiya Jannat, Hasmot Ali

ICAECT 2022  

A Machine Learning Approach to Recognize Speakers Region of the United Kingdom from Continuous Speech Based on Accent Classification

Md. Fahad Hossain, Md. Mehedi Hasan, Hasmot Ali, Md Rahmatul Kabir Rasel Sarker, Md. Toukirul Hassan

ICECE 2021 [Paper]  

Preprocessing of Continuous Bengali Speech for Feature Extraction

Md. Mehedi Hasan, Hasmot Ali, Md. Fahad Hossain, Sheikh Abujar

ICCCNT 2020 [Paper]