Hasmot Ali

Computer Vision Researcher focusing on Multimodal AI, Agentic Applications, Multimodal Clustering, Physics-Guided Image Clustering, and Mitigation of Model & Data Biases.

I am a Computer Vision Researcher and Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd, focusing on multimodal clustering, physics-guided image clustering, anomaly detection in satellite images, multimodal large language models, agentic applications, zero-shot learning, automated tool generation for LLM agents, and risk prediction in AI-generated code.

Previously, at Apurba Technologies, I specialized in Optical Character Recognition (OCR), developing solutions for text recognition, document layout analysis, and post-OCR corrections using LLMs. My work also included leveraging GANs for image enhancement and building multimodal AI for healthcare applications.

Recently, I’ve developed an interest in 3D view synthesis and object reconstruction. I am exploring techniques such as 3D reconstruction (SfM, MVS, SLAM), camera parameter estimation (PnP, bundle adjustment), novel view synthesis (NeRF, 3DGS), radiance fields, and 3D rasterization, with a focus on improving performance, accuracy, and detail in 3D object reconstruction.

I hold a Bachelor’s degree in Computer Science and Engineering from Daffodil International University and am driven by a passion for innovation and advancing AI and Computer Vision technologies.

News

Our work “Continuous Monitoring of Large-Scale Generative AI via Deterministic Knowledge Graph Structures” has been accepted at the AAAI 2025 Fall Symposium Series.
Our team has been selected as a semi-finalist in the CivX Data to Decision Challenge: Harnessing Digital Twin Data for Intelligent Decisions.
Our work on the Advanced Tool Learning and Selection System (ATLASS) has been accepted as a full paper for presentation at CISOSE 2025.
Starting a new position as a Senior Software Engineer (Machine Learning) at Silicon Orchard Ltd.
Our work on the Enhancement of Bengali OCR has been accepted at WACV 2024, Workshop on Vision-Based Understanding for Low-Resource Languages.
I am serving as a Program Committee member for ICMLA 2024.
Our work on OCR for Versatile Documents has been accepted at IEEE BigData 2023.
Our work on the largest Bangla character- and word-level dataset has been accepted at the EMNLP 2023 Industry Track.

Publications

Gold Standard Bangla OCR Dataset: An In-Depth Look at Data Preprocessing and Annotation Processes

Hasmot Ali, AKM Shahariar Azad Rabby, Md Majedul Islam, A.k.m Mahamud, Nazmul Hasan, Fuad Rahman

EMNLP 2023 Industry Track [Paper] [Poster]

Covid-19 Dataset: Worldwide spread log including countries first case and first death

Hasmot Ali, Md Fahad Hossain, Md Mehedi Hasan, Sheikh Abujar

Data in Brief, Volume 32, October 2020 [Paper]

BanglaSenti: A Dataset of Bangla Words for Sentiment Analysis

Hasmot Ali, Md. Fahad Hossain, Shaon Bhatta Shuvo, Ahmed Al Marouf

ICCCNT 2020 [Paper]

Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types

AKM Shahariar Azad Rabby, Hasmot Ali, Md. Majedul Islam, Sheikh Abujar, Fuad Rahman