cv

General Information

Full Name: Himanshu Thakur
Location: Pittsburgh, PA
Email: hthakur [AT] andrew [DOT] cmu [DOT] edu
Languages: English, Hindi, German

Education

  • 2023
    Master of Computational Data Science
    Carnegie Mellon University
    • Select Courses
      • Large Language Models
      • Multimodal Machine Learning
      • Deep Reinforcement Learning
      • Introduction to Deep Learning
      • Introduction to Machine Learning (PhD)
      • Cloud Computing
      • Foundations of Computational Data Science
    • Academic Services
      • Reviewer for EMNLP 2023
      • Core Committee Member, LTI Computing Infrastructure
  • 2020
    Bachelor of Computer Science and Engineering
    Vellore Institute of Technology, Vellore
    • Select Courses
      • Machine Learning
      • Artificial Intelligence
      • Image Processing
      • Neural Networks and Fuzzy Logic
      • Parallel and Distributed Computing
      • Virtualization
      • Data Structures and Algorithms
      • Operating Systems
      • Databases
      • Networks and Communications
      • Calculus
      • Linear Algebra
      • Statistics and Probability
      • Discrete Mathematics and Graph Theory
      • Theory of Computation and Compiler Design
    • Extracurricular Experiences
      • Researcher at ACM-VIT
      • Core Committee Member, SEDS-VIT
      • Founding Member at Team Qubits
    • Achievements
      • G. D. Naidu Young Scientist Award, 2020
      • Sir M. Visvesvaraya Award, 2019
      • Secured 1st position in 7 national hackathons
      • Ranked in Top 3 teams at 3 national hackathons
      • Secured 2 patents

Research Experience

  • May 2023 - Aug 2023
    Research Scientist Intern
    Abacus.AI
    • Conceptualized and led research on exploring linear properties of LLM adapters (soft prompts, LoRAs)
    • Invented a novel learning algorithm to approximate a linear combination of pre-trained LoRAs to enhance generalization and multi-task performance
    • Collaborated with researchers and developed new metrics for debunking staleness in popular NLP benchmarks due to memorization in LLMs
  • Jan 2023 - July 2023
    Graduate Research Assistant
    Robotics Institute, Carnegie Mellon University
    • Led and mentored a team of 6 research assistants to improve the generalization and robustness of computer vision tasks (segmentation, classification, and tracking) for robotic e-waste disassembly at the Biorobotics Lab
    • Devised a multimodal deep neural network architecture for semantic segmentation
    • Developed a new loss function robust to label noise, improving baseline performance by 15% IoU and generalization by 20% IoU
    • Implemented an out-of-distribution detection algorithm achieving 98.6% accuracy
  • Feb 2021 - Feb 2022
    Research Intern
    SketchX Lab, University of Surrey
    • Invented a novel active learning algorithm for fine-grained cross-modal instance-level retrieval tasks
    • Increased mean top-1 accuracy by 5% over the state-of-the-art technique; published a first-author paper at a top-tier conference (BMVC)

Work Experience

  • Aug 2020 - July 2022
    Senior Technology Associate
    Morgan Stanley
    • Developed a distributed deep learning framework for real-time failure prediction (time-series forecasting) and detection
    • Achieved a 5000x reduction in failure detection time and enabled prediction of batch job failures with 94% accuracy; awarded best contributor to global firm resiliency out of 5000+ nominees
    • Pioneered graph database as a service to enable 100x faster search and data lineage applications
    • Researched and developed a domain-agnostic recommendation engine (using large-scale, distributed training of deep learning models) powered by graph databases; recognized as the most innovative firmwide project
  • Dec 2019 - July 2020
    Data Scientist Intern
    Locale.AI
    • Developed a novel clustering algorithm for human activity discovery from raw geo-spatial ping data using semi-supervised representation learning, enabled automated discovery of 18% new business areas from previously unused data.
    • Productionized and scaled a time-series anomaly detection framework using temporal convolutional networks (TCN), increased request throughput by 6x, and helped 10 companies reliably predict vehicle vandalism and trip delays.
  • Jan 2020 - June 2020
    Machine Learning and Web Development Intern (Part-Time)
    S4S Technologies
    • Built the team's entire internal operations dashboard as well as a computer vision solution for food quality assurance.
  • May 2019 - July 2019
    Summer Intern
    Sigmaways Inc
    • Worked on an AI-based portfolio recommendation engine using an LSTM-based predictor and a metaheuristic optimizer. Also built a chatbot interface to the service.
  • May 2018 - July 2018
    Software Engineering Intern
    Nova (P & D)
    • Modernized the existing software capabilities and introduced a new method for speedy billing.
    • Worked on data analytics and software development to create a new Edge-Billing system that allowed multiple staff members to finalize a bill directly on their smartphones.
    • Enabled simultaneous billing by multiple staff members, speeding up the process, and developed software features that gave better insight into sales and purchases.

Open Source Projects

  • 2020
    AI on the Beach
    • Developed a data visualization library that can render over 10 GB of data into a single animated scatter plot within minutes.

Honors and Awards

  • 2020
    • G. D. Naidu Young Scientist Award
  • 2019
    • Sir M. Visvesvaraya Award

Other Interests

  • Hobbies: Music Composition, Hiking, Travelling