Rajasvi Vinayak Sharma

Machine Learning Engineer at Signos.

As a full-time Machine Learning Engineer at Signos, I architect and deploy end-to-end ML solutions combining deep learning models, LLMs, and recommendation systems. I've designed predictive models that directly impact over 10,000 users, built sophisticated RAG pipelines using LangChain for processing millions of meal logs, and developed hybrid recommendation engines that leverage multiple data signals.

I graduated from University of California - San Diego with a master's in Electrical & Computer Engineering, majoring in Machine Learning & Data Science. I have worked on projects that involve natural language processing, time series analysis, and recommendation systems, and have experience with big data technologies such as Apache Flink and PySpark. I am also familiar with tools for MLOps, such as Kubeflow and MLflow, and have experience working with cross-functional teams to integrate machine learning into business processes.

As a Data Scientist Summer Intern on Nvidia's Cloud Gaming Analytics Team (GeForce Now), I developed a time-series anomaly detection tool and improved in-place A/B test analysis using causal inference ML models. I also built metrics for tracking user engagement using regression analysis and other data science tools.

Before graduate school, I worked as a Data Scientist/Analyst at Goldman Sachs for 3+ years, where I solved Big Data engineering and NLP problems to improve search engine capabilities and surveillance. I also built CRF models for entity recognition and enriched the company's knowledge graph.

Before working full-time, I pursued my Bachelor's at the Indian Institute of Technology (Banaras Hindu University), Varanasi. During the pre-final year of my undergraduate studies, I interned with the Bixby AI team at Samsung R&D Institute - Noida, optimising CNN architectures for porting ML models to the Android ecosystem. In my sophomore year, I did a research internship at the High-Performance Computing Lab of the Indian Institute of Space Science and Technology (IIST).

Feel free to check out my Resume and drop me an e-mail if you want to chat with me!

~ Email | Resume | Github | LinkedIn ~

April '23	Joined Signos as a full-time Machine Learning Engineer.
March '23	Graduated with an M.S. degree from University of California, San Diego.
Sept '22	Started working as a Teaching Assistant for the course MGTA-495: Accounting Data Analytics for Fall 2022.
June '22	Joined Nvidia's Cloud Gaming Analytics Team as a Data Scientist Summer Intern!
April '22	Started working as a Teaching Assistant for the course MGTA-495: Accounting Data Analytics under Prof. Mario Milone.
Jan '22	Began working as a Teaching Assistant for the course MGTA-455: Customer Analytics under Prof. Vincent Nijs.
Sep '21	Started Fall 2021 Quarter - U.C. San Diego!
July '21	Completed my 3 years working as a full-time Analyst at Goldman Sachs.

Machine Learning Engineer | Signos
April '23 - Present

Designed a deep learning model for post-meal glucose spike prediction and exercise recommendations, helping >10,000 users minimize glucose spikes and lose weight. Built RAG pipelines using LangChain for feature extraction from >1M meal logs to improve prediction accuracy. Developed a hybrid collaborative filtering recommendation engine using continuous glucose data and user activity logs.

Data Scientist Intern | Nvidia
June '22 - Sept '22

Developed a time-series anomaly detection tool to alert on malicious activities in 1000+ categories across 10M+ gaming sessions, reducing response time from a few months to 1 week. Improved in-place A/B test analysis by creating a tool that identifies the most affected sub-population using causal inference ML models such as S-Learner, T-Learner, and Double ML methods (using CausalML). Built user engagement metrics using regression analysis and game completion modelling to identify and target disengaged, downgrading users. The metrics track a user’s local and absolute engagement as they progress through a game.

Data Scientist, Analyst | Goldman Sachs
July '18 - Aug '21

On the Search Engineering Team at Goldman Sachs, I contributed to developing a real-time Big Data pipeline that processes internal e-communications (emails, Skype IMs, etc.) for the in-house natural language surveillance search engine. I was mainly responsible for enriching the e-communications pipeline by building scalable Machine Learning signals, such as spam/automated-message classifiers, and by mining trade-related information used to improve our search engine capabilities.

Summer ML Intern | Samsung R&D Institute - Noida
May '17 - July '17

Worked in the Bixby AI Team, building and optimizing image-classification models and CNN architectures using TensorFlow, mainly to create portable models that can be integrated into the Android ecosystem to classify images offline (based on the CIFAR-10 dataset). The custom-built model achieved an accuracy of 82% and occupied a mere 7 KB on the phone with offline prediction capability.

Summer Intern | Indian Institute of Space Science & Technology (IIST)
May '16 - July '16

Worked under the supervision of Dr. Sumitra S. at the High Performance Computing Lab. Studied ML theory and implemented algorithms from scratch, specifically ensembles such as Random Forest and AdaBoost. Performed a comparative performance analysis and verified the findings against the research paper by Fernandez-Delgado et al.

University of California - San Diego

Master of Science | Electrical & Computer Engineering
Major: Machine Learning & Data Science
Sep '21 - Mar '23

Relevant Coursework:
Deep Learning for Natural Language Understanding • Recommender Systems and Data Mining • Statistical Natural Language Processing • Data Mining & Big Data Analytics • Convex Optimization • ML: Learning Algorithms • Statistical Learning

Indian Institute of Technology (Banaras Hindu University), Varanasi

Bachelor of Technology | Electronics Engineering
July '14 - May '18

Clickbait Spoiler Generation using Question Answering
PyTorch, Hugging Face, Plotly-express, Jupyter Notebook | Sep. 2022 - Dec. 2022 [code][report]

Clickbait spoiling aims to generate short texts that satisfy the curiosity induced by a clickbait post. This project is a derivative of the Clickbait Challenge 2023 at SemEval 2023 and addresses two subtasks from the challenge: spoiler classification (phrase, passage, multi-line) and spoiler generation. I primarily worked with various Hugging Face Question Answering and Paragraph Ranking models to generate the spoilers for clickbait articles.

Sequence Tagging with Hidden Markov Model
Python, Pandas, Plotly-express, Jupyter Notebook | Mar. 2022 - May 2022

Developed a trigram HMM class that estimates emission and transition probabilities and decodes with the Viterbi algorithm for sequence tagging. Extended the HMM with smoothing techniques such as Laplace, Katz Back-off, and Linear Interpolation. Implemented a context-aware N-gram language model (LM) class and analysed its performance on Out-of-Domain and In-Domain text.

Named Entity Recognition (NER) with BiLSTM CRF
PyTorch, Pandas, Plotly-express, Jupyter Notebook | Mar. 2022 - May 2022

Built a custom model that uses a BiLSTM for feature representations combined with a CRF for the CoNLL-2003 NER Shared Task. Implemented the forward algorithm, using emission and transition potentials to compute the partition function, and the Viterbi algorithm for decoding.

Neural Collaborative Filtering for Recommendation Systems
PyTorch, Pandas, Plotly-express, Jupyter Notebook | Jan. 2022 - Mar. 2022 [code][report]

Implemented Neural Collaborative Filtering models in PyTorch from the original paper: He, Xiangnan, et al. "Neural collaborative filtering." Proceedings of the 26th international conference on world wide web. 2017. I built three models: Generalized Matrix Factorization (GMF), Multi-Layer Perceptron (MLP), and Neural Matrix Factorization (NeuMF). After implementation, I ran predictions and a comparative evaluation on the MovieLens 1M dataset.

Mini Project Series | AI: Learning Algorithms
PyTorch, scikit-learn, Pandas, Plotly-express, Jupyter Notebook | Jan. 2022 - Mar. 2022 [code][report]

Worked on a series of mini projects exploring and experimenting with various ML algorithms. Details for the individual projects: Prototype Selection algorithm for 1-NN | Coordinate Descent

Adverse Food Events Analysis
Pandas, NumPy, Plotly, Jupyter Notebook | Sep. 2021 - Dec. 2021 [code][slides]

Performed a detailed analysis of FDA data on Adverse Food Events from 2004-2020 to help users stay aware of potential health risks before purchasing a product. Extensive EDA on serious outcomes revealed factors, such as product category and age, associated with adverse events.

Bloomberg Trader Chat Analysis to predict location & jurisdiction violations
Spark-NLP, PySpark, scikit-learn, spaCy | Jan. 2021 - Apr. 2021

Extracted semantic and temporal information from Goldman’s trader conversations (>6 million per day) to build an ML model that resolves external traders' geographic locations with 78% precision and flags possible jurisdiction violations. Used various data-mining techniques to extract relevant entities, which contributed to feature engineering and model optimization.

Brawlhalla Elo-Tracker webapp
Node.js, React, MERN stack | Feb. 2021 - Apr. 2021 [code][site]

Developed a webapp based on the MERN stack that uses the Brawlhalla API to track player ratings; deployed independent frontend and backend servers on Netlify & Heroku. Added the ability to track individual as well as group ratings interactively, along with a leaderboard view.

Graduate Teaching Assistant | MGTA-455: Customer Analytics
Jan '22 - Mar '22 | Jan '23 - Mar '23

Worked under the supervision of Prof. Vincent Nijs at the Rady School of Management - UC San Diego. The course teaches modern analytics-driven approaches (use of data, statistics, and machine learning) to marketing and provides the hands-on experience required to collect, analyze, and act on customer data.

Graduate Teaching Assistant | MGTA-495: Accounting Data Analytics
Mar '22 - June '22 | Sept '22 - Dec '22

Worked under the supervision of Prof. Mario Milone at the Rady School of Management - UC San Diego. The course covers a wide range of statistical and artificial intelligence tools, from regression analysis to machine learning and reinforcement learning, with an emphasis on the use of these tools in the accounting profession and on accounting data.

This template is a modification to Jon Barron's website. Find the source code to my website here.