G2i/ICA | Sr. Data Scientist - LLM | Remote (US) | Full-Time | https://jobs.ashbyhq.com/g2i/5a3d410c-2ce8-4332-a2cb-8bb4c60... (keep UTM please; tells us how wonderful HN is)
Agency hiring for a client through a fast-lane process. Permanent full-time position. US work authorization is needed due to the Public Trust Clearance requirement.
General Requirements:
- Experience: 7 to 8 years of overall professional experience in data science, analytics, or a related field.
- At least 2–3 years of hands-on experience specifically focused on Large Language Models (LLMs) and related techniques (e.g., fine-tuning, instruction tuning, prompt engineering).
- Ability to obtain a Public Trust Clearance (Required for the role)
- Must be authorized to work in the United States.
Technical Proficiency:
- Coding experience with at least one of the following languages: Python, C, R, Java, or Scala; strong proficiency in Python is highly preferred.
- Proven ability to use statistical programming and query languages (Python, R, SQL, etc.) for data manipulation, analysis, and model development.
- Knowledge of ML and data mining techniques (Regression, Deep Learning, NLP, Time Series Analysis, Social Media Analysis, etc.).
- Familiarity with AWS services (Athena, S3, Glue, SageMaker, Comprehend, Bedrock) for scalable model development.
- Preferred: exposure to MLOps tools, big data technologies (Hadoop, Spark), or other cloud services.
LLM-Focused Skills (Desired, not all required):
- PEFT (LoRA/QLoRA) for efficient model fine-tuning.
- Instruction Fine-tuning, Retrieval-Augmented Generation (RAG), Chain-of-Thought (CoT)/Tree-of-Thought (ToT).
- Quantization, Pruning, and Knowledge Distillation strategies for model optimization.
- Experience with Hugging Face Transformers, LangChain, LlamaIndex, and/or large-scale training frameworks.
- Familiarity with LLM evaluation, interpretability, and best practices.