MLOPS ENGINEER (LLM SERVING AND INFRASTRUCTURE)
Job Description
Your Mission.
At CloudWalk, we're at the cutting edge of AI, pioneering the use of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to drive innovation.
As an MLOps Engineer, you will play a critical role in operationalizing the visionary work of our LLM Data Scientists.
Your expertise will ensure the smooth deployment, efficient management, and scalable performance of LLMs across our extensive infrastructure.
Your contributions will turn advanced AI research into scalable, high-performance solutions, with a particular focus on optimizing network communication and parallel processing capabilities.
What You’ll Do.
Deploy and Manage LLMs.
Employ Kubernetes, Terraform, and cloud services to deploy and scale LLMs efficiently, ensuring their adaptability to high-demand scenarios.
Optimize Computing Infrastructure.
Focus on enhancing GPU utilization, distributed training, bandwidth efficiency between machines, and VPC connections to maximize system performance.
Leverage Cutting-Edge Technologies.
Utilize libraries such as Hugging Face's Accelerate and PyTorch's torchrun to facilitate parallel training across multiple machines in a cluster, optimizing our AI models' training and inference processes.
Collaborate on Innovation.
Partner with our R&D team to transition LLM and RAG technologies from conceptual stages to scalable, production-ready systems.
Monitor and Improve System Performance.
Implement advanced monitoring and logging practices to ensure system reliability and performance, continuously seeking improvements.
Stay Updated on Industry Advances.
Actively pursue the latest developments in MLOps, cloud computing, and AI technologies to implement innovative solutions and maintain our infrastructure's leading edge.
Technologies You Will Work With.
Kubernetes, Terraform, and cloud computing platforms for scalable AI model deployment.
CI/CD pipelines, Git for version control, and Bash scripting for operational efficiency.
Hugging Face's Accelerate and PyTorch's torchrun for parallel training and optimization across multiple machines.
A comprehensive understanding of network infrastructure to optimize bandwidth and secure VPC connections is essential.
What We Expect From You.
Technical Mastery.
Solid experience with DevOps, cloud infrastructure, and deploying machine learning models.
Expertise in network optimization and parallel computing is crucial.
Problem-Solving Mindset.
The ability to navigate complex challenges, strategically manage resources, and improve system efficiency.
Collaborative Approach.
Strong communication skills and the ability to contribute effectively within a dynamic, interdisciplinary team.
Lifelong Learner.
A commitment to continuous learning, staying abreast of the latest technological advancements, and applying innovative solutions.
Why CloudWalk?
By joining CloudWalk, you become part of a team that's reshaping the future with technological innovations.
We cherish creativity, teamwork, and a dedication to excellence.
Here, your work contributes to a mission of driving forward technological advancements.
Dare to innovate, dare to impact, dare to join the Wolfpack.
Apply now!
Job Details
- CloudWalk
- Unspecified
- 13/12/2024
- 13/03/2025