We are GARVIS, a revolutionary new startup, using the power of Artificial Intelligence and Machine Learning to redefine the effectiveness with which organizations can understand and predict their customers. We start with revolutionizing Demand Planning by making sense of all data inside and outside the organization and putting the planner in control. Our customer base is rapidly expanding, which means that now is the right time to scale our core product team.
We are looking for a Director for our Platform engineering team. The ideal candidate is a highly motivated engineer and technical leader with broad experience in cloud computing, software development, deployment, and DevOps, as well as an understanding of how enterprise-grade IT infrastructure works. This person will be involved in the architecture, design, prototyping, and development of the GARVIS platform, including building, automating, testing, and maintaining it, and in ensuring the most effective software development process.
At the heart of the GARVIS platform is a distributed, high-performance engine that is capable of training/inferencing of AI models, prediction/execution of rule-based systems and querying/visualization of big data. The engine is packaged along with a responsive, web-based user interface and a scalable, secure API backend; containerized and deployed into the cloud. As an architect of a multi-tenant, cloud-native and multi-cloud, AI-driven and highly scalable platform, this role is crucial within our organization. You will be part of the core engineering team of an exciting software-as-a-service startup. You will have the opportunity to apply state-of-the-art software development tooling, cloud and infrastructure technologies, work with our developers to bring new features and services into production and scale our infrastructure to meet rapidly increasing demand for our core SaaS product.
The job requires you to lead the way as the technical leader for the team and involves frequent interaction with other cross-functional teams (Product Management, Sales, Marketing, and Customer Support) to design, build, and lead from the front. In this role, you will design and implement strategies for scaling, security, compliance, continuous integration, testing, delivery, monitoring, and feedback of the GARVIS software platform.
Responsibilities:
- Provide organizational leadership for the GARVIS platform team.
- Develop long term strategic vision partnering with the executive team.
- Work closely with product, operations, quality, and customer support teams.
- Help drive the design and build of distributed, scalable, fault-tolerant software systems
- Participate in the entire software lifecycle – development, testing, CI, and production operations
- Balance between product feature development and production operational concerns like writing run books, ops automation, structured logging, instrumentation for metrics and events
- Dive deep into service scale issues, and ensure and promote best practices for the resiliency and reliability of our services.
- Work closely with customers to understand and resolve complex platform issues.
- Take the initiative and be responsible for delivering complex software by working effectively with the team and other stakeholders.
- Hire, build and mentor the best talent for the company.
Required Qualifications:
- You have prior experience working on internet scale distributed systems.
- You have 5+ years of experience as a manager of large teams working on cloud services.
- You have 10+ years of experience with Cloud Computing, System Design, and Object-Oriented Design.
- You have production experience in cloud-based ML/AI systems.
- Deep understanding of data structures, algorithms, and excellent problem-solving skills.
- Demonstrated ability to work in an ambiguous environment and build strategy and organizational buy-in without a clear path ahead.
Preferred Qualifications:
- Experience building RESTful microservices and deep understanding of building cloud-based services (multi-tenant architecture, autoscaling, autonomously driven systems, on-the-fly monitoring, running long-running, compute-intensive processes)
- Experienced at building highly available services, possessing knowledge of common service-oriented design patterns and service-to-service communication protocols
- You are familiar with components of modern infrastructure like service discovery, secret storage, software-defined networking, etc.
- Strong knowledge of Docker/Kubernetes, and of building and deploying with Terraform and Ansible.
- You have experience with production operations and good practices for putting quality code in production and troubleshooting issues when they arise
- Hands-on experience with Data Science, Machine Learning and Statistical systems and Python packages such as pandas, scipy, scikit-learn, pytorch, dask, pyarrow, ray is a strong plus.
- Experience working in software engineering, with the ability to demonstrate best practices for project management, quality control, and product development.
- Proven track record of collaborative development in an agile team environment.
- Excellent verbal and written communication skills in English.
At Garvis, we really value team spirit: transparency and frequent communication are key to flourishing. This is not limited by hierarchy, distance, or function, and there is a huge opportunity to grow and become part of the company’s success story. Garvis is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations, and ordinances.