This job has expired

Principal Data Scientist

Labcorp Drug Development
Raleigh, NC
Closing date
Sep 20, 2023

Science, Mathematics and Statistics
Organization Type
Produce innovative solutions driven by exploratory data analysis of complex, high-dimensional datasets. Apply knowledge of statistics, data modeling, data science and artificial intelligence to recognize patterns, identify opportunities and make valuable discoveries. Use a flexible, analytical approach to design, develop, and evaluate predictive models. Generate and test hypotheses. Own the team's data science expertise and provide mentorship and guidance to the team.

  • Leadership in modeling, technology, technical product development / design, and technical product ownership.
  • Mentor junior team members.
  • Refine and develop team best practices.
  • Develop and deliver presentations to communicate technical ideas and analytical findings to non-technical partners and senior leadership, including underwriters and IT professionals.
  • Build underlying software infrastructure to better manage, integrate and mine the data that Labcorp processes daily.
  • Work closely with engineering teams and, with some supervision, participate in the full development cycle, from product inception, research and prototyping to release in production.
  • Write production-quality code while implementing your own ideas.
  • Drive project execution and implement a robust plan for measuring success. Develop success metrics to be used across projects for the team.
  • Interpret data and learn how to present the stories it tells to others through rich and intuitive visualizations.
  • Develop novel ways of integrating, mining and visualizing diverse, high-dimensional and poorly curated data sets.

  • Experience in data analysis and statistical learning
  • Experience with statistical methodologies and machine learning techniques such as: neural networks, graphical models, ensemble methods and natural language processing
  • Proficiency with Python and R is highly desirable
  • Strong data visualization skills
  • Familiarity with one or more machine learning libraries or frameworks such as: PyTorch, TensorFlow, scikit-learn
  • Programming experience in Java, Python or Perl and experience with relational and unstructured databases is highly desirable
  • Technical proficiency and demonstrated success in scientific creativity, collaboration with others and independent thought
  • Ability to collaborate with the team and translate existing research into practical solutions and products; ability to build and manage relationships with various collaborators across and outside the company
  • Comfortable working with both technical and non-technical staff to translate concepts and algorithms into working prototypes
  • Drive project execution and implement a robust plan for measuring success
  • Lead discussions with senior leadership at the department or functional level

  • Experience in artificial intelligence and statistical learning
  • Experience with statistical methodologies and machine learning techniques such as: neural networks, graphical models, ensemble methods and natural language processing
  • Experience with multiple deep learning techniques such as CNN, LSTM, RNN, etc., in addition to standard machine learning approaches such as those found in scikit-learn
  • Mastery of evaluation techniques for supervised and unsupervised methods. Knows how to evaluate the quality of data and identify gaps in data or assumptions.
  • Proficiency with Python. Can develop meaningful Python code using object-oriented programming and functional programming. Writes tests for code. Can debug errors quickly.
  • Strong data visualization skills.
  • Familiarity with one or more machine learning libraries or frameworks such as: PyTorch, TensorFlow.
  • Experience with relational and unstructured databases is highly desirable.
  • Experience using cloud technologies such as AWS with tools such as S3, Lambda, Athena, API Gateway, SageMaker, Glue
  • Languages: Python, R, SQL, Spark (PySpark)
  • Packages: scikit-learn, pandas, NumPy, SciPy, TensorFlow, PyTorch, spaCy, Snorkel, H2O, Spark/PySpark, Spark MLlib, lifelines, Matplotlib, seaborn, statsmodels, Theano, Keras, NLTK, fastText, Gensim, OpenCV, Prophet, Plotly, JupyterLab
  • Cloud: S3, Athena, Glue, EC2, SageMaker, Step Functions, API Gateway, Jenkins, Lambda, Urban, Code Deploy, Artifactory, Veracode, Kubernetes, Docker, ECS, EKS, Git/Bitbucket, Hugging Face, QuickSight, VPC, SNS, EBS, Kinesis, IAM, CloudWatch, Splunk, Ground Truth
  • Technologies: git, Jira
  • Techniques: Machine learning, Natural language processing, Natural language understanding, Transformer models, Deep learning, CNN, LSTM, RNN, GAN (generative adversarial networks), Deep reinforcement learning, Self-organizing maps, Autoencoders, Boltzmann machines, Random Forest, XGBoost, AdaBoost, Decision Tree, Active learning, Contextual bandit, Hadoop, A/B Testing, Exploratory Data Analysis (EDA), ETL (extract, transform, load), Data Warehouse, Hypothesis testing, Cross-validation, Fuzzy logic, Fuzzy matching, Probabilistic Graphical Modeling, Generative models, Optical Character Recognition, Sequence-to-Sequence modeling, Kernel methods, Topological Data Analysis, AutoML, Dimensionality Reduction, Uniform Manifold Approximation and Projection, Objective function, Linear programming, Constrained programming, Graph Neural Networks, Message Passing Interface, Semi-supervised learning, Hierarchical learning, Stacked models, Generalized linear models, Explainable models, Shapley values, Causal Inference, Autoregression, CatBoost models, One-shot learning, Transfer learning, Anomaly detection, Annotation, Ensemble models, Churn models, Propensity models, Sentiment analysis, Tokenization, Named entity recognition, Span detection, Question-answer models, Part-of-speech tagging, Stemming, Lemmatization, Classification, Regression, Bayesian inference, Causal modeling, Graph databases, MLOps, Fraud Detection, Geospatial analysis, Geospatial clustering, Network analysis, Data lake, Feature lake, Data engineering, Chatbots, Synthea, PhysioNet, Real-time streaming, IoT (internet of things), Time series forecasting, Biomedical text mining, Unsupervised modeling, Principal Component Analysis, Isolation Forest, KNN, SVM, Clustering, Feature engineering, Feature selection, Lasso, Ridge, Linear regression, Logistic regression, Support Vector Machines, Cox proportional hazards models, Reinforcement Learning
  • Clinical: EMR (Electronic Medical Record), IRB / IACUC, Clinical trial study design, Diagnostic testing, Epidemiology, Precision medicine, Diagnostics, Public health, Microbiology, Registry, Observational studies, Clinical Trial Recruitment, Risk modeling, Survival Analysis, HIPAA, PHI, GDPR, Digital Health, Human Factors, Econometrics, Translational research, Real-world data / Real-world evidence, FDA, NIH, 510(k)

  • Advanced degree is required in Computer Science, Engineering, Statistics, Math or related field
  • Must be able to provide evidence of relevant research expertise in the form of presentations, software, technical publications, and/or knowledge of applications.
  • Master's degree with at least 5 years of experience, or Ph.D. with 3 years of experience, in a data science setting.

Pay Range: $132,254 - $201,100 Annually

Benefits: All job offers will be based on a candidate's skills and prior relevant experience, applicable degrees/certifications, as well as internal equity and market data. Regular, full-time or part-time employees working 20 or more hours per week are eligible for comprehensive benefits including: Medical, Dental, Vision, Life, STD/LTD, 401(K), ESPP, Paid time off (PTO) or Flexible time off (FTO), Commissions, and Company bonus where applicable. For more detailed information, please click here.

Labcorp is proud to be an Equal Opportunity Employer:

As an EOE/AA employer, Labcorp strives for diversity and inclusion in the workforce and does not tolerate harassment or discrimination of any kind. We make employment decisions based on the needs of our business and the qualifications of the individual and do not discriminate based upon race, religion, color, national origin, gender (including pregnancy or other medical conditions/needs), family or parental status, marital, civil union or domestic partnership status, sexual orientation, gender identity, gender expression, personal appearance, age, veteran status, disability, genetic information, or any other legally protected characteristic. We encourage all to apply.

For more information about how we collect and store your personal data, please see our Privacy Statement.
Job Summary
Job number: 2328481
Date posted: 2023-08-02
Profession: Information Technology
Employment type: Full-Time
