Publications

2023

  • Book:

    • Modern Data Pipelines Testing Techniques: A Visual Guide (book page)

  • Blog posts and Conference Talks:

    • Modern Data Pipelines Testing Techniques: Why Bother? 1/3 (blog post)

    • Modern Data Pipelines Testing Techniques: Why Bother? 2/3 (blog post)

    • Modern Data Pipelines Testing Techniques: Why Bother? 3/3 (blog post)

2022

  • Work-in-progress Book:

    • Modern Data Pipelines Testing Techniques: A Visual Guide (book page)

  • Blog posts and Conference Talks:

2021

2020

  • Blog posts:

    • ML Feature Stores: A Casual Tour. Part 1 (blog post)

    • ML Feature Stores: A Casual Tour. Part 2 (blog post)

    • ML Feature Stores: A Casual Tour. Part 3 (blog post)

    • Seven Signs You Might Be Creating ML Technical Debt (blog post)

    • KDD 2020 Conference Highlights (blog post)

2019

  • MRR vs MAP vs NDCG: Rank-Aware Evaluation Metrics And When To Use Them. (blog post)

  • Clean Machine Learning Code: Practical Software Engineering Principles for ML Craftsmanship. Towards Data Science Publication. (blog post)

  • Testing ML Code: How Scikit-learn Does It. Analytics Vidhya Publication. (blog post)

  • Avoiding the “Automatic Hand-off” Syndrome in Data Science Products. Towards Data Science Publication. (blog post)

  • Deep Learning for Recommendation Systems circa 2018–19: A Navigation Map. (blog post)

  • k8s-workqueue: Simplified Kubernetes Batch Jobs For Data Science Use Cases. Xandr Tech Publication. (blog post)

    • Presented at the 2019 International Conference on Machine Learning, Predictive Applications and APIs (PAPIs 2019)

2018

  • Moussa, Taifi, et al. "Lessons Learned from Building Scalable Machine Learning Pipelines", 2018, International Conference on Machine Learning, Predictive Applications and APIs (blog post, recording) (to appear in PMLR)

  • Structuring a “Docker for Data Science” Training Journey. AppNexus Tech Publication. (blog post)

  • Introduction to PyTorch Model Compression Through Teacher-Student Knowledge Distillation (blog post)

  • Individual Team Contribution paper:

    • Sanzgiri, Ashutosh, et al. "Classifying Sensitive Content in Online Advertisements with Deep Learning", 2018, The 5th IEEE International Conference on Data Science and Advanced Analytics.

2016

  • Individual Team Contribution paper:

    • Austin, Daniel, et al. "Reserve price optimization at scale." 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 2016.

2015

PhD thesis

  • M. Taifi, "Stateless Parallel Processing Architecture for Exascale and Auction-based HPC Clouds", PhD Thesis, Temple University, 2013

2013

2012

2011

2010

  • M. Taifi and Y. Shi, "GPU-CPU High Performance Computing Through Fault Tolerant Decoupling: Preliminary Results", Poster, Future of Computing Symposium, Temple University, March 2010.

2009

  • M. Taifi and Y. Shi, "How to achieve a 47000x speed up on the GPU/CUDA using matrix multiplication," Technical Report, Amax Corporation, June 2009.

  • M. Taifi and Y. Shi, "Performance Prediction and Evaluation of a Solution Space Compact Parallel Program using the Steady State Timing Model", Poster, CST Student Research Symposium, Temple University, March 2009.

2008

Master's Thesis

  • M. Taifi, "Opinion Mining", Master's Thesis, Lappeenranta University of Technology, 2008

