Parvez Ahammad / Blog

SpeedPerception Challenge: Public Release

7/31/2016

Project website: http://speedperception.com/
To play & participate in the challenge: http://speedperception.meteorapp.com/challenge

SpeedPerception is an open-source, collaborative effort to understand the perceptual aspects of end-user web experience. Our goal is to create a free, open-source benchmark dataset to advance the systematic study of how people perceive the webpage loading process, above-the-fold rendering in particular. Our belief (and hope) is that this benchmark can provide a quantitative basis for comparing different algorithms, and spur computer scientists and web performance engineers to make progress on quantifying perceived webpage performance. We plan to open-source all of the collected data and analysis once there is sufficient participation. Please share this post; we need as many people to participate as possible.

-- Parvez Ahammad, Clark Gao, Prasenjit Dey, Estelle Weyl, Pat Meenan


Pairwise Correlations in HTTP Archive: Pearson or Spearman? Part 1

7/19/2016

In our group, we have been interested in how the structure of a webpage influences its performance on the web. It is, in my opinion, one of the key questions at the heart of distributed web application delivery. Thanks to amazing resources like HTTP Archive and BigQueri.es, it is straightforward to access and play with large-scale web performance data (measured twice a month across 400,000+ websites and made available for free!).

Probably the simplest (if not the best) model one could consider in studying the structure-performance relationship of webpages is to look at the pairwise correlations between various measures of webpage structure (like the number of images, or the amount of image or JavaScript bytes) and various page-level performance metrics (like startRender, PageLoadTime / onLoad, or Speed Index). HTTP Archive has a nice set of visualizations on its Interesting Stats page that provide this information for a few chosen metrics. While going through these plots, Clark and I started to wonder which correlation measure HTTP Archive actually uses, and whether it is an appropriate choice. I wrote to Steve Souders (the creator of HTTP Archive), and he responded very quickly with links to the code and some associated information. The code, and our own calculations (using public HTTP Archive datasets), suggest that HTTP Archive uses the Pearson correlation for measuring pairwise structure-performance relationships. See the plots from Clark below, so that you can judge for yourself.
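To make the Pearson-versus-Spearman distinction concrete, here is a minimal sketch using made-up stand-in data rather than the actual HTTP Archive tables; the variable names are hypothetical. Pearson measures linear association on the raw values, while Spearman measures monotonic association on the ranks, so the two can diverge on the skewed, heavy-tailed distributions that web performance metrics typically follow:

```python
# A minimal sketch (not HTTP Archive's actual code) comparing Pearson and
# Spearman correlation on skewed, heavy-tailed stand-in data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical stand-ins: image bytes per page, and a load-time metric with
# a monotonic but non-linear dependence on it, plus heavy-tailed noise.
image_bytes = rng.lognormal(mean=13.0, sigma=1.0, size=1000)
onload_ms = 500 + 0.002 * image_bytes**0.9 + rng.exponential(scale=300.0, size=1000)

pearson_r, _ = stats.pearsonr(image_bytes, onload_ms)    # linear association on raw values
spearman_r, _ = stats.spearmanr(image_bytes, onload_ms)  # monotonic association on ranks

print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

On data like this, a handful of extreme pages can dominate the Pearson estimate while leaving the ranks (and hence Spearman) largely unchanged; whether that makes one measure more appropriate for HTTP Archive is exactly the question this series digs into.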


Perceptual Speed Index (PSI) for measuring above-fold visual performance of webpages

7/2/2016

[Image: slide from Paul Irish's talk, with a hand-drawn circle highlighting this post's focus]
Note: This blog post describes collaborative work with contributions from me, Clark Gao, Matthew Mok and Karan Kumar (all at Instart Logic Inc.). The image above is borrowed from Paul Irish's talk slides, with my hand-drawn squiggly circle highlighting this blog post's focus.

Sometime last year, when I started to dig deeper into web performance and into how human end-users perceive the process of webpage loading, it quickly became clear that the typical W3C standard metrics weren't sufficient. I also learned (painfully) that popular page-level metrics like onLoad can easily be gamed via scripting tricks. I became particularly fascinated with the problem of measuring the above-the-fold webpage loading process, which is almost a computer vision problem (though my computer vision friends don't know about it yet). A couple of early discoveries kept me going.
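For context on what measuring above-the-fold loading means computationally: the standard Speed Index from WebPageTest integrates visual incompleteness over time, so a page that paints meaningful pixels early scores better. Here is a minimal sketch of that computation on a made-up frame timeline:

```python
# A minimal sketch of the Speed Index computation (per the WebPageTest
# definition): the integral over time of (1 - visual completeness).
# The frame timeline below is made up for illustration.

def speed_index(timeline):
    """timeline: (time_ms, visual_completeness in [0, 1]) pairs,
    sorted by time and ending at full completeness."""
    si = 0.0
    for (t0, c0), (t1, _) in zip(timeline, timeline[1:]):
        # Area of the "incomplete" region between consecutive frames,
        # holding completeness constant until the next sample.
        si += (1.0 - c0) * (t1 - t0)
    return si

frames = [(0, 0.0), (500, 0.1), (1200, 0.6), (2000, 0.9), (2600, 1.0)]
print(speed_index(frames))  # 1510.0; lower is better
```

How "visual completeness" itself should be estimated is where the perceptual questions come in, and that is the thread this post picks up.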


"Rube Goldberg Machine Learning" comes to web performance analytics

6/30/2016

Note: The title of this article is a play on the term "Rube Goldberg machine". According to Wikipedia, a Rube Goldberg machine is a contraption, invention, device, or apparatus that is deliberately over-engineered to perform a simple task in a complicated fashion, generally including a chain reaction. Keep that in mind.

Everyone who cares about ML knows about supervised, unsupervised, and semi-supervised learning pipelines. I have now come across an entirely new class of ML pipelines that I shall call "Rube Goldberg Machine Learning" pipelines. Before I explain what I mean, let me provide some context.

Last week, I attended the Velocity conference in Santa Clara, CA. For those who are unfamiliar, Velocity is a popular enterprise-oriented (non-academic) conference focused on web performance and DevOps topics. I was very excited to see a machine learning talk in the program: "Using machine learning to determine drivers of bounce and conversion". Apparently it was the first ML talk ever at this venue (for a field that produces an insane amount of data, I don't know why ML doesn't show up more often at Velocity). So, yay! Given what I know about the prior work of Pat Meenan and Tammy Everts, I had high hopes for the talk and the potential findings. Neither of them is an ML practitioner, but both have done stellar work in the web performance community. Unfortunately, my excitement quickly dissipated within the first few slides (click the link to the slides if you are curious). Several web performance folks told me informally that the conclusions of the talk didn't seem right, because they appeared to go against conventional wisdom in the web performance field. I have no problem with going against conventional wisdom; sometimes it is good to correct long-held misconceptions when there's solid evidence. My disappointment mainly stems from the misuse of ML models in this talk. Instead of simply venting, let me break down the good, the bad, and the ugly aspects of this talk:

The Good:
  • It's the first ML talk at Velocity. Yay!
  • Tammy Everts and her team put together a nice dataset of session-by-session web performance metrics for commercial websites, along with business-critical measures like conversion and bounce rates.
  • Pat Meenan correctly emphasized to the audience that the barrier to entry for playing with ML algorithms is low these days, given the immense amount of work that has gone into various open-source ML libraries.
  • The models used in the talk are posted on GitHub, which is a helpful gesture.

The Bad:
  • No data was shared, and requests for anonymized data sharing weren't welcomed. As a machine learning person (who likes open data sharing), I found this behavior confusing.
  • The shared code on GitHub is a few lines of Python that simply call off-the-shelf ML algorithms; it is really just code that calls some other code. Most ML people could write it themselves, so what's the value in open-sourcing it? There's no algorithmic contribution here, and no real software being open-sourced.
  • The talk was rated very highly by the non-ML audience, and people debated only the web performance aspects of the conclusions. Barely anyone spoke up about the glaring modeling problems. Maybe it is just not common to have ML folks in the audience.
  • This was a clear vendor pitch from Soasta, cleverly masked in the colorful clothes of an ML word cloud. How did this get into Velocity, which supposedly hates vendor pitches?

The Ugly:
  • What happened to good old multivariate regression? Looking at the description of the dataset, I'd bet $0.02 that a straight-up linear regression with multiple variables would have given them 80% accuracy, with something to interpret (see the sketch after this list).
  • Both of the models they tried are complex and hard to interpret, and they are too complicated for the task at hand (hence the title of this blog post). Interpreting DNN variable importance is still an open question anyway, and this talk didn't solve it. Given the title of the talk, and the two models they presented, you honestly wonder why they chose black-box-like models for their experiments.
  • The models didn't account for correlations among the input variables, or for potential group structures, both of which strongly influence variable selection and variable-importance calculations. This means the conclusions they drew, and the variables they believe are important, simply aren't believable. Forget conventional wisdom in web performance; the model-selection methodology itself renders the talk's conclusions moot.
  • There were no ROC curves or accuracy numbers. Really, there was zero information in the talk that a machine learning enthusiast could use to convince themselves that these results can be trusted.
  • The talk title and parts of the presentation implied causal associations, but the methods used in this talk simply cannot provide such insights.
  • It would have been at least interesting if they had tried a Random Forest model restricted to the top-K features found via the Gini index (which I am guessing they used for variable selection). If they wanted to say "here are the top six features", they should have shown how the restricted six-variable model behaves. It's a basic and fixable mistake, and I have already communicated these thoughts to the speakers, so hopefully there will be progress on this front.
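To make the regression point concrete, here is a minimal sketch of the kind of interpretable baseline I have in mind, on synthetic stand-in data (the session dataset itself was not shared, and every feature name here is hypothetical): a logistic regression predicting bounce, an honest ROC AUC check, and a refit of the restricted top-K model.

```python
# A minimal sketch of an interpretable baseline, on synthetic stand-in data.
# All feature names are hypothetical; the "ground truth" is invented so the
# example runs end to end.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 100_000
features = ["start_render_ms", "onload_ms", "num_images", "js_bytes", "dom_elements"]
X = rng.lognormal(mean=7.0, sigma=1.0, size=(n, len(features)))

# Invented ground truth: slower start-render drives bounce (illustration only).
p_bounce = 1.0 / (1.0 + np.exp(-(np.log(X[:, 0]) - 7.0)))
y = rng.binomial(1, p_bounce)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(np.log(X_train))
model = LogisticRegression().fit(scaler.transform(np.log(X_train)), y_train)

# Report an honest accuracy measure (ROC AUC), not just a feature ranking.
auc = roc_auc_score(y_test, model.predict_proba(scaler.transform(np.log(X_test)))[:, 1])
print(f"AUC: {auc:.3f}")

# Coefficients on standardized inputs are directly comparable, giving an
# interpretable notion of variable importance.
for name, coef in sorted(zip(features, model.coef_[0]), key=lambda t: -abs(t[1])):
    print(f"{name:16s} {coef:+.3f}")

# If you claim "here are the top K features", refit the restricted model
# and show that its AUC holds up.
top_k = np.argsort(-np.abs(model.coef_[0]))[:2]
model_k = LogisticRegression().fit(scaler.transform(np.log(X_train))[:, top_k], y_train)
auc_k = roc_auc_score(
    y_test, model_k.predict_proba(scaler.transform(np.log(X_test))[:, top_k])[:, 1]
)
print(f"AUC (top-{len(top_k)} refit): {auc_k:.3f}")
```

None of this is sophisticated, which is exactly the point: every step is auditable, and the restricted-model check directly tests the "top features" claim.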

I could go on about other nitty-gritty details, but let me stop here and summarize. I saw the first-ever ML talk at Velocity, and it didn't really teach me anything about web performance despite analyzing a million sessions' worth of data. The talk generated a lot of buzz for disrupting conventional wisdom, while the models behind it are highly questionable. I think what allowed the talk to fly through is the use of over-engineered, overly complicated ML pipelines that aren't interpretable. This is the class of ML that I am going to call "Rube Goldberg Machine Learning" from now on.

As a recent ICML talk title says: "Friends Don’t Let Friends Deploy Models They Don’t Understand".

    Opinions / Thoughts / Ramblings - all personal.
