This Is What Happens When Publishers Invest In Long Stories (fastcolabs.com)
110 points by crisnoble on Aug 11, 2013 | 37 comments



The site appears to have Google Analytics placed multiple times in its source code. This messes up tracking and artificially, and dramatically, lowers bounce rates.

In my experience working with content sites, the real bounce rate reduction you see in moving from shorter articles to longer ones is closer to 5% or so - from, say, 77% to 72%. Still dramatic, but not nearly as dramatic as the drop you'll get by double-inserting GA code.


I bet their automated system is re-inserting GA code every time it updates their stub. It probably copies the entire previous stub, GA code and all, then publishes that plus updates and fresh GA code.

From the article:

Stub stories work like this: You write the first installment like any other story. But when more news breaks, you go back to the article, insert an update at the top, and change the headline and subheadline to reflect the update. Our system updates the story "slug" when the headline changes--check the URL of this story, and you'll see words from the headline in the URL: /this-is-what-happens-when-publishers-invest-in-long-stories. But the number preceding the slug--on this article, it's 3009577--is a unique node ID which never changes. So essentially, every time we update an article, we get a fresh URL with a fresh headline, but pointing back to the same (newly updated) article.

We can check this by checking older stubs.

Edit: on checking the page source, I only see GA in there 2x - once to call a GA API from google-analytics.com, and once more to (I think) load ga.js from google-analytics.com. How many times do you see that page calling GA?
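A rough way to answer that for any saved page source is sketched below. It only counts marker strings from the classic ga.js async snippet, so it can't tell a genuine double insert from, say, a comment that mentions ga.js. The helper name and the sample page are made up for illustration.

```javascript
// Hypothetical helper: counts classic GA markers in a page's HTML source.
// The marker strings are the ones used by the classic ga.js async snippet.
function countGaMarkers(html) {
  const count = (needle) => html.split(needle).length - 1;
  return {
    gaJsLoads: count("google-analytics.com/ga.js"),
    pageviewCalls: count("_trackPageview"),
    accountSets: count("_setAccount"),
  };
}

// Made-up page source with the snippet pasted twice.
const snippet =
  "<script>var _gaq=_gaq||[];" +
  "_gaq.push(['_setAccount','UA-XXXXX-X']);" +
  "_gaq.push(['_trackPageview']);" +
  "(function(){var ga=document.createElement('script');" +
  "ga.src='http://www.google-analytics.com/ga.js';})();</script>";
const page = "<html><head>" + snippet + "</head><body>" + snippet + "</body></html>";

console.log(countGaMarkers(page));
```

Anything above 1 for pageviewCalls on a single page would be worth a closer look.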


I'm not seeing multiple inserts per page, and I checked their 'stub' pages as well. They only have GA in the head.


__utm.gif is apparently requested 3 times with the same Google Analytics account ID.


In Chrome's F12 tools I see three __utm.gif requests from the article itself. According to http://utmgifparser.appspot.com/, the three requests appear identical except for the "X10 data" ('utme'); I'm not sure of the significance of that w.r.t. GA reporting. Any GA experts care to enlighten?


__utm.gif is the generic way that Google Analytics communicates back with its servers; there's no REST API.

So if you're doing things like event tracking, without counting the event as a page view, then you'll still see another request to __utm.gif.

They appear to have one normal _trackPageview call in the header, then one or more other calls coming from within minified JavaScript files.

Here's one:

    _gaq.push(["_trackEvent", e, t, n.toString(), parseInt(r, 10), !0])

This is the signature for _trackEvent:

    _trackEvent(category, action, opt_label, opt_value, opt_noninteraction)

!0 evaluates to true in JavaScript, so this event is marked as a non-interaction event, and thus doesn't count as an extra page view and won't be used in bounce-rate calculations.

Presuming the other calls are identical, then their reported bounce rate should be correct.
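To make the difference concrete, here is a minimal sketch of the same call with and without the fifth argument. The category, action, and label values are invented, and outside a browser the queue is just an array that nothing drains.

```javascript
// Classic GA async queue; on a real page ga.js drains it, here it is a plain array.
var _gaq = _gaq || [];

// Hypothetical category/action/label values.
// Interaction event: a visit that fires this is no longer counted as a bounce.
_gaq.push(["_trackEvent", "Article", "Scroll", "50%", 50]);

// Non-interaction event: fifth argument is true, so bounce rate is unaffected.
_gaq.push(["_trackEvent", "Article", "Scroll", "50%", 50, true]);

// Minifiers commonly rewrite true as !0, which is what the snippet above shows.
console.log(!0 === true);
```

If the site had shipped the first form at some point, that alone would pull the bounce rate down.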


Looking at the two additional requests, they do currently have the utmni parameter, marking them as non-interactive.
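For anyone who wants to repeat the check, a small sketch that summarizes a hit from its request URL follows. The parameter names (utmac, utmt, utmni, utme) are the classic GA ones; the sample URLs are invented and far shorter than real requests.

```javascript
// Summarize a classic GA hit from its __utm.gif request URL.
// utmac = account ID, utmt = hit type (absent for a page view),
// utmni = 1 marks a non-interaction event, utme = event / custom-variable data.
function describeHit(requestUrl) {
  const params = new URL(requestUrl).searchParams;
  return {
    account: params.get("utmac"),
    type: params.get("utmt") || "page",
    nonInteraction: params.get("utmni") === "1",
    hasEventData: params.has("utme"),
  };
}

// Made-up request URLs; real ones carry many more parameters.
const hits = [
  "http://www.google-analytics.com/__utm.gif?utmwv=5.4.3&utmac=UA-XXXXX-X&utmp=%2Fsome-page",
  "http://www.google-analytics.com/__utm.gif?utmwv=5.4.3&utmac=UA-XXXXX-X&utmt=event&utme=5(cat*act*label)(10)&utmni=1",
];
hits.map(describeHit).forEach((hit) => console.log(hit));
```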

However, the top comment references Google Analytics events on scroll that don't appear to be there now. One of the FastCo.Labs folks also made the following comment:

  This made me curious as well. It turns out there was a
  technical change made at around the same time that is
  going to account for some of it. We're running an
  experiment now and will update with results.
I don't see the update from him. Given all of this, I would guess that they were triggering events incorrectly and have since fixed it (the article is 3 months old).


That's extremely embarrassing.


Are you saying that as a hypothesis? Is there a link you can point us to where the code is inserted multiple times?


Agreed. Title should be "If it seems too good to be true (Jaw dropping improvement in bounce rate) then it probably is."


Since I became wedded to my Kindle for consuming journalism, thanks to superb sites such as http://longreads.org and http://longform.org *, my reading time has exploded, and I pay for services where possible as I am in need of their curation. Since using these sites I feel better informed about the world and I have a deep, deep respect for longform journalists. I'd love to see this stub idea extended to the Kindle being able to manage updates. It seems to be an exciting idea for readers and journalists alike, and I'd willingly pay a subscription.

*fixed url typo

Also - these sites can send articles directly to my Kindle email, or I use the Readability plugin to right-click on any url and have the contents delivered on demand.


Yup, I feel the same way. With Instapaper and my Kindle I'm spending a lot more time reading articles.

If you're interested, I recently started curating my picks here: http://esd.io/worthreading/ Some I got from longform/longreads, others I "discovered" on my own.


You made a typo for longform's URL

    s/longform.com/longform.org/


I love that they both have RSS feeds, both of which I am now following.


I love it that these guys are experimenting with content and analytics but the stats look contradictory.

Bounce rate is 1 - n_visits_with_more_than_one_event / n_visits.

The default GA setup doesn't register any events except page views so "more than one event" means more than one page view.

However the charts in this article showed that while the bounce rate fell, the average pages per visit stayed constant. For bounce rate to drop you'd expect pages per visit to rise... unless there's also a new event being fired.
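A toy calculation shows how that can happen. The hit shape and the traffic below are made up; the only rule taken from the definition above is that a visit with more than one interaction hit is not a bounce.

```javascript
// A visit is a list of hits.
// Hypothetical shape: { type: "page" | "event", nonInteraction?: boolean }.
// A visit bounces when it has at most one interaction hit (its landing page view).
function isBounce(visit) {
  const interactionHits = visit.filter((hit) => !hit.nonInteraction);
  return interactionHits.length <= 1;
}

function bounceRate(visits) {
  return visits.filter(isBounce).length / visits.length;
}

function pagesPerVisit(visits) {
  const pages = visits.reduce(
    (sum, visit) => sum + visit.filter((hit) => hit.type === "page").length,
    0
  );
  return pages / visits.length;
}

const page = { type: "page" };
const scroll = { type: "event" };

// Made-up traffic: three single-page visits and one two-page visit.
const before = [[page], [page], [page], [page, page]];
// Same visits, but a scroll event now fires in two of the single-page visits.
const after = [[page, scroll], [page, scroll], [page], [page, page]];

console.log(bounceRate(before), pagesPerVisit(before)); // 0.75 1.25
console.log(bounceRate(after), pagesPerVisit(after)); // 0.25 1.25
```

Pages per visit doesn't move at all, while the bounce rate falls from 75% to 25%.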

My guess (as suggested by some commenters here before me) is that they shipped some type of event which fires for a fraction of visitors - scrolling a certain distance down the page perhaps.

If this event fires, the visit won't (by default) count as a bounce. It will also affect time-on-site. Time on site is normally zero unless you visit another page or fire an event. In this case it's determined by the time of the last page view or event that Google registers. If they did ship an "on 400px scroll" event (for instance), then for each visitor for whom the event fires, the "time on site" will be registered as the time of the event.
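As a small worked example of that rule, assuming time on site is simply the last registered hit minus the first (the timestamps, in seconds since the visit started, are invented):

```javascript
// Time on site as described above: last registered hit minus first registered hit.
// Timestamps are seconds since the visit started (hypothetical data shape).
function timeOnSite(hitTimes) {
  if (hitTimes.length === 0) return 0;
  return Math.max(...hitTimes) - Math.min(...hitTimes);
}

console.log(timeOnSite([0])); // 0, even if the reader stayed ten minutes
console.log(timeOnSite([0, 45])); // 45, a scroll event fired 45 seconds in
console.log(timeOnSite([0, 45, 300])); // 300, a second page view after five minutes
```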

For my money, I think that bounce rates for content-driven sites should actually be set by a threshold time-on-site. Most visitors don't read more than one page, so page count is a very weak indicator of a "bounce". Someone could read for 10 minutes and still count as a bounce, which isn't a fair appraisal of a very successful article. The raw "bounce metric" makes sense for a transactional site where one page view == no revenue, but makes little sense for a site where one ten-minute read of an article is exactly what the author hopes for.
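One way to approximate a time-based threshold, sketched here under assumptions: fire an interaction event once the reader has stayed for a set number of seconds, so only shorter visits count as bounces. The function name, the event names, and the injectable scheduler are all invented for the sketch; `queue` stands in for the classic `_gaq` array.

```javascript
// Queue an interaction event once the reader has stayed for `seconds`.
// `queue` stands in for the classic _gaq array; `schedule` defaults to setTimeout
// and is injectable only so the sketch can be exercised without waiting.
function markEngagedAfter(queue, seconds, schedule = setTimeout) {
  return schedule(() => {
    // Hypothetical category/action names.
    queue.push(["_trackEvent", "Engagement", "time-on-page", String(seconds) + "s"]);
  }, seconds * 1000);
}

// On a page the call would be: markEngagedAfter(_gaq, 30);
const demoQueue = [];
markEngagedAfter(demoQueue, 30, (fn) => fn()); // fake scheduler runs immediately
console.log(demoQueue);
```

The threshold itself is a judgment call; 30 seconds is only a placeholder.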


A content site wants them to explore and traverse the rest of the site.

An article that gets read is a success for the article, not for the publisher.


Previous discussion of this item: https://news.ycombinator.com/item?id=5689157


There's obviously something wrong with the numbers. It's too dramatic a change, both in magnitude and in immediacy, for it to be due to a simple content tweak.

Pages/Visit is stable yet Bounce Rate suddenly dropped significantly.


It is critical to understand what you are measuring. I see misunderstanding constantly, particularly with GA. Were events being fired before? They are now. That alone could account for this behavior.

There are many nuances in regard to how metrics are measured in GA. It is often the case that people think it is measuring one thing, when it is measuring something completely different. I've posted this before, but this GitHub project explains/addresses some of it: https://github.com/rockymadden/gap


Is it necessary to change the URL with each new addition (while keeping that ID the same)? I didn't really get that part.

If so, is there any WordPress plugin that helps you do that, or how could you achieve that?


Having a different URL for each update would allow them to track readers' entry points into the article at a finer-grained level than if there was just one URL.

Not sure about plugins for WordPress, but I wouldn't be surprised if someone's already started working on one.

Regarding the multiple URLs, I wonder if it is more so that they can post what look like new stories that ultimately just lead back to these long-form story series.


The reference number seems to act as a permalink, so

http://www.fastcolabs.com/3009577/

leads to the original article, and I assume will continue to link to the article as they add content.
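A minimal sketch of how such a scheme could resolve URLs, assuming the server looks articles up by the numeric ID alone and treats the slug as decoration. The function names and the slug rule are guesses, not FastCo's actual code.

```javascript
// Split a path like /3009577/some-slug into its stable ID and its changeable slug.
function parseArticlePath(path) {
  const match = /^\/(\d+)(?:\/([^/]*))?\/?$/.exec(path);
  if (!match) return null;
  return { nodeId: match[1], slug: match[2] || "" };
}

// Hypothetical slug rule: lowercase, runs of non-alphanumerics collapsed to hyphens.
function slugify(headline) {
  return headline
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

console.log(parseArticlePath("/3009577/this-is-what-happens-when-publishers-invest-in-long-stories"));
console.log(parseArticlePath("/3009577/"));
console.log(slugify("This Is What Happens When Publishers Invest In Long Stories"));
```

Because only nodeId is used for the lookup, every old headline URL keeps working after the slug changes.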

I wonder how adding new content, rather than reorganising and re-editing the article, will work from the point of view of prose style.


But isn't that the same as the automatic redirect you have in WordPress when you change the URL of the post anyway? I'm trying to understand if there's any advantage to having this number stub over how WP redirects the URLs, but I don't think there is one. You're still going to get redirected links either way.


Even assuming the analytics are correct, what is the value in driving down bounce rate? Lower bounce rate doesn't inherently mean more revenue.

I can, of course, use my imagination. But I am more interested in reading why a lower bounce rate is the future for media revenue. The quote near the end from the Vox CEO is the best part of the article, but even it doesn't make the bounce = revenue connection explicit.

I will also add that, if those analytics are correct, that is a crazy-low bounce rate. A 50-60% bounce rate seems super strong to me. Dipping well below 50% seems unlikely for any site with significant traffic coming from disparate sources.


How exactly do they track if people actually read the entire article? Merely based on time?


You just made me realise why Google Glasses et al are such potential money makers.


Because unknown parties will see whatever you see?


They can verify that you saw their ad/content.


Google Analytics can track pages and average time spent on page. It can even track visitor flow.

Everyone should get an analytics account and see the insane amount of things that Google can track for YOU, for FREE with just three lines of extra code on your page.


It ONLY tracks the latter if they click on another page. A huge difference.


I actively block GA for that reason.


Nobody knows if you're actually reading the words, but based upon whether you scroll through the entire content, and possibly factoring in how quickly you scrolled, they could have an educated guess. If you didn't scroll at all, you probably read the first paragraph, if that. If you scrolled quickly to the bottom, you probably skim-read. If you scrolled to the bottom with a lot of pauses and it took minutes, you probably read the whole thing.
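That heuristic could be written down roughly as follows. Every threshold here is invented for illustration, and the input fields are hypothetical measurements a page script would have to collect.

```javascript
// Rough guess at reading behaviour from scroll data.
// All thresholds are made up for illustration.
function guessReadingBehaviour({ maxScrollFraction, secondsOnPage, pauses }) {
  if (maxScrollFraction < 0.1) return "first paragraph at most";
  if (maxScrollFraction >= 0.9 && secondsOnPage >= 120 && pauses >= 3) {
    return "probably read it all";
  }
  if (maxScrollFraction >= 0.9) return "probably skimmed";
  return "partial read";
}

console.log(guessReadingBehaviour({ maxScrollFraction: 0, secondsOnPage: 5, pauses: 0 }));
console.log(guessReadingBehaviour({ maxScrollFraction: 1, secondsOnPage: 10, pauses: 0 }));
console.log(guessReadingBehaviour({ maxScrollFraction: 1, secondsOnPage: 240, pauses: 6 }));
```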


There are lots of ways of structuring the code, but it's usually done by checking scroll depth.

If you want to add this to your own site, check out jQuery ScrollDepth: https://github.com/robflaherty/jquery-scrolldepth
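The arithmetic underneath is simple enough to sketch. This is not the plugin's API, just a hypothetical helper that reports which percentage milestones a reader has newly passed, given the scroll position and page dimensions in pixels.

```javascript
// Which depth milestones (in percent) has the reader newly passed?
// Not the jQuery ScrollDepth API, just the underlying arithmetic.
function newMilestones(scrollTop, viewportHeight, documentHeight, alreadyFired) {
  const depth = Math.min(100, ((scrollTop + viewportHeight) / documentHeight) * 100);
  return [25, 50, 75, 100].filter(
    (milestone) => depth >= milestone && !alreadyFired.includes(milestone)
  );
}

console.log(newMilestones(0, 800, 4000, [])); // depth 20 -> []
console.log(newMilestones(1400, 800, 4000, [])); // depth 55 -> [ 25, 50 ]
console.log(newMilestones(3200, 800, 4000, [25, 50])); // depth 100 -> [ 75, 100 ]
```

In a browser you would call it from a throttled scroll handler and send one event per new milestone.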


They should put an innocuous call to action at the very end of the story to help track this.


Quote: "So, it's like having many URLs and many headlines which lead back to the same big, multi-faceted article."

Translation: old wine in new bottles.


So much for ever-shortening attention spans.


Good riddance.

If this becomes a business case to bring back something closer to real journalism, it is a good day. The NSA/PRISM story is really well-suited to this format: connect the Guardian interview to the administration's concessions to the POTUS conference. That would have been more valuable to the public than the soap-opera story that developed around Snowden's asylum seeking.



