@ashaw
Last active January 3, 2016 04:19

Mozilla OpenNews Onboarding Hack Day

Make a news app. In 2 days.

Introduction

What makes it a news app?

  • Telling a story with software, and software that generates stories
  • Readers, not just users — help people find their story in larger stories
  • Impact

Apps generating stories: http://projects.propublica.org/schools/schools/63441005650#63441007350,63441007845,63441007352,63441001276,63441005602

http://shaw.al.s3.amazonaws.com/opennews/polltracker.png

Stories generating apps generating stories: https://speakerdeck.com/a_l/caf-seminar-quito-2013?slide=41

How to create a news app. The short version.

  1. Acquire data
  2. Clean/bulletproof data
    • Find stories/trends in data
    • Look for related datasets
    • Do additional reporting
  3. Import data
  4. Design and build app
    • Create graphics for app “lede”
  5. Deploy app

It’s doable!

The ProPublica Way

  • We’re all generalists (designer/developer/reporter)
  • Fakey Agile
  • We use tickets and deadlines on sprawling projects
  • An app has a “captain,” but pulls in others as necessary
  • Rigorous editorial and fact-checking process
  • Adaptive or responsive design, as the app requires, but we're not religious about it
  • Bylines!

Acquire data (in order of difficulty)

  • Find data on a public data site such as data.gov
  • Request data from an agency, and receive it in a usable format
  • Scrape data from a public website and store it in your favorite format
  • Request data from an agency, and transform it from a hostile format
  • Create your own dataset because the data you want does not exist

Overcoming data acquisition problems

  • Find the nerds, not the PR office
  • Look in the metadata
  • FOIA

Clean/bulletproof data

Key idea: Look for "preposterousness"

  • Counts and totals
  • Limits of Excel & MySQL
  • Absurd max/mins
  • Blanks vs. nulls
  • Misspellings
  • Data Types (ask for a record layout)
  • Bad geocoding, duplicate city names
  • Check against reports and hard copies
  • Call, don't assume
  • Do random spot checks

Jen LaFleur's guide to Bulletproofing https://github.com/propublica/guides/blob/master/data-bulletproofing.md
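
A few of the checks above can be sketched in plain Ruby. This is a minimal illustration, not a full bulletproofing pass: the sample data and column names (school, city, enrollment, ap_courses) are made up, and in practice the rows would come from the file you received.

```ruby
require "csv"

# Hypothetical sample data standing in for an agency's file.
raw = <<~DATA
  school,city,enrollment,ap_courses
  Lincoln High,Springfield,1200,14
  Lincoln High,Springfield,1200,14
  Roosevelt High,Sprngfield,,9
  Adams High,Shelbyville,0,3
  Grant High,Shelbyville,98000,7
DATA

rows = CSV.parse(raw, headers: true).map(&:to_h)

# Blanks vs. nulls: an empty field parses as nil, which is not the same as 0.
blank_enrollment = rows.select { |r| r["enrollment"].nil? }

# Absurd max/mins: look at the extremes before trusting any total.
enrollments = rows.map { |r| r["enrollment"] }.compact.map { |v| Integer(v, 10) }
min_max = enrollments.minmax

# Exact duplicate rows inflate counts and totals.
duplicates = rows.group_by(&:itself).select { |_, group| group.size > 1 }.keys

# Misspellings: values that appear only once in a column deserve a second look.
rare_cities = rows.map { |r| r["city"] }.tally.select { |_, n| n == 1 }.keys

puts "Rows with blank enrollment: #{blank_enrollment.size}"
puts "Enrollment min/max: #{min_max.inspect}"
puts "Duplicated rows: #{duplicates.size}"
puts "Cities seen only once: #{rare_cities.inspect}"
```

None of these flags proves an error. They tell you which rows to check against hard copies or to call about.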

Create an app!

rails new yourapp

If you use tabletop + Google Docs, be careful of Google’s arbitrary login walls.

Create your schema

  • Try to mimic the schema of the data you’re using
  • Find a record layout
  • Take note of decimal precision
  • ZIP codes are strings; some start with 0
  • Lat/longs don’t need 15 decimal places of precision; at that point you’re mapping atoms.
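
A quick sketch of why the last two points matter. The figure of roughly 111,000 meters per degree of latitude is an approximation, and the helper name is made up for illustration.

```ruby
# ZIP codes are identifiers, not quantities: integer conversion drops the leading zero.
zip = "02134"
zip_as_integer = zip.to_i

# Approximate length of one degree of latitude, in meters.
meters_per_degree = 111_000.0

# How much ground one unit in the last decimal place covers.
def resolution_in_meters(decimal_places, meters_per_degree)
  meters_per_degree / (10**decimal_places)
end

six_places = resolution_in_meters(6, meters_per_degree)
fifteen_places = resolution_in_meters(15, meters_per_degree)

puts "ZIP as string: #{zip}, as integer: #{zip_as_integer}"
puts "6 decimal places cover about #{six_places} m"
puts "15 decimal places cover about #{fifteen_places} m"
```

In a Rails migration this usually translates to a string column for the ZIP code and a decimal column with an explicit precision and scale for each coordinate.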

Example record layout: http://shaw.al.s3.amazonaws.com/opennews/nfhl-record-layout.png

Migrations: http://guides.rubyonrails.org/migrations.html

Import your data

Rake: http://rake.rubyforge.org/
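
A minimal sketch of the parsing half of an import, using only Ruby's CSV library. The column names and the clean_row and parse_import helpers are hypothetical. In a Rails app this logic would typically sit inside a Rake task that passes each hash to a model.

```ruby
require "csv"

# Convert one raw CSV row into a hash of typed attributes.
# Column names are hypothetical; match them to your record layout.
def clean_row(row)
  {
    name: row["name"].to_s.strip,
    # Keep the ZIP code as a string so leading zeros survive.
    zip: row["zip"].to_s.strip,
    # Integer() raises on junk such as "n/a" instead of silently returning 0.
    enrollment: row["enrollment"].nil? ? nil : Integer(row["enrollment"], 10)
  }
end

def parse_import(csv_text)
  CSV.parse(csv_text, headers: true).map { |row| clean_row(row) }
end

sample = <<~DATA
  name,zip,enrollment
  Lincoln High,02134,1200
  Roosevelt High,60614,
DATA

records = parse_import(sample)
records.each { |r| puts r.inspect }
```

Failing loudly on a bad value is a deliberate choice here: a crashed import is easier to notice than a column of quiet zeros.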

Keep track of your changes with git: http://git-scm.com/

Find related datasets

  • What can be joined on your dataset? Sometimes the news is in the join.

Examples:

Just because you can join doesn’t mean you should. Focused apps are better than sprawling ones.
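
A sketch of a join in plain Ruby, assuming two made-up datasets that share a district_id key. The join_on helper is hypothetical. It also returns the rows that found no match, because a failed join is often a data problem worth reporting out.

```ruby
# Hypothetical datasets that share a district id.
schools = [
  { district_id: "D1", school: "Lincoln High", ap_courses: 14 },
  { district_id: "D2", school: "Adams High", ap_courses: 3 },
  { district_id: "D9", school: "Grant High", ap_courses: 7 }
]
poverty = [
  { district_id: "D1", poverty_rate: 0.08 },
  { district_id: "D2", poverty_rate: 0.31 }
]

# Join each left row to the right row with the same key.
# Returns the joined rows and the left rows that had no match.
def join_on(left, right, key)
  index = right.to_h { |row| [row[key], row] }
  matched, unmatched = left.partition { |row| index.key?(row[key]) }
  joined = matched.map { |row| row.merge(index[row[key]]) }
  [joined, unmatched]
end

joined, unmatched = join_on(schools, poverty, :district_id)

joined.each { |row| puts row.inspect }
puts "No match for: #{unmatched.map { |row| row[:school] }.inspect}"
```

In a database the same idea is a JOIN on the shared column. Either way, count the unmatched rows before drawing conclusions.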

Find stories/trends in data, the app front page “lede”

  • The “far”
  • Look at maximums/minimums, clumps, outliers
  • Look at correlations
  • Geographic trends
  • Break down by states or brand
  • Let people sort
  • Show people what to look for in individual views
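
A rough sketch of a first pass for extremes, outliers and a per-state breakdown. The records are invented and the 1.5 standard deviation cutoff is an arbitrary assumption; tune it to your data.

```ruby
# Hypothetical records: one row per facility.
records = [
  { name: "A", state: "IL", rate: 4.0 },
  { name: "B", state: "IL", rate: 5.0 },
  { name: "C", state: "TX", rate: 6.0 },
  { name: "D", state: "TX", rate: 5.0 },
  { name: "E", state: "NY", rate: 30.0 }
]

def mean(values)
  values.sum / values.size.to_f
end

# Population standard deviation.
def std_dev(values)
  m = mean(values)
  Math.sqrt(values.sum { |v| (v - m)**2 } / values.size)
end

rates = records.map { |r| r[:rate] }

# Maximums and minimums.
lowest, highest = records.minmax_by { |r| r[:rate] }

# Outliers: anything more than `threshold` standard deviations from the mean.
threshold = 1.5
average = mean(rates)
spread = std_dev(rates)
outliers = records.select { |r| (r[:rate] - average).abs > threshold * spread }

# Break down by state, then sort so the highest average comes first.
by_state = records.group_by { |r| r[:state] }
                  .transform_values { |rows| mean(rows.map { |r| r[:rate] }) }
ranked = by_state.sort_by { |_, state_average| -state_average }

puts "Lowest: #{lowest[:name]}, highest: #{highest[:name]}"
puts "Outliers: #{outliers.map { |r| r[:name] }.inspect}"
ranked.each { |state, state_average| puts "#{state}: #{state_average}" }
```

An outlier is a lead, not a finding. Check whether it is a real extreme or a data entry error before it goes in the lede.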

Varieties of “far”

Inspiration: http://collection.marijerooze.nl/

Tools:

Design and Build app

Example:

def search
  # The "?" placeholder makes ActiveRecord quote the value, so no separate sanitize call is needed.
  drugs = Drug.where("name LIKE ?", "%#{params[:q]}%").limit(50)
  render :json => drugs.to_json
end
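
One caveat with the search action above: a bound parameter protects against SQL injection, but "%" and "_" typed by a reader still act as LIKE wildcards. Rails has a helper for this, sanitize_sql_like. The plain-Ruby function below is a hypothetical stand-in that shows the idea, assuming the database treats backslash as the LIKE escape character (some databases need an explicit ESCAPE clause).

```ruby
# Escape LIKE wildcards so user input is matched literally.
# Assumes backslash is the LIKE escape character in your database.
def escape_like(term, escape_char = "\\")
  pattern = /[#{Regexp.escape(escape_char)}%_]/
  term.gsub(pattern) { |ch| escape_char + ch }
end

query = "100%_pure"
escaped = escape_like(query)

puts "Raw input: #{query}"
puts "Escaped:   #{escaped}"
puts "Pattern:   %#{escaped}%"
```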

Deploying your app

Rough Schedule

Day 1

  • Introduction
  • Scrum/discussion of ideas
  • Claim specialties (1-2 people each). Possibles:
    • App/backend
    • Design/JS/CSS
    • Front page “lede” graphic/far view
    • Reporting, data analysis, guff-writing
    • Other specialties?
  • Working time
  • End of day scrum

Day 2

  • Scrum
  • Working time
  • Gather around lunchtime for editing/critique
  • Working time
  • Deployment — if we finish
  • Presentation