FORMA

Forest Monitoring in Clojure

Dan Hammer // @econohammer

Deforestation accounts for 12% of annual greenhouse gas emissions

Deforestation is accelerating

rate

... and dispersing

entropy

Any viable effort to curb deforestation rates will rely on open information

gfw

Everything

from raw data to analysis to mapping to website

is open source

Map across partitions

of a one-dimensional array

ndvi
ndvi-break
ndvi-break-2

(require '[incanter.core :as i])

(defn trend-mat
  [T]
  (i/bind-columns (repeat T 1) (range T)))

(defn pseudoinverse
  [x]
  (let [xt (i/trans x)]
    (i/mmult (i/solve (i/mmult xt x)) xt)))
			           

(defn grab-trend
  [pseudo-mat sub-ts]
    (second (i/mmult pseudo-mat sub-ts)))

(defn windowed-trend
  [block-len full-ts]
  (let [pseudo-mat (-> block-len trend-mat pseudoinverse)]
    (map (partial grab-trend pseudo-mat)
         (partition block-len 1 full-ts))))
			           

(defn short-stat
  [block-param ts]
  (->> (windowed-trend block-param ts)
       (reduce min)))


(facts
  (count ndvi) => 199
  (short-stat 24 ndvi) => -63.3346)
			           

Same framework is extensible to more complex operations on 1D arrays

for applications in information theory

(defn- ordinal-idx
  [sub-ts]
  (let [indexed-vector (map-indexed vector sub-ts)]
    (map first
         (sort-by second indexed-vector))))
			           

(defn permutation-count
  [D ts]
  (let [subs (partition D 1 ts)]
    (frequencies (map ordinal-idx subs))))

(fact
  (permutation-count 3 [ 8 7 8 9 9 8 5 7 6 5 ])
    => \{ (1 0 2) 1
          (0 1 2) 2
          (2 0 1) 1
          (2 1 0) 2
          (1 2 0) 1
          (0 2 1) 1 \})
			           

Map across partitions

of a two-dimensional array

raster
raster

(defn walk-matrix
  [m window]
  (mapcat (comp (partial apply map vector)
                (partial map (partial partition window 1)))
          (partition window 1 m)))
			           

      (def mat
        [[ 0    0    0    0    0    0    0    1    1    1 ]
         [ 0    0    0    1    0    0    0    0    0    1 ]
         [ 0    0    0    0    0    0    0    0    0    0 ]
         [ 0    0    0    0    0    0    0    0    0    0 ]
         [ 0    0    0    0    0    0    0    0    0    0 ]
         [ 0    0    0    0    0    0    0    1    1    0 ]
         [ 0    0    0    0    0    0    0    1    1    0 ]
         [ 1    0    0    0    0    0    0    1    0    0 ]
         [ 0    1    0    0    0    0    0    0    0    0 ]
         [ 1    1    0    0    0    0    0    0    0    0 ]])

(defn window-sum
  [sub-mat]
  (reduce + (flatten sub-mat)))
			           

(fact 
  (map window-sum
       (walk-matrix mat 3))

    => [ 0 1 1 1 0 1 2 4
         0 1 1 1 0 0 0 1
         0 0 0 0 0 0 0 0
         0 0 0 0 0 1 2 2
         0 0 0 0 0 2 4 4
         1 0 0 0 0 3 5 5
         2 1 0 0 0 2 3 3
         4 2 0 0 0 1 1 1 ])
			           

raster
raster

Map across partitions

of a three-dimensional array

cube

Map across partitions

of an n-dimensional array

joke

I guess it's possible...

Estimation

  1. Collect training data into a single vector, where each element \( z_i \) indicates the event of deforestation in pixel \( i \)


  2. Collect all characteristics of the time series spanning the training period for each pixel \( i \) in \( {\bf v}_{i} \).


  3. Estimate \( \beta \) according to the logistic model:

    \[ \mathbb{P}(z_{it} \hspace{2pt} | \hspace{2pt} {\bf v}_{it}; \beta) = \frac{e^{\beta{\bf v}_{it}}}{1 + e^{\beta{\bf v}_{it}}} \]

Using Cascalog

to collect time series attributes and classify pixels

solved our problems

(use 'cascalog.api)

(defn beta-gen
  "query to return the beta vector associated with each ecoregion"
  [label-src timeseries-src]
  (<- [?eco ?beta]
      (label-src ?pixel-id ?label)
      (timeseries-src ?pixel-id ?eco ?ndvi)
      (create-features ?ndvi :> ?feature-vec)
      (estimate-beta ?label ?feature-vec :> ?beta)))
			           

(defn prob-gen
  [label-src timeseries-src]
  (let [beta-src (beta-gen label-src timeseries-src)]
    (<- [?pixel-id ?prob]
        (beta-src ?eco ?beta)
        (timeseries-src ?pixel-id ?eco ?ndvi)
        (create-features ?ndvi :> ?feature-vec)
        (apply-beta ?feature-vec ?beta :> ?prob))))

(fact
  (prob-gen labels timeseries) => (produces [[10029102 0.56]
                                             [12323943 0.24]]))
			           

Loading the data into CartoDB

using cartodb-clj

(use '[cartodb.playground :only (insert-rows)])

(let [table-name "gfw"
      account-name "wri"]
  (apply insert-rows account-name creds table-name data))
			           

Thanks!

github.com/danhammer/forma-js