Projects/Where Does My Money Go/Store
From Open Knowledge Foundation
Contents |
WDMMG Store (DB/Backend)
Introduction
With WDMMG we face a significant challenge:
- We receive data from many sources
- The data is often partially aggregated
- For us to do something useful it (e.g. visualize it) we want to present it in a standard, structured way
Examples
- We want regional expenditure on building hospitals
- PESA gives total expenditure by service area, department and region but usually not by all 3.
Use Cases
A Basic Model
See also:
- Rufus' post in Oct 2009: http://lists.okfn.org/pipermail/okfn-help/2009-October/000361.html
- Alistair's post in Feb 2010: http://lists.okfn.org/pipermail/wdmmg-discuss/2010-February/000038.html
Important properties:
- No "data loss" condition (be able to reconstruct any dataset we enter after we enter it)
- Support for "complete" and "non-overlapping" datasets
The Model
File:P$$Where Does My Money Go$$Store$domain model.png
We base ourselves on standard accounting model (see below). Core objects are:
- Account (id, title, parent_id)
- Transactions: (id, date, notes)
- Posting: id, account_id, transaction_id, currency, amount
For short-hand the Account-Transaction-Posting (double-entry) Model will frequently be abbreviated ATP.
We shall allow arbitrary metadata to be added to this object and others using a keyvalue object:
- KeyValue: (namespace, object_id, key, value). Value will be a json-encoded string.
Source Information
It is essential that we record the "source" for each piece of information in the system. Strictly source can be (at least) two things here:
- "Who" entered the data
- "Where" (they claimed) it came from in terms of some source dataset (e.g. PESA, CRA ...)
For item 2 see next item on Slices.
Slices
It is often important to record that a given set of ATP are "non-overlapping" (and possibly also "complete"). To do this we shall allow ATP items to be associated to a "Slice". Note that any Account of Transaction may be associated with 1 Slice and one Slice only.
Slice attributes:
- complete: enumeration (y,n -- more maybe to come)
- non_overlapping: enumeration (ditto)
- notes
Classifiers: Tags, Categories
Given that we have arbitrary metadata it is not clear whether we need separate classifiers.
Classifiers would attach to Accounts and, possibly, transactions. A classifier would have following properties:
- account_id (transaction_id)
- name
- description
Miscellaneous: e.g. User
User interface/API
- User
Open questions
- Versioning and permissions
- Handling duplicates
Standard Accounting
'Objects':
- Accounts: representing places money go to and from
- Transactions (possibly split) which represent the transfers between accounts. Often subdivided into (subdivision allows e.g. split transactions):
- Transaction (aka Journal): a single unit of of account work (a given "transaction") made up of 2 or more postings
- Posting (aka Ledger): a single transfer to a single account
More info:
- http://homepages.tcp.co.uk/~m-wigley/gc_wp_ded.html
- good
- covers representation in sql db
- journal, posting, account, asset type (£,$,share X ...)
- http://www.accountingunplugged.com/2008/09/23/accounting-structure-quick-reference/
- good
- uses terminology of ledger, transaction and accounts
- http://c2.com/cgi/wiki?AccountingModeling
- http://c2.com/cgi/wiki?DoubleEntryBookkeeping
- also good though rather unorganized
From PESA
Based on an analysis of PESA data have the following 'objects':
- Entities: Central Government, Local Government, Public Corporations
- Central Government splits into depts
- Expenditure types:
- By service/function and subfunctions (using COFOG UN Classifications of Functions of Government mainly)
- By region/country
- Spending/Accounting limits for entities:
- DEL: Dept Exp Limit
- AME: Annually Managed Expenditure
- TME: total managed expenditure
- Procurement spending (what is this?) and many more ...
- Time (which period is this for ...)
Technologies
What technologies do we use in building the store?