3 | {% blocktrans trimmed %}
4 | This graph was obtained using all the documents available on the official website
5 | of Portuguese law, under Series I. Currently there are about 200,000 of them.
6 | {% endblocktrans %}
7 |
8 |
9 | {% blocktrans trimmed %}
10 | For simplicity, we define "Law" to be anything that is "officially published in Series I
11 | of Diário da República".
12 | {% endblocktrans %}
13 |
14 |
--------------------------------------------------------------------------------
/main/tools/set_up.py:
--------------------------------------------------------------------------------
def set_up_django_environment(settings):
    """Configure a minimal Django environment for standalone scripts.

    ``settings`` is the dotted path of the settings module to use.
    """
    import os
    import sys
    import django

    # Add the project root (two directories up) to the import path so that
    # the project's apps (e.g. 'contracts') can be imported.
    path = os.path.abspath(os.path.join(os.path.dirname(__file__), '../../'))
    if path not in sys.path:
        sys.path.insert(1, path)
    del path

    # Point Django at the requested settings module.
    os.environ['DJANGO_SETTINGS_MODULE'] = settings

    django.setup()
17 |
--------------------------------------------------------------------------------
/main/apache/wsgi.py:
--------------------------------------------------------------------------------
import os
import sys

# path to the directory of the .wsgi file ('[directory]/')
wsgi_dir = os.path.abspath(os.path.dirname(__file__))

# path to the project root directory
project_dir = os.path.dirname(wsgi_dir)

# add the project directory to the system path
sys.path.append(project_dir)
sys.path.append(project_dir+"/main")

os.environ['DJANGO_SETTINGS_MODULE'] = 'main.settings'
from django.core.wsgi import get_wsgi_application
application = get_wsgi_application()
17 |
--------------------------------------------------------------------------------
/law/templates/law/law_list/search_law.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 |
3 | {% include "contracts/entity_list/search_entities.html" %}
4 | {% if entities %}
5 | {% include "pagination.html" with items=entities%}
6 | {% endif %}
7 |
8 |
9 | {% for entity in entities %}
10 | {% include "contracts/entity_view/tab_costumers/costumer_summary.html" %}
11 | {% endfor %}
12 |
13 | {% if entities %}
14 |
15 | {% include "pagination.html" with items=entities %}
16 |
17 | {% endif %}
18 |
19 |
--------------------------------------------------------------------------------
/law/indexes.py:
--------------------------------------------------------------------------------
from django.db.models import Value, F, TextField
from django.db.models.functions import Concat

from sphinxql import indexes, fields
from .models import Document


class DocumentIndex(indexes.Index):
    name = fields.Text(Concat(F('type__name'), Value(' '), F('number'),
                              output_field=TextField()))
    summary = fields.Text('summary')
    text = fields.Text('text')

    class Meta:
        model = Document
        query = Document.objects.exclude(dr_series='II')
        range_step = 10000
18 |
--------------------------------------------------------------------------------
/law/tools/eu_impact.py:
--------------------------------------------------------------------------------
if __name__ == "__main__":
    # set up Django with the public settings to access the database
    from main.tools import set_up
    set_up.set_up_django_environment('main.tools.settings_for_script')

    from law.models import Document
    import datetime

    # print the percentage of documents dated after 1986-01-01 that have no text
    date = datetime.date(1986, 1, 1)

    documents = Document.objects.filter(date__gt=date)

    total = documents.count()
    actual = documents.filter(text=None).count()

    print(actual*1./total*100)
17 |
--------------------------------------------------------------------------------
/main/templates/500.html:
--------------------------------------------------------------------------------
1 | {% extends "base.html" %}
2 | {% block content %}
3 |
500: Server error
4 |
5 | Welcome to the 500 page of this website. You arrived at this page because you found (and caused) a server error.
6 |
7 |
8 | We apologize for what happened. We received an error report, and we will do our best
9 | to have the error fixed as quickly as possible.
10 |
11 |
12 | Thank you! This is an efficient (even if unpleasant for you) way of finding errors on this website.
13 |
13 | {% endblock %}
14 |
--------------------------------------------------------------------------------
/docs/deputies_api/party.rst:
--------------------------------------------------------------------------------
1 | Party
2 | =====
3 |
4 | .. _parliament website: http://www.parlamento.pt
5 |
6 | This document provides the API references for the parties in the database.
7 |
8 | .. currentmodule:: deputies
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Party
14 |
15 | A party, formally known as a Parliamentary Group, is required for a deputy to have a mandate in the parliament.
16 | In this database it is just a category of the mandate.
17 |
18 | All fields of this model are retrieved from `parliament website`_ using a crawler.
19 |
20 | .. attribute:: abbrev
21 |
22 | The abbreviated name of the party.
23 |
24 | .. method:: get_absolute_url
25 |
26 | The url for its view on the website.
27 |
--------------------------------------------------------------------------------
/contracts/tools/example.py:
--------------------------------------------------------------------------------
# these first two lines are used to set up a minimal Django environment
from main.tools import set_up
set_up.set_up_django_environment('main.settings_for_script')
# From this point on, we are ready to use Django for accessing the remote database.

from contracts import models
from django.db.models import Sum

# this query asks for all contracts in the database
all_contracts = models.Contract.objects.all()

# this query counts the results of the previous query
number_of_contracts = all_contracts.count()
print(number_of_contracts)

# this query sums the prices of all contracts.
total_price = all_contracts.aggregate(sum=Sum('price'))['sum']
print(total_price)
19 |
--------------------------------------------------------------------------------
/travis_test.sh:
--------------------------------------------------------------------------------
#!/bin/sh
set -e

# $CONFIGURATION is quoted so the tests do not fail with a syntax error when it is unset
if [ "$CONFIGURATION" = "API" ]
then
    # install
    pip install -r api_requirements.txt

    # script
    python -m contracts.tools.example

elif [ "$CONFIGURATION" = "WEBSITE" ]
then
    # install
    pip install -r website_requirements.txt

    # script
    python -m contracts.tools.example
    # todo: add test to `python manage.py runserver`

elif [ "$CONFIGURATION" = "PRODUCTION" ]
then
    # install
    pip install -r production_requirements.txt

    # script
    python -m contracts.tools.example
    mkdir cached_html
    PYTHONPATH=$PYTHONPATH:`pwd` coverage run `which django-admin.py` test --settings=main.settings_test law.test contracts.test
fi
31 |
--------------------------------------------------------------------------------
/law/analysis/__init__.py:
--------------------------------------------------------------------------------
from main.analysis import AnalysisManager, Analysis

from law.analysis.analysis import get_documents_time_series,\
    get_eu_impact_time_series,\
    get_types_time_series


_allAnalysis = [
    Analysis('law_count_time_series', get_documents_time_series),
    Analysis('law_eu_impact_time_series', get_eu_impact_time_series),
    Analysis('law_types_time_series', get_types_time_series)]

# maps each analysis name to its primary key
ANALYSIS = {'law_count_time_series': 1,
            'law_eu_impact_time_series': 2,
            'law_types_time_series': 3}

# inverse map: primary key -> analysis name
PRIMARY_KEY = dict()
for k, v in ANALYSIS.items():
    PRIMARY_KEY[v] = k

analysis_manager = AnalysisManager()
for x in _allAnalysis:
    analysis_manager.register(x, primary_key=ANALYSIS[x.name])
24 |
--------------------------------------------------------------------------------
/law/templates/law/analysis/laws_time_series.html:
--------------------------------------------------------------------------------
1 | {% extends "law/analysis/base.html" %}
2 | {% load i18n %}
3 | {% load static %}
4 |
5 | {% block analysis_content %}
6 |
{% blocktrans trimmed %}
7 | We picked all documents published in Series I of "Diário da República online" and counted how many
8 | were published in each year. Here is the result:
9 | {% endblocktrans %}
10 |
18 |
--------------------------------------------------------------------------------
/docs/api.rst:
--------------------------------------------------------------------------------
1 | API Reference
2 | =============
3 |
4 | This part of the documentation focuses on the API for accessing and using the database.
5 | It documents what objects the database contains, and how you can interact with them.
6 |
7 | Publics uses Django ORM to build, maintain and query a Postgres database.
8 | This has two immediate consequences:
9 |
10 | 1. If you don't know Python/Django but you know SQL, you can access it remotely (see :doc:`tools/database`).
11 | 2. If you don't know SQL, but you know Python and/or Django,
12 | you can take advantage of this API (see :doc:`/usage`)
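
The SQL route can be sketched as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the remote Postgres database, and the table name ``contract`` and its ``price`` column are hypothetical, chosen to mirror the count and sum queries of ``contracts/tools/example.py``.

```python
import sqlite3

# In-memory stand-in for the remote database (an assumption of this sketch).
connection = sqlite3.connect(':memory:')
# Hypothetical table and column names, with made-up prices.
connection.execute('CREATE TABLE contract (id INTEGER PRIMARY KEY, price INTEGER)')
connection.executemany('INSERT INTO contract (price) VALUES (?)',
                       [(100,), (250,), (50,)])

# The same two questions asked through the ORM in the example script:
# how many contracts are there, and what is their total price?
number_of_contracts, total_price = connection.execute(
    'SELECT COUNT(*), SUM(price) FROM contract').fetchone()
print(number_of_contracts, total_price)
```

Against the real database, only the connection and the table and column names would change; the query itself is plain SQL.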
13 |
14 | .. toctree::
15 | :maxdepth: 1
16 |
17 | contracts_api/contract
18 | contracts_api/entity
19 | contracts_api/category
20 | deputies_api/legislature
21 | deputies_api/deputy
22 | deputies_api/mandate
23 | deputies_api/party
24 |
--------------------------------------------------------------------------------
/docs/deputies_api/legislature.rst:
--------------------------------------------------------------------------------
1 | Legislature
2 | ===========
3 |
4 | .. _parliament website: http://www.parlamento.pt
5 |
6 | This document provides the API references for the legislatures in the database.
7 |
8 | .. currentmodule:: deputies
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Legislature
14 |
15 | A legislature is a time-interval between two elections.
16 |
17 | All the fields of this model are retrieved from `parliament website`_ using a crawler.
18 |
19 | .. attribute:: number
20 |
21 | The official number of the legislature.
22 |
23 | .. attribute:: date_start
24 |
25 | The date corresponding to the beginning of the legislature.
26 |
27 | .. attribute:: date_end
28 |
29 | The date corresponding to the end of the legislature.
30 | It can be null when the legislature is ongoing.
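
A minimal sketch of how the null ``date_end`` might be handled in client code. The class ``LegislatureRecord`` is a hypothetical plain-Python mirror of the three fields above, not the Django model itself, and treating ``date_end`` as inclusive is an assumption.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class LegislatureRecord:
    """Hypothetical plain-Python mirror of the fields documented above."""
    number: int
    date_start: datetime.date
    date_end: Optional[datetime.date] = None  # None: the legislature is ongoing

    def is_ongoing(self):
        return self.date_end is None

    def contains(self, day):
        """True if ``day`` falls within this legislature (end date inclusive)."""
        if day < self.date_start:
            return False
        return self.date_end is None or day <= self.date_end
```

An ongoing legislature therefore contains every day from ``date_start`` onwards.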
31 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/tender_list/tender_summary.html:
--------------------------------------------------------------------------------
1 | {% extends "summary.html" %}
2 | {% load i18n %}
3 | {% block summary_heading %}
4 | {{ tender.description }} (BASE)
5 | {% endblock %}
6 | {% block summary_body %}
7 |
8 | {% include "contracts/tender_list/colored_price.html" with price=tender.price %}
9 |
10 |
11 | {{tender.publication_date}} ({% trans "deadline" %}: {{tender.deadline_date}})
12 |
13 |
14 | {% for contractor in tender.contractors.all %}
15 | {% include "contracts/entity_inline.html" with entity=contractor %}
16 | {% endfor %}
17 | {% endblock %}
18 |
--------------------------------------------------------------------------------
/main/templates/404.html:
--------------------------------------------------------------------------------
1 | {% extends "base.html" %}
2 | {% block content %}
3 |
404: Page not found
4 |
5 | Welcome to the 404 page of this website. You arrived at this page because, whatever you did,
6 | you requested a URL that does not exist on the website.
7 |
8 |
9 | We apologize that you ended up here. If this was caused by a normal action on the website (e.g.
10 | clicking a link), we would ask you to tell us where you clicked (what brought you to this page)
11 | through our mailing list.
12 |
13 |
14 | Your feedback is important so that this page does not appear again to you or to other users.
15 |
16 |
17 | Thank you!
18 |
19 | {% endblock %}
20 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/contract_list/selector_template.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 | {% for field in form %}
3 |
29 | {% include "deputies/deputy_list_template.html"%}
30 |
31 | {% endblock %}
32 |
--------------------------------------------------------------------------------
/contracts/analysis/__init__.py:
--------------------------------------------------------------------------------
from main.analysis import Analysis, AnalysisManager

from contracts.analysis.analysis import *


_allAnalysis = [
    Analysis('municipalities_contracts_time_series',
             municipalities_contracts_time_series),
    Analysis('excluding_municipalities_contracts_time_series',
             exclude_municipalities_contracts_time_series),
    Analysis('municipalities_procedure_types_time_series',
             municipalities_procedure_types_time_series),
    Analysis('procedure_type_time_series', procedure_type_time_series),
    Analysis('contracts_time_series', contracts_price_time_series),
    Analysis('contracts_statistics', contracts_statistics),
    Analysis('contracts_price_distribution', contracts_price_histogram),
    Analysis('ministries_contracts_time_series', ministries_contracts_time_series),
    Analysis('entities_values_distribution', entities_values_histogram),
    Analysis('contracted_lorenz_curve', contracted_lorenz_curve),
    Analysis('municipalities_ranking', municipalities_ranking),
]


analysis_manager = AnalysisManager()
for x in _allAnalysis:
    analysis_manager.register(x)
27 |
--------------------------------------------------------------------------------
/main/templates/base/search_js.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 |
37 |
--------------------------------------------------------------------------------
/docs/introduction.rst:
--------------------------------------------------------------------------------
1 | Publics
2 | =======
3 |
4 | .. _website: http://publicos.pt
5 | .. _parliament: http://parlamento.pt
6 | .. _law: http://dre.pt
7 | .. _Base: http://www.base.gov.pt/base2
8 |
9 | This place documents the backend of our website_.
10 |
11 | This backend provides an interface to access three distinct Portuguese public
12 | databases:
13 |
14 | - Public procurements (Base_)
15 | - MPs and parliament procedures (parliament_)
16 | - Law (law_)
17 |
18 | We build and maintain an open source website and API for querying these databases.
19 | Specifically, this project consists of three components:
20 |
21 | - A database in Postgres, driven by Django ORM and remotely accessible.
22 | - An API for querying the database using Django and Python.
23 | - A website for visualizing the database and sharing statistical features of it.
24 |
25 | How can you use it?
26 | -------------------
27 |
28 | To navigate in the database and discover some of its features, you can
29 | visit the website_.
30 |
31 | .. _GitHub: https://github.com/jorgecarleitao/public-contracts
32 |
33 | To use the API, e.g. to :doc:`ask your own questions to the database `,
34 | you must first :doc:`install it `.
35 |
36 | To contribute to this API and/or website, see section :doc:`website/website`.
37 |
--------------------------------------------------------------------------------
/main/templates/pagination.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 | {% if items.paginator.num_pages != 1 %}
3 |
4 | {% if items.has_previous %}
5 | {% if items.previous_page_number != 1 %}
6 |
35 |
36 | {% endblock %}
37 |
--------------------------------------------------------------------------------
/deputies/views_data.py:
--------------------------------------------------------------------------------
import json

from django.http import Http404, HttpResponse
from django.utils.translation import ugettext as _

from .analysis import analysis_manager


def deputies_time_distribution_json(request):
    data = analysis_manager.get_analysis('deputies_time_distribution')

    time_series = {'values': [], 'key': _('time in office')}
    for x in data:
        time_series['values'].append({'from': x['min'], 'value': x['count']})

    return HttpResponse(json.dumps([time_series]), content_type="application/json")


def mandates_distribution_json(request):
    data = analysis_manager.get_analysis('mandates_distribution')

    histogram = {'values': [], 'key': _('deputies')}
    for x in data:
        histogram['values'].append({'mandates': x['mandates'], 'value': x['count']})

    return HttpResponse(json.dumps([histogram]), content_type="application/json")


# maps the name used in the URL to the view that serves its data
AVAILABLE_VIEWS = {
    'deputies-time-distribution-json': deputies_time_distribution_json,
    'mandates-distribution-json': mandates_distribution_json
}


def analysis_selector(request, analysis_name):
    if analysis_name not in AVAILABLE_VIEWS:
        raise Http404

    return AVAILABLE_VIEWS[analysis_name](request)
40 |
--------------------------------------------------------------------------------
/main/templates/base/search_form.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 |
21 |
--------------------------------------------------------------------------------
/contracts/test/test_signals.py:
--------------------------------------------------------------------------------
from datetime import datetime

import django.test

from contracts.models import Contract, Entity


class InvalidateEntityDataTestCase(django.test.TestCase):

    def setUp(self):
        self.c = Contract.objects.create(
            base_id=1, contract_description='da', price=100,
            added_date=datetime(year=2003, month=1, day=1))
        self.e1 = Entity.objects.create(name='test1', base_id=1, nif='nif')
        self.e2 = Entity.objects.create(name='test2', base_id=2, nif='nif')

        self.c.contractors.add(self.e1)
        self.c.contracted.add(self.e2)

    def test_add(self):
        self.assertFalse(self.e1.data.is_updated)
        self.assertFalse(self.e2.data.is_updated)

    def test_compute_data(self):
        self.e1.compute_data()
        self.e2.compute_data()
        self.assertTrue(self.e1.data.is_updated)
        self.assertTrue(self.e2.data.is_updated)

    def test_delete(self):
        self.e1.compute_data()
        self.e2.compute_data()
        self.c.delete()

        # reload the entities from the database after the contract is deleted
        self.e1 = Entity.objects.get(id=self.e1.id)
        self.e2 = Entity.objects.get(id=self.e2.id)

        self.assertFalse(self.e1.data.is_updated)
        self.assertFalse(self.e2.data.is_updated)
40 |
--------------------------------------------------------------------------------
/deputies/urls.py:
--------------------------------------------------------------------------------
from django.conf.urls import patterns, url
from django.utils.translation import ugettext_lazy as _

from . import views
from . import views_data


urlpatterns = patterns('',
    url(r'^%s/%s$' % (_('deputies'), _('home')), views.home, name='deputies_home'),
    url(r'^%s$' % _('deputies'), views.deputies_list, name='deputies_deputies'),
    url(r'^%s$' % _('parties'), views.parties_list, name='deputies_parties'),
    url(r'^%s/(\d+)$' % _('party'), views.party_view, name='party_view'),

    url(r'^%s/%s$' % (_('deputies'), _('analysis')),
        views.analysis_list, name='deputies_analysis'),

    url(r'^%s/%s/(\d+)/(\w+)' % (_('deputies'), _('analysis')),
        views.analysis, name='deputies_analysis_selector'),
    url(r'^%s/%s/(\d+)' % (_('deputies'), _('analysis')),
        views.analysis, name='deputies_analysis_internal_selector',
        ),

    url(r'^%s/%s/([-\w]+)/data' % (_('deputies'), _('analysis')),
        views_data.analysis_selector, name='deputies_data_selector',
        ),
)
27 |
--------------------------------------------------------------------------------
/docs/deputies_api/mandate.rst:
--------------------------------------------------------------------------------
1 | Mandate
2 | =======
3 |
4 | .. _parliament website: http://www.parlamento.pt
5 |
6 | This document provides the API references for the mandates in the database.
7 |
8 | .. currentmodule:: deputies
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Mandate
14 |
15 | A mandate is the time interval during which a deputy holds a seat in the parliament.
16 | A mandate always requires a district and a party.
17 |
18 | All the fields of this model are retrieved from `parliament website`_ using a crawler.
19 |
20 | .. attribute:: deputy
21 |
22 | The :class:`models.Deputy` of the mandate.
23 |
24 | .. attribute:: legislature
25 |
26 | The :class:`models.Legislature` of the mandate.
27 |
28 | .. attribute:: party
29 |
30 | The parliamentary group this mandate belongs to. A ForeignKey to :class:`models.Party`.
31 |
32 | .. attribute:: district
33 |
34 | The district this mandate belongs to. This is a ForeignKey to :class:`models.District`.
35 |
36 | The next two fields are required because some mandates end before the legislature ends.
37 |
38 | .. attribute:: date_start
39 |
40 | The date corresponding to the beginning of the mandate.
41 |
42 | .. attribute:: date_end
43 |
44 | The date corresponding to the end of the mandate.
45 | It can be null when the mandate is ongoing.
46 |
--------------------------------------------------------------------------------
/contracts/indexes.py:
--------------------------------------------------------------------------------
from sphinxql import indexes, fields
from sphinxql.manager import IndexManager

from .models import Entity, Contract, Tender


class EntityIndex(indexes.Index):
    name = fields.Text('name')

    class Meta:
        model = Entity


class Manager(IndexManager):
    def get_queryset(self):
        # order by signing date, most recent first, with undated contracts last
        return super(Manager, self).get_queryset()\
            .extra(select={'signing_date_is_null': 'signing_date IS NULL'},
                   order_by=['signing_date_is_null', '-signing_date'])


class ContractIndex(indexes.Index):
    name = fields.Text('description')
    description = fields.Text('contract_description')

    signing_date = fields.Date('signing_date')

    category_id = fields.Integer('category')

    category = fields.Text('category__description_pt')

    district = fields.Text('district__name')
    council = fields.Text('council__name')

    objects = Manager()

    class Meta:
        model = Contract
        query = Contract.default_objects.all()


class TenderIndex(indexes.Index):
    description = fields.Text('description')

    category = fields.Text('category__description_pt')

    publication_date = fields.Date('publication_date')

    class Meta:
        model = Tender
50 |
--------------------------------------------------------------------------------
/docs/deputies_api/deputy.rst:
--------------------------------------------------------------------------------
1 | Deputy
2 | ======
3 |
4 | .. _parliament website: http://www.parlamento.pt
5 |
6 | This document provides the API references for the deputies in the database.
7 |
8 | .. currentmodule:: deputies
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Deputy
14 |
15 | A deputy is a person that at some point was part of the parliament.
16 |
17 | All the fields of this model are retrieved from `parliament website`_ using a crawler.
18 |
19 | .. attribute:: name
20 |
21 | The name of the person.
22 |
23 | .. attribute:: birthday
24 |
25 | Birthday of the person. May be null if it is not in the official database.
26 |
27 | .. attribute:: party
28 |
29 | A foreign key to the party the deputy belongs to. This is a cached value; the correct
30 | value is always obtained from the mandate the deputy is in.
31 |
32 | .. attribute:: is_active
33 |
34 | A bool telling whether the deputy is active or not. This is a cached value; the correct
35 | value is always obtained from the last mandate of the deputy.
36 |
37 | .. method:: get_absolute_url()
38 |
39 | Returns the url of this entity in `parliament website`_.
40 |
41 | .. method:: update()
42 |
43 | Updates the :attr:`party` and :attr:`is_active` using the deputy's last mandate information.
44 | This only has to be called when the deputy has a new mandate.
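
The relation between the cached fields and the mandates can be sketched in plain Python. The classes ``MandateRecord`` and ``DeputyRecord`` are hypothetical stand-ins for the Django models, and two rules are assumptions of this sketch rather than documented behaviour: the "last" mandate is the one with the latest ``date_start``, and a deputy is active when that mandate has no ``date_end``.

```python
import datetime
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MandateRecord:
    """Hypothetical stand-in for a mandate."""
    party: str
    date_start: datetime.date
    date_end: Optional[datetime.date] = None  # None: the mandate is ongoing


@dataclass
class DeputyRecord:
    """Hypothetical stand-in for a deputy with its cached fields."""
    name: str
    mandates: List[MandateRecord] = field(default_factory=list)
    party: Optional[str] = None  # cached from the last mandate
    is_active: bool = False      # cached from the last mandate

    def update(self):
        """Refresh the cached fields from the most recent mandate."""
        if not self.mandates:
            self.party, self.is_active = None, False
            return
        # Assumption: "last" means the mandate that started most recently.
        last = max(self.mandates, key=lambda m: m.date_start)
        self.party = last.party
        # Assumption: an ongoing mandate (no end date) means an active deputy.
        self.is_active = last.date_end is None
```

Because the values are cached, they stay stale until ``update()`` is called again, which is why it has to be called whenever a new mandate is added.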
45 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/analysis/contracts_price_histogram/main.html:
--------------------------------------------------------------------------------
1 | {% extends "contracts/base.html" %}
2 | {% load i18n %}
3 | {% load contracts.humanize %}
4 | {% block title %}{% trans "Distribution of contract prices"%}{% endblock %}
5 | {% block content %}
6 |
7 |
{% trans "How many contracts are priced as X Euros?"%}
8 |
9 | {% blocktrans trimmed with count=count price=price|intword %}
10 | Portugal's official database has today {{ count }} contracts, with a total value of {{ price }} €.
11 | In this analysis, we asked:
12 | {% endblocktrans %}
13 |
14 |
{% trans "What is the distribution of prices of these contracts?" %}
15 | {% include "contracts/analysis/contracts_price_histogram/graph.html" %}
16 |
17 | {% blocktrans trimmed %}
18 | The graph represents the number of contracts (y-axis) within a price range (x-axis).
19 | We point out two characteristics of the distribution:
20 | {% endblocktrans %}
21 |
22 |
23 |
24 | {% trans "The maximum is around 5000€-10.000€" %}
25 |
26 |
27 | {% trans "The distribution is broad, with contracts ranging from a few euros to millions of euros." %}
28 |
29 |
30 |
31 | {% endblock %}
32 |
--------------------------------------------------------------------------------
/contracts/management/commands/cache_contracts.py:
--------------------------------------------------------------------------------
from django.core.management.base import BaseCommand

from contracts.analysis import analysis_manager
from contracts.models import Entity, Category


class Command(BaseCommand):
    help = 'Computes and stores intermediary results (analysis, etc.).'

    def add_arguments(self, parser):
        parser.add_argument(
            '--all',
            action='store_true',
            help='Equivalent to adding all options below.')

        parser.add_argument(
            '--entities',
            action='store_true',
            help='Recompute cache of entities.')

        parser.add_argument(
            '--categories',
            action='store_true',
            help='Recompute cache of categories.')

        parser.add_argument(
            '--analysis',
            action='store_true',
            help='Recompute cache of analysis.')

    def handle(self, **options):
        if options['entities'] or options['all']:
            # only entities whose cached data is out of date are recomputed
            for entity in Entity.objects.filter(data__is_updated=False):
                entity.compute_data()

        if options['categories'] or options['all']:
            for category in Category.objects.all():
                category.compute_data()

        if options['analysis'] or options['all']:
            for analysis in list(analysis_manager.values()):
                analysis.update()
43 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/analysis/entities_values_histogram/main.html:
--------------------------------------------------------------------------------
1 | {% extends "contracts/base.html" %}
2 | {% load i18n %}
3 | {% load contracts.humanize %}
4 | {% block title %}{% trans "Distribution of earnings of entities"%}{% endblock %}
5 | {% block content %}
6 |
7 |
{% trans "How many entities earned/spent X Euros?"%}
8 |
9 | The official database of public contracts currently contains thousands of contracts and entities.
10 | In this analysis we want to know the distribution of earnings and expenses of the entities, i.e.,
11 | how many entities earned/spent X euros:
12 |
13 | {% include "contracts/analysis/entities_values_histogram/graph.html" %}
14 |
15 | The graph contains all entities to date (updated daily)
16 | and was built by computing the total earnings/expenses of each entity.
17 | The bars correspond to the number of entities whose earnings/expenses amount to the value in €
18 | shown on each bar (i.e. a histogram).
19 |
20 |
21 |
22 | Entities that both spent and earned are counted in both graphs.
23 |
24 |
25 | Note that the values on the x-axis are on an exponential scale.
26 |
27 |
28 | You can filter the graphs by clicking on the respective circles.
29 |
{% trans "When do Portuguese ministries contract most?" %}
8 |
9 | {% blocktrans trimmed with url="http://www.base.gov.pt/base2/" %}
10 | By European law, Portuguese public entities must publish their contracts in an
11 | official public database. With this database,
12 | we posed the question: when do Portuguese ministries contract most?
13 | {% endblocktrans %}
14 |
15 |
16 | {% blocktrans trimmed %}
17 | We picked all contracts of all ministries and grouped them by month.
18 | Here is the result:
19 | {% endblocktrans %}
20 |
21 |
22 | {% url 'contracts_data_selector' 'ministries-contracts-time-series-json' as the_url %}
23 | {% include "contracts/analysis/contracts_time_series/graph.html" with url=the_url %}
24 |
25 |
26 | {% blocktrans trimmed %}
27 | By ministry we mean entities whose name starts with "Secretaria-Geral do Ministério ...".
28 | {% endblocktrans %}
29 |
30 | {% endblock %}
31 |
--------------------------------------------------------------------------------
/docs/website/organization.rst:
--------------------------------------------------------------------------------
1 | Organization of the code
2 | ========================
3 |
4 | .. _django-rq: http://python-rq.org/
5 |
6 | Django apps
7 | -----------
8 |
9 | The code is organized as a standard Django website composed of 4 apps:
10 |
11 | * main: delivers ``robots.txt``, the main page, the about page, etc;
12 | * contracts
13 | * deputies
14 | * law
15 |
16 | In the rest of this section, ``apps`` refers to all apps except ``main``. ``main``
17 | is also a Django app, but does not share the common logic of the other apps.
18 |
19 | All apps are Django-standard: they have ``models.py``, ``views.py``, ``urls.py``,
20 | ``templates``, ``static``, ``tests``.
21 |
22 | Each app has a module called ``/crawler.py`` that contains the crawler it
23 | uses to download the data from official sources. Each app has a
24 | ``/tasks.py`` with django-rq_ jobs for running the app's crawler.
25 |
26 | Besides a crawler, each app has a package ``/analysis``. This package contains
27 | a list of existing analyses. An analysis is just an expensive operation that is
28 | performed once a day (after data synchronization) and is cached for 24 hours.
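
The idea of an expensive operation whose result is kept for 24 hours can be sketched as follows. This is a minimal, self-contained illustration and not the project's ``Analysis`` class: the name ``CachedAnalysis``, its ``ttl`` and ``clock`` parameters, and the in-memory storage are all assumptions made for the example.

```python
import time


class CachedAnalysis:
    """Hypothetical stand-in for an analysis: an expensive function whose
    result is kept for a fixed time (24 hours by default)."""

    def __init__(self, name, function, ttl=24 * 60 * 60, clock=time.time):
        self.name = name
        self._function = function
        self._ttl = ttl          # cache lifetime in seconds
        self._clock = clock      # injectable so the sketch is easy to test
        self._value = None
        self._computed_at = None

    def update(self):
        """Recompute the result and reset the cache timestamp."""
        self._value = self._function()
        self._computed_at = self._clock()
        return self._value

    def get(self):
        """Return the cached result, recomputing it only if missing or expired."""
        expired = (self._computed_at is None or
                   self._clock() - self._computed_at >= self._ttl)
        if expired:
            return self.update()
        return self._value
```

In this sketch a daily job would call ``update()`` after data synchronization, while page views would call ``get()`` and almost always hit the cache.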
29 |
30 | Since ``contracts`` is a large app, its backend is sub-divided:
31 |
32 | * views and urls modules are divided according to whom they refer to
33 | * templates are divided into folders, according to the view they refer to.
34 |
35 | Scheduling
36 | ----------
37 |
38 | We have a periodic job that runs in django-rq_ to synchronize our database with
39 | the official sources and update caches. It uses settings from
40 | ``main/settings_for_schedule.py``.
41 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Copyright (c) Public contracts and individual contributors.
2 | All rights reserved.
3 |
4 | Redistribution and use in source and binary forms, with or without modification,
5 | are permitted provided that the following conditions are met:
6 |
7 | 1. Redistributions of source code must retain the above copyright notice,
8 | this list of conditions and the following disclaimer.
9 |
10 | 2. Redistributions in binary form must reproduce the above copyright
11 | notice, this list of conditions and the following disclaimer in the
12 | documentation and/or other materials provided with the distribution.
13 |
14 | 3. Neither the name of Public contracts nor the names of its contributors may be used
15 | to endorse or promote products derived from this software without
16 | specific prior written permission.
17 |
18 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
19 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
20 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
21 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
22 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
23 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
24 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
25 | ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
26 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
27 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
28 |
--------------------------------------------------------------------------------
/contracts/entity_urls.py:
--------------------------------------------------------------------------------
1 | from django.conf.urls import patterns, url
2 | from django.utils.translation import ugettext_lazy as _
3 |
4 | from . import entity_views
5 | from . import feed
6 |
7 | urlpatterns = patterns('',
8 | url(r'id(\d+)$', entity_views.main_view, name='entity_canonical'),
9 | url(r'id(\d+)/%s$' % _('contracts'), entity_views.contracts, name='entity_contracts'),
10 | url(r'id(\d+)/%s/rss$' % _('contracts'), feed.EntityContractsFeed()),
11 | url(r'id(\d+)/%s/made-time-series$' % _('contracts'),
12 | entity_views.contracts_made_time_series,
13 | name='entity_contracts_made_time_series'),
14 | url(r'id(\d+)/%s/received-time-series$' % _('contracts'),
15 | entity_views.contracts_received_time_series,
16 | name='entity_contracts_received_time_series'),
17 |
18 | url(r'id(\d+)/%s$' % _('customers'), entity_views.costumers, name='entity_costumers'),
19 |
20 | url(r'id(\d+)/%s$' % _('tenders'), entity_views.tenders, name='entity_tenders'),
21 | url(r'id(\d+)/%s/rss$' % _('tenders'), feed.EntityTendersFeed()),
22 |
23 | url(r'id(\d+)/([-\w]+)', entity_views.main_view, name='entity'),
24 | url(r'id(\d+)', entity_views.main_view, name='entity_id'),
25 |
26 | # redirects any entity view to main view.
27 | url(r'(\d+)(/.*)?', entity_views.redirect_id),
28 | )
29 |
--------------------------------------------------------------------------------
/contracts/test/test_commands.py:
--------------------------------------------------------------------------------
1 | from datetime import datetime
2 |
3 | from django.core.management import call_command
4 | from django.test import TestCase
5 |
6 | from contracts.models import Entity, Contract, Category
7 |
8 |
9 | class CommandsTestCase(TestCase):
10 |
11 | def test_cache(self):
12 |
13 | cat = Category.add_root(code='45233141-9')
14 |
15 | e1 = Entity.objects.create(nif='506780902', base_id=5826, name='bla')
16 | e2 = Entity.objects.create(nif='506572218', base_id=101, name='bla1')
17 | e3 = Entity.objects.create(nif='dasda', base_id=21, name='bla2')
18 | e4 = Entity.objects.create(nif='dasda', base_id=22, name='bla2')
19 |
20 | c1 = Contract.objects.create(base_id=1, contract_description='da',
21 | price=200, added_date=datetime(year=2003,
22 | month=1,
23 | day=1),
24 | category=cat)
25 | c2 = Contract.objects.create(base_id=2, contract_description='da',
26 | price=100, added_date=datetime(year=2003,
27 | month=1,
28 | day=1),
29 | category=cat)
30 |
31 | c1.contractors.add(e1)
32 | c1.contracted.add(e3)
33 | c2.contractors.add(e2)
34 | c2.contracted.add(e4)
35 |
36 | call_command('cache_contracts', all=True)
37 |
38 | self.assertTrue(e1.data.is_updated)
39 |
--------------------------------------------------------------------------------
/law/urls.py:
--------------------------------------------------------------------------------
1 | from django.conf.urls import patterns, url
2 | from django.utils.translation import ugettext_lazy as _
3 |
4 | from . import views, views_data, feed
5 |
6 | urlpatterns = patterns(
7 | '',
8 | url(r'^%s/%s$' % (_('law'), _('home')), views.home, name='law_home'),
9 | url(r'^%s/%s$' % (_('law'), _('analysis')), views.analysis_list, name='law_analysis'),
10 |
11 | url(r'^%s/%s/(\d+)/(\w+)' % (_('law'), _('analysis')),
12 | views.law_analysis, name='law_analysis_selector'),
13 | url(r'^%s/%s/(\d+)' % (_('law'), _('analysis')),
14 | views.law_analysis, name='law_analysis_internal_selector',
15 | ),
16 |
17 | url(r'^%s/%s/([-\w]+)/data' % (_('law'), _('analysis')),
18 | views_data.analysis_selector, name='law_data_selector',
19 | ),
20 |
21 | url(r'^%s$' % _('law'), views.law_list, name='law_law_list'),
22 | url(r'^%s/rss$' % _('law'), feed.LawsFeed(), name='laws_list_feed'),
23 |
24 | url(r'^%s$' % _('types'), views.types_list, name='law_types_list'),
25 |
26 | url(r'^%s/(\d+)$' % _('type'), views.type_view),
27 | url(r'^%s/(\d+)/([-\w]+)$' % _('type'), views.type_view, name='law_type'),
28 |
29 | url(r'^%s/(\d+)/rss$' % _('type'), feed.TypeDocumentsFeed()),
30 | url(r'^%s/(\d+)/([-\w]+)/rss$' % _('type'), feed.TypeDocumentsFeed(),
31 | name='type_documents_feed'),
32 |
33 | url(r'^%s/id(\d+)/rss$' % _('document'), feed.DocumentFeed(), name='law_document_rss'),
34 | url(r'^%s/id(\d+)$' % _('document'), views.law_view, name='law_view'),
35 | url(r'^%s/id(\d+)/(.*)' % _('document'), views.law_view, name='law_view'),
36 |
37 | url(r'^%s/(\d+)(/.*)?' % _('document'), views.redirect_id),
38 | )
39 |
--------------------------------------------------------------------------------
/law/crawler_forms.py:
--------------------------------------------------------------------------------
1 | from django.forms import CharField, DateField, IntegerField, Form
2 |
3 | from . import models
4 |
5 |
6 | class TypeField(CharField):
7 |     """
8 |     Validates a relation to a ``models.Type``.
9 |     """
10 |     mapping = {'Declaração de Rectificação': 'Declaração de Retificação',
11 |                'Declaração de rectificação': 'Declaração de Retificação',
12 |                'Decreto do Presidente de República':
13 |                    'Decreto do Presidente da República',
14 |                'Resolução da Assembleia da República':
15 |                    'Resolução da Assembleia da República'}
16 |
17 |     def clean(self, string):
18 |         type_name = string.strip()
19 |
20 |         # check for synonyms and typos
21 |         if type_name in self.mapping:
22 |             type_name = self.mapping[type_name]
23 |
24 |         return type_name
25 |
26 |
27 | class DocumentForm(Form):
28 |     number = CharField(required=False, max_length=20)
29 |
30 |     creator_name = CharField()
31 |     date = DateField()
32 |     summary = CharField(required=False)
33 |     text = CharField(required=False)
34 |     dre_doc_id = IntegerField()
35 |     dre_pdf_id = IntegerField()
36 |     dr_series = CharField(max_length=10)
37 |     dr_number = CharField(max_length=10)
38 |     dr_supplement = CharField(required=False, max_length=50)
39 |     dr_pages = CharField(max_length=50)
40 |
41 |     type = TypeField()
42 |
43 |     def clean_type(self):
44 |         if 'dr_series' not in self.cleaned_data or 'type' not in self.cleaned_data:
45 |             return
46 |         series = self.cleaned_data['dr_series']
47 |         type, created = models.Type.objects.get_or_create(
48 |             name=self.cleaned_data['type'], dr_series=series)
49 |         return type
50 |
--------------------------------------------------------------------------------
/contracts/templatetags/contracts/humanize.py:
--------------------------------------------------------------------------------
1 | from django import template
2 | from django.template import defaultfilters
3 | from django.utils.translation import ungettext
4 |
5 | register = template.Library()
6 |
7 | # Pairs of (power of ten, converter that names that order of magnitude)
8 | intword_converters = (
9 |     (0, lambda x: x),
10 |     (3, lambda x: ungettext('%(value)s thousand', '%(value)s thousands', x)),
11 |     (6, lambda x: ungettext('%(value)s million', '%(value)s millions', x)),
12 |     (9, lambda x: ungettext('%(value)s billion', '%(value)s billions', x)),
13 |     (12, lambda x: ungettext('%(value)s trillion', '%(value)s trillions', x)),
14 | )
15 |
16 |
17 | @register.filter(is_safe=False)
18 | def intword(value):
19 |     """
20 |     Converts a large integer price, given in cents, to a friendly text
21 |     representation in euros (thousands, millions, billions or trillions).
22 |     The value is divided by 100 before the unit is chosen.
23 |     """
24 |     try:
25 |         value = float(value)
26 |     except TypeError:
27 |         # value is None
28 |         value = 0
29 |     except ValueError:
30 |         # value cannot be converted to a number
31 |         return value
32 |
33 |     value /= 100.  # prices are in cents, we translate them to euros.
34 |
35 |     for exponent, converter in intword_converters:
36 |         large_number = 10 ** exponent
37 |         if value < large_number * 1000:
38 |             new_value = value / large_number
39 |             new_value = defaultfilters.floatformat(new_value, 1)
40 |             return converter(new_value) % {'value': new_value}
41 |
42 |     # use the highest available
43 |     exponent, converter = intword_converters[-1]
44 |     large_number = 10 ** exponent
45 |     new_value = value / float(large_number)
46 |     new_value = defaultfilters.floatformat(new_value, 1)
47 |     return converter(new_value) % {'value': new_value}
48 |
--------------------------------------------------------------------------------
/deputies/analysis/analysis.py:
--------------------------------------------------------------------------------
1 | from datetime import timedelta, date
2 | import collections
3 |
4 | from django.db.models import Count
5 |
6 | from deputies import models
7 |
8 |
9 | def get_time_in_office_distribution():
10 |
11 |     data = []
12 |
13 |     # build the set of values
14 |     for deputy in models.Deputy.objects.all():
15 |         total_time = timedelta(0)
16 |         for mandate in deputy.mandate_set.all().values('date_end', 'date_start'):
17 |             date_end = mandate['date_end']
18 |             date_start = mandate['date_start']
19 |
20 |             if date_end is None:
21 |                 date_end = date.today()
22 |
23 |             total_time += date_end - date_start
24 |
25 |         if total_time:
26 |             data.append({'id': deputy.id, 'total': total_time.days})
27 |
28 |     # create the histogram
29 |     bin_size = 50  # bin of 50 days; 50 just because it looks good enough
30 |
31 |     cumulative = []
32 |     bin = 0
33 |     while True:
34 |         cumulative.append({'min': (1 + bin)*bin_size, 'count': 0})
35 |
36 |         for datum in data:
37 |             if datum['total'] >= cumulative[bin]['min']:
38 |                 cumulative[bin]['count'] += 1
39 |
40 |         if cumulative[bin]['count'] == 0:
41 |             cumulative.remove(cumulative[bin])
42 |             break
43 |
44 |         bin += 1
45 |
46 |     return cumulative
47 |
48 |
49 | def get_mandates_in_office_distribution():
50 |
51 |     deputies = models.Deputy.objects.all().annotate(count=Count('mandate__id'))
52 |
53 |     # build histogram
54 |     count = 0
55 |     histogram = collections.defaultdict(int)
56 |     for deputy in deputies:
57 |         histogram[deputy.count] += 1
58 |         count += 1
59 |
60 |     data = []
61 |     for x in histogram:
62 |         data.append({'mandates': x, 'count': histogram[x]})
63 |
64 |     return data
65 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/analysis/contracted_lorenz_curve/graph.html:
--------------------------------------------------------------------------------
1 | {% load i18n %}
2 |
8 |
9 |
10 |
50 |
--------------------------------------------------------------------------------
/contracts/categories_crawler.py:
--------------------------------------------------------------------------------
1 | import xml.etree.ElementTree
2 |
3 | from contracts.models import Category
4 |
5 |
6 | def get_xml():
7 |     """
8 |     Gets the xml file from the web.
9 |     """
10 |     from urllib.request import urlopen
11 |
12 |     request = urlopen('https://raw.githubusercontent.com/data-ac-uk/cpv/master/'
13 |                       'etc/cpv_2008.xml')
14 |
15 |     tree = xml.etree.ElementTree.parse(request)
16 |
17 |     return tree.getroot()
18 |
19 |
20 | def _get_data(child):
21 |     return {
22 |         'code': child.attrib['CODE'],
23 |         'description_en': child.find('*[@LANG=\'EN\']').text,
24 |         'description_pt': child.find('*[@LANG=\'PT\']').text,
25 |     }
26 |
27 |
28 | def _get_depth(code):
29 |     pure_code = code[:-2]
30 |     depth = 1
31 |     while depth != 7:
32 |         if pure_code[depth + 1] == '0':
33 |             break
34 |         depth += 1
35 |     return depth
36 |
37 |
38 | def _get_parent(code):
39 |     depth = _get_depth(code)
40 |     s = list(code[:-2])
41 |     while depth != 1:
42 |         s[depth] = '0'  # pick the parent code
43 |         try:
44 |             return Category.objects.get(code__startswith="".join(s))
45 |         except Category.DoesNotExist:
46 |             depth -= 1
47 |
48 |     return None  # parent not found
49 |
50 |
51 | def add_category(data):
52 |
53 |     parent = _get_parent(data['code'])
54 |
55 |     try:
56 |         category = Category.objects.get(code=data['code'])
57 |     except Category.DoesNotExist:
58 |         if parent is None:
59 |             category = Category.add_root(**data)
60 |         else:
61 |             category = parent.add_child(**data)
62 |
63 |     return category
64 |
65 |
66 | def build_categories():
67 |     """
68 |     Builds the Categories from the xml file. Takes ~2m.
69 |     """
70 |     for child in get_xml():
71 |         data = _get_data(child)
72 |         add_category(data)
73 |
--------------------------------------------------------------------------------
/docs/contracts_api/category.rst:
--------------------------------------------------------------------------------
1 | Category
2 | ========
3 |
4 | .. currentmodule:: contracts
5 |
6 | .. _CPVS: http://simap.europa.eu/codes-and-nomenclatures/codes-cpv/codes-cpv_en.htm
7 | .. _Base: http://www.base.gov.pt/base2
8 |
9 | This document provides the API references for the categories of contracts in
10 | the database. See :doc:`../tools/cpvs_importer` for how these categories are
11 | built.
12 |
13 | API References
14 | --------------
15 |
16 | .. class:: models.Category
17 |
18 | A category is a formal way to categorize public contracts within European
19 | Union. It is a tag assigned to a contract.
20 |
21 | A category has a one-to-many relationship to :class:`~models.Contract`: each
22 | contract has one category, each category can have more than one contract.
23 | This relationship is thus defined in the contract model.
24 |
25 | A category has the following attributes:
26 |
27 | .. attribute:: code
28 |
29 | The CPVS_ code of the category.
30 |
31 | .. attribute:: description_en
32 | .. attribute:: description_pt
33 |
34 | The official descriptions of the category in English and Portuguese,
35 | respectively.
36 |
37 | .. attribute:: depth
38 |
39 | The depth of the category in the tree.
40 |
41 | And has the following methods:
42 |
43 | .. method:: get_children()
44 |
45 | Returns all children categories, excluding itself.
46 |
47 | .. method:: get_ancestors()
48 |
49 | Returns all ancestor categories, excluding itself.
50 |
51 | .. method:: get_absolute_url()
52 |
53 | Returns the url of this category in the website.
54 |
55 | .. method:: contracts_count()
56 |
57 | Counts the number of all contracts that belong to this category or any
58 | of its children.
59 |
60 | .. method:: contracts_price()
61 |
62 | Sums the prices of all contracts that belong to this category or any of
63 | its children.
64 |
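How a CPV code maps to a position in the tree can be sketched without the
database. The snippet below mirrors the logic of ``_get_depth`` and
``_get_parent`` in ``contracts/categories_crawler.py``; the function names
``cpv_depth`` and ``parent_candidates`` are hypothetical and used only for this
illustration, and the depth computed here is the one the importer uses to look
for a parent, which is not necessarily the value stored in ``depth``.

```python
def cpv_depth(code):
    """Depth implied by a CPV code such as '45233141-9'."""
    digits = code[:-2]  # drop the trailing '-N' check digit
    depth = 1
    while depth != 7:
        # a zero at the next position means the code stops at this level
        if digits[depth + 1] == '0':
            break
        depth += 1
    return depth


def parent_candidates(code):
    """Code prefixes to try as parent, from the closest ancestor upwards."""
    depth = cpv_depth(code)
    digits = list(code[:-2])
    candidates = []
    while depth != 1:
        digits[depth] = '0'  # zero out one more level
        candidates.append(''.join(digits))
        depth -= 1
    return candidates
```

The importer tries each candidate in order and uses the first one that exists
in the database as the parent; a code with no candidates becomes a root.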
--------------------------------------------------------------------------------
/contracts/urls.py:
--------------------------------------------------------------------------------
1 | from django.conf.urls import patterns, url, include
2 | from django.utils.translation import ugettext_lazy as _
3 |
4 | from . import feed
5 | from . import views
6 | from . import views_analysis
7 | from . import views_data
8 |
9 | from . import entity_urls
10 | from . import contract_urls
11 | from . import category_urls
12 |
13 |
14 | urlpatterns = patterns(
15 | '',
16 | url(r'^%s/%s$' % (_('contracts'), _('home')),
17 | views.home, name='contracts_home'),
18 |
19 | url(r'^%s/%s$' % (_('contracts'), _('analysis')),
20 | views_analysis.AnalysisListView.as_view(), name='contracts_analysis'),
21 |
22 | url(r'^%s/%s/(\d+)/(\w+)' % (_('contracts'), _('analysis')),
23 | views_analysis.analysis_selector,
24 | name='contracts_analysis_selector'),
25 | url(r'^%s/%s/(\d+)' % (_('contracts'), _('analysis')),
26 | views_analysis.analysis_selector,
27 | name='contracts_internal_analysis_selector',
28 | ),
29 |
30 | url(r'^%s/%s/([-\w]+)/data' % (_('contracts'), _('analysis')),
31 | views_data.analysis_selector,
32 | name='contracts_data_selector',
33 | ),
34 |
35 | url(r'%s/' % _('contract'), include(contract_urls)),
36 | url(r'%s/' % _('entity'), include(entity_urls)),
37 | url(r'%s/' % _('category'), include(category_urls)),
38 | )
39 |
40 | # lists
41 | urlpatterns += patterns(
42 | '',
43 | url(r'^%s$' % _('contracts'), views.contracts_list, name='contracts_list'),
44 | url(r'^%s/rss$' % _('contracts'), feed.ContractsFeed(),
45 | name='contracts_list_feed'),
46 |
47 | url(r'^%s$' % _('categories'), views.categories_list,
48 | name='categories_list'),
49 |
50 | url(r'^%s$' % _('tenders'), views.tenders_list, name='tenders_list'),
51 | url(r'^%s/rss$' % _('tenders'), feed.TendersFeed(),
52 | name='tenders_list_feed'),
53 |
54 | url(r'^%s$' % _('entities'), views.entities_list, name='entities_list'),
55 | )
56 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/analysis/procedure_type_time_series/main.html:
--------------------------------------------------------------------------------
1 | {% extends "contracts/base.html" %}
2 | {% load i18n %}
3 | {% load static %}
4 | {% block title %}Evolução temporal dos procedimentos usados na contratação pública{% endblock %}
5 | {% block content %}
6 |
7 |
Evolução temporal dos procedimentos usados na contratação pública
8 |
9 | A contratação pública em Portugal é feita maioritariamente por
10 |
11 |
12 |
13 | ajuste direto,
14 |
15 |
16 | concurso público,
17 |
18 |
19 | concurso limitado por prévia qualificação ou
20 |
21 |
22 | acordo quadro.
23 |
24 |
25 |
26 | Do ponto de vista legislativo é relevante saber quanto cada um é
27 | usado pois permite uma análise quantitativa de quais os tipos
28 | relevantes na lei.
29 |
30 |
31 | Considerámos todos os contratos da base de dados pública
32 | fechados em cada mês desde 2010 e calculámos quantos são realizados
33 | pelos diferentes procedimentos, e a que valor esses tipos correspondem.
34 |
35 | {% url 'contracts_data_selector' 'procedure-types-time-series-json' as the_url %}
36 | {% include "contracts/analysis/procedure_type_time_series/graph.html" with url=the_url%}
37 |
38 | {% url 'contracts_data_selector' 'procedure-types-time-series-json' as the_url %}
39 | {% include "contracts/analysis/procedure_type_time_series/graph_count.html" with url=the_url%}
40 |
41 | {% endblock %}
42 |
--------------------------------------------------------------------------------
/docs/website/website.rst:
--------------------------------------------------------------------------------
1 | Install development environment
2 | ===============================
3 |
4 | This part of the documentation explains how you can jump to the development of
5 | publicos.pt and deploy the website on your computer. To only interact with the
6 | database, e.g. to do statistics, you only need to
7 | :doc:`install the API dependencies <../installation>`.
8 |
9 | We assume here that you know Python and a minimum of Django.
10 |
11 | Dependencies for the website
12 | ----------------------------
13 |
14 | Besides the :doc:`dependencies of the API <../installation>`, the website uses
15 | the following packages:
16 |
17 | BeautifulSoup 4
18 | ^^^^^^^^^^^^^^^
19 |
20 | For crawling websites, we use a Python package to handle HTML elements. To
21 | install it, use::
22 |
23 | pip install beautifulsoup4
24 |
25 | django-debug-toolbar
26 | ^^^^^^^^^^^^^^^^^^^^
27 |
28 | To develop, we use ``django-debug-toolbar``, a utility to debug Django websites::
29 |
30 | pip install django-debug-toolbar
31 |
32 | Running the website
33 | -------------------
34 |
35 | Once you have the dependencies installed, you can run the website from the root
36 | directory using::
37 |
38 | python manage.py runserver
39 |
40 | and open ``http://127.0.0.1:8000`` in your browser.
41 |
42 | .. _`mailing list`: https://groups.google.com/forum/#!forum/public-contracts
43 |
44 | If anything went wrong or you have any question,
45 | please drop by our `mailing list`_ so we can help you.
46 |
47 |
48 | Running tests
49 | -------------
50 |
51 | We use standard Django unit test cases. To run tests, use::
52 |
53 | python manage.py test
54 |
55 | For instance, to run the test suite of the ``contracts`` app, use::
56 |
57 | python manage.py test contracts.tests
58 |
59 |
60 | Running the crawler
61 | -------------------
62 |
63 | To run the :doc:`../tools/crawler` to populate the database,
64 | you require an additional package::
65 |
66 | pip install requests
67 |
68 |
--------------------------------------------------------------------------------
/contracts/templates/contracts/analysis/municipalities_procedure_type_time_series/main.html:
--------------------------------------------------------------------------------
1 | {% extends "contracts/base.html" %}
2 | {% load i18n %}
3 | {% load static %}
4 | {% block title %}Que procedimentos os municípios usam na contratação pública?{% endblock %}
5 | {% block content %}
6 |
7 |
Que procedimentos os municípios usam na sua contratação pública?
8 |
9 | A contratação pública em Portugal é feita maioritariamente por
10 |
11 |
12 |
13 | ajuste direto,
14 |
15 |
16 | concurso público,
17 |
18 |
19 | concurso limitado por prévia qualificação ou
20 |
21 |
22 | acordo quadro.
23 |
24 |
25 |
26 | Do ponto de vista legislativo é relevante saber quanto cada um é
27 | usado pois permite uma análise quantitativa de quais os tipos
28 | relevantes na lei.
29 |
30 |
31 | Considerámos todos os contratos da base de dados pública
32 | fechados em cada mês por municípios desde 2010 e calculámos quantos são realizados
33 | pelos diferentes procedimentos, e a que valor esses tipos correspondem.
34 |
35 | {% url 'contracts_data_selector' 'municipalities-procedure-types-time-series-json' as the_url %}
36 | {% include "contracts/analysis/procedure_type_time_series/graph.html" with url=the_url%}
37 |
38 | {% url 'contracts_data_selector' 'municipalities-procedure-types-time-series-json' as the_url %}
39 | {% include "contracts/analysis/procedure_type_time_series/graph_count.html" with url=the_url%}
40 |
9 | {% blocktrans trimmed with url="http://www.base.gov.pt/base2/" %}
10 | By European law, portuguese public entities must publish their contracts in an
11 | official public database. With this database,
12 | we posed the question: when Portugal hires most?
13 | {% endblocktrans %}
14 |
15 |
16 | {% blocktrans trimmed %}
17 | We picked all contracts in that database and count how many of then were signed on each month
18 | and what was their value. Here is the result:
19 | {% endblocktrans %}
20 |
21 |
22 |
{% trans "All contracts" %}:
23 | {% url 'contracts_data_selector' 'contracts-time-series-json' as the_url %}
24 | {% include "contracts/analysis/contracts_time_series/graph.html" with url=the_url %}
25 |
{% trans "Municipalities only" %}:
26 | {% url 'contracts_data_selector' 'municipalities-contracts-time-series-json' as the_url %}
27 | {% include "contracts/analysis/contracts_time_series/graph.html" with url=the_url %}
28 |
10 |
62 |
--------------------------------------------------------------------------------
/docs/contracts_api/contract.rst:
--------------------------------------------------------------------------------
1 | Contract
2 | ========
3 |
4 | .. _Base: http://www.base.gov.pt/base2
5 |
6 | This document provides the API references for the contracts in the database.
7 |
8 | .. currentmodule:: contracts
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Contract
14 |
15 | A contract is a relationship between :doc:`entities <entity>` enrolled in the database.
16 | Each contract has a set of contractors (who paid), and contracted (who were paid),
17 | and relevant information about the contract.
18 |
19 | All the fields of this model are retrieved from Base_. Except for :attr:`base_id`
20 | and :attr:`added_date`, all entries can be null if they don't exist in
21 | Base_. They are:
22 |
23 | .. attribute:: description
24 |
25 | The description of the contract.
26 |
27 | .. attribute:: price
28 |
29 | The price of the contract, in cents (to be an integer).
30 |
31 | .. attribute:: category
32 |
33 | A Foreign key to the contract :doc:`category`.
34 |
35 | .. attribute:: contractors
36 |
37 | A ManyToMany relationship with :doc:`entities <entity>`. Related name "contracts_made".
38 |
39 | .. attribute:: contracted
40 |
41 | A ManyToMany relationship with :doc:`entities <entity>`.
42 |
43 | .. attribute:: added_date
44 |
45 | The date it was added to Base_ database.
46 |
47 | .. attribute:: signing_date
48 |
49 | The date it was signed.
50 |
51 | .. attribute:: close_date
52 |
53 | The date it was closed. It is normally null.
54 |
55 | .. attribute:: base_id
56 |
57 | The primary key of the contract on the Base_ database.
58 | It is "unique" and can be used to create the link to Base_ (see :meth:`get_base_url`).
59 |
60 | .. attribute:: contract_type
61 |
62 | A Foreign key to one of the types of contracts.
63 |
64 | .. attribute:: procedure_type
65 |
66 | A Foreign key to one of the types of procedures.
67 |
68 | .. attribute:: contract_description
69 |
70 | A text about the object of the contract (i.e. what was bought or sold).
71 |
72 | .. attribute:: country
73 | .. attribute:: district
74 | .. attribute:: council
75 |
76 | Country, district, council.
77 |
78 | This model has a getter:
79 |
80 | .. method:: get_base_url()
81 |
82 | Returns the url of this contract in Base_.
83 |
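Since ``price`` is stored as an integer number of cents, it has to be divided
by 100 before being shown in euros. The sketch below illustrates the idea
behind the ``intword`` template filter in
``contracts/templatetags/contracts/humanize.py``, but without Django or
translations; ``UNITS`` and ``humanize_price`` are hypothetical names used
only here, and the exact wording produced by the real filter may differ.

```python
# (power of ten, unit name); the empty name is used below one thousand
UNITS = ((0, ''), (3, 'thousand'), (6, 'million'), (9, 'billion'), (12, 'trillion'))


def humanize_price(cents):
    """Formats a price given in cents as a short text in euros."""
    euros = cents / 100
    for exponent, unit in UNITS:
        scale = 10 ** exponent
        if euros < scale * 1000:
            break
    # if no unit matched, the loop ends on the largest one available
    return ('%.1f %s' % (euros / scale, unit)).strip()
```

For example, a contract with ``price=120000000`` (cents) is worth 1.2 million
euros, and ``humanize_price(120000000)`` returns ``'1.2 million'``.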
--------------------------------------------------------------------------------
/docs/contracts_api/entity.rst:
--------------------------------------------------------------------------------
1 | Entity
2 | ======
3 |
4 | .. _Base: http://www.base.gov.pt/base2
5 |
6 | This document provides the API references for the entities in the database.
7 |
8 | .. currentmodule:: contracts
9 |
10 | API
11 | ----
12 |
13 | .. class:: models.Entity
14 |
15 | An entity is any company or institution enrolled in the database. Entities are related
16 | through :class:`contracts <models.Contract>`.
17 |
18 | All the fields of this model are directly retrieved from Base_.
19 | They are:
20 |
21 | .. attribute:: name
22 |
23 | The name of the entity.
24 |
25 | .. attribute:: base_id
26 |
27 | The primary key of the entity on the Base_ database. It is "unique".
28 |
29 | .. attribute:: nif
30 |
31 | The fiscal identification number of the entity.
32 |
33 | .. attribute:: country
34 |
35 | The country where it is registered. It may be "null" when there is no such information.
36 |
37 | It has the following getters:
38 |
39 | .. method:: total_earned()
40 |
41 | Returns the total earned, in € cents, a value stored in :class:`~models.EntityData`.
42 |
43 | .. method:: total_expended()
44 |
45 | Returns the total expended, in € cents, a value stored in :class:`~models.EntityData`.
46 |
47 | .. method:: get_base_url()
48 |
49 | Returns the url of this entity in Base_.
50 |
51 | .. method:: get_absolute_url()
52 |
53 | Returns the url of this entity on this website.
54 |
55 | And the following setters:
56 |
57 | .. method:: compute_data()
58 |
59 | Computes two aggregations and stores them in its EntityData:
60 |
61 | #) the total value in € of the contracts where it is a contractor
62 | #) the total value in € of the contracts where it is contracted
63 |
64 | This method is used when the crawler updates new contracts.
65 |
66 | .. class:: models.EntityData
67 |
68 | Data of an entity that is not retrieved from Base, i.e, it is computed with existing data.
69 | It is a kind of cache, but more persistent. This may become a proper cache in future.
70 |
71 | It has a OneToOne relationship with :class:`~models.Entity`.
72 |
73 | It has the following attributes:
74 |
75 | .. attribute:: total_earned
76 |
77 | The money, in cents, the entity earned so far.
78 |
79 | .. attribute:: total_expended
80 |
81 | The money, in cents, the entity expended so far.
82 |
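The two aggregations performed by ``compute_data()`` can be illustrated in
plain Python. This is a sketch under stated assumptions, not the model's
actual implementation: contracts are represented as dictionaries, and the
function name ``compute_totals`` is hypothetical.

```python
def compute_totals(entity_id, contracts):
    """Sums contract prices (in cents) paid and received by one entity."""
    total_expended = 0
    total_earned = 0
    for contract in contracts:
        price = contract['price'] or 0  # a missing price counts as zero here
        if entity_id in contract['contractors']:
            # contractors are the entities who paid
            total_expended += price
        if entity_id in contract['contracted']:
            # contracted are the entities who were paid
            total_earned += price
    return {'total_earned': total_earned, 'total_expended': total_expended}
```

In the website, results of this kind are stored in ``EntityData`` so that
pages do not have to recompute them on every request.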
--------------------------------------------------------------------------------
/main/templates/base.html:
--------------------------------------------------------------------------------
1 | {% load staticfiles %}
2 |
3 |
4 |
5 | {% block title %}{{ SITE_NAME }}{% endblock %}
6 |
7 |
8 |
9 | {% include "google_analytics.html" %}
10 |
11 |
12 |
13 | {# bootstrap #}
14 |
15 |
16 |
17 |
18 | {% if REQUIRE_DATEPICKER %}
19 | {# moment.js to daterangepicker.js #}
20 |
21 |
22 |
23 | {# daterangepicker.js #}
24 |
25 |
26 | {% endif %}
27 |
28 | {% if REQUIRE_D3JS %}
29 | {# d3.js #}
30 |
31 |
32 | {# nv.d3.js #}
33 |
34 |
35 | {% endif %}
36 |
37 | {% block extra_html_head %}{% endblock %}
38 |
39 | {% include "base/search_js.html" %}
40 | {% include "change_parameters.html" %}
41 |
42 |
43 |
44 | {% include "base/top_header.html" %}
45 |
49 | {% include "base/footer.html" %}
50 |
51 |
52 |
--------------------------------------------------------------------------------
/law/feed.py:
--------------------------------------------------------------------------------
from __future__ import unicode_literals

from django.contrib.syndication.views import Feed
from django.core.urlresolvers import reverse
from django.shortcuts import get_object_or_404
from django.utils import formats
from django.utils.translation import ugettext as _

from .models import Document, Type


class GeneralDocumentFeed(Feed):
    """
    Base feed: subclasses provide the queryset via `_items`.
    """
    def item_title(self, item):
        return _('%(name)s - %(date)s - %(creator)s') % \
            {'name': item.name(),
             'date': formats.date_format(item.date),
             'creator': item.creator_name}

    def item_description(self, item):
        return item.summary

    def items(self, obj):
        # the feed shows at most the 200 most recent documents
        return self._items(obj).order_by('-date', 'type__name', '-number') \
            .prefetch_related('type')[:200]

    def _items(self, obj):
        raise NotImplementedError


class LawsFeed(GeneralDocumentFeed):
    def title(self, __):
        return _("Série I of Diário da República")

    def link(self, __):
        return reverse('law_law_list')

    def description(self, __):
        return _("All documents published in Série I of Diário da República")

    def _items(self, __):
        return Document.objects.all()


class TypeDocumentsFeed(GeneralDocumentFeed):
    def get_object(self, request, type_id, slug=None):
        return get_object_or_404(Type, id=type_id)

    def title(self, obj):
        return obj.name

    def link(self, obj):
        return obj.get_absolute_url()

    def description(self, obj):
        # the space separates the translated word from the quoted type name
        return _("Documents") + " \"%s\"" % obj.name

    def _items(self, obj):
        return obj.document_set.all()


class DocumentFeed(GeneralDocumentFeed):

    def get_object(self, request, dre_doc_id, slug=None):
        return get_object_or_404(Document.objects.select_related('type'),
                                 dre_doc_id=dre_doc_id)

    def title(self, obj):
        return _("Laws referring %s") % obj.name()

    def link(self, obj):
        return obj.get_absolute_url()

    def description(self, obj):
        return _("All documents published in Série I of Diário da República "
                 "that refer %s") % obj.name()

    def _items(self, obj):
        # reverse side of `Document.references`: the documents that refer to `obj`
        return obj.document_set.all()

--------------------------------------------------------------------------------
/docs/installation.rst:
--------------------------------------------------------------------------------
Installation
============

This document explains how to install the API for *accessing* the database.
To deploy the website, see :doc:`website/website`.

.. _pip: https://pypi.python.org/pypi/pip

We assume you know a little Python and know how to install Python packages
on your computer using pip_.

Getting the source
------------------

.. _GitHub: https://github.com/jorgecarleitao/public-contracts
.. _downloaded: https://github.com/jorgecarleitao/public-contracts/archive/master.zip
.. _mailing-list: https://groups.google.com/forum/#!forum/public-contracts

The source can be either downloaded_ or cloned from the GitHub_ repository using::

    git clone https://github.com/jorgecarleitao/public-contracts.git

Like most Python code, the source doesn't need to be installed; you just have to
put it somewhere on your computer.

You need Python 3.

Dependencies
------------

To use the code, you need to install three Python packages:

Django
^^^^^^

We use the Django ORM to abstract away the database and work with the data
through Python classes::

    pip install Django

Postgres
^^^^^^^^

Our remote database runs on PostgreSQL. For Python to communicate with it, we
need a binding::

    pip install psycopg2

treebeard
^^^^^^^^^

The categories in the database are organized in a :doc:`tree structure
`. We use django-treebeard to store them efficiently
in our database.

Install using::

    pip install django-treebeard

Running the example
-------------------

.. _official number: http://www.base.gov.pt/base2/html/pesquisas/contratos.shtml

Once you have the dependencies installed, enter the source directory and run::

    python -m contracts.tools.example

If everything went well, it outputs two numbers:

1. the total number of contracts in the database, which you can check against
   the `official number`_.
2. the sum of the values of all contracts.

If a problem occurs, please open an
`issue <https://github.com/jorgecarleitao/public-contracts/issues>`_
so we can help you and improve these instructions.

From here, see section :doc:`usage` for a tutorial and section
:doc:`api` for the complete documentation.

--------------------------------------------------------------------------------
/law/views_data.py:
--------------------------------------------------------------------------------
import json

from django.http import HttpResponse, Http404

from law.analysis import analysis_manager


def law_types_time_series_json(_):
    data, types_total = analysis_manager.get_analysis('law_types_time_series')

    total_documents = sum(types_total.values())

    # keep only the 17 types with the most documents
    types_time_series = {}
    total_presented = 0
    for type_name in sorted(types_total, key=types_total.get, reverse=True)[:17]:
        count = types_total[type_name]
        types_time_series[type_name] = {'values': [], 'key': type_name,
                                        'total': count}

        total_presented += count

    # create a special entry for "others"
    # we don't translate 'Outros' since no type is translated to English.
    others_time_series = {'values': [], 'key': 'Outros',
                          'total': total_documents - total_presented}

    # populate the values from the analysis
    for entry in data:
        # append a new entry for this time-point
        for type_name in types_time_series:
            types_time_series[type_name]['values'].append(
                {'year': entry['from'].strftime('%Y'),
                 'value': 0})
        others_time_series['values'].append(
            {'year': entry['from'].strftime('%Y'),
             'value': 0})

        # add value to type or "others"
        for type_name in entry['value']:
            if type_name == 'Total':
                continue

            value = entry['value'][type_name]
            if type_name in types_time_series:
                types_time_series[type_name]['values'][-1]['value'] += value
            else:
                others_time_series['values'][-1]['value'] += value

    # sort them by total
    types_time_series = list(types_time_series.values())
    types_time_series = sorted(types_time_series,
                               key=lambda x: x['total'], reverse=True)
    # and finally, append "others" at the end of the list
    types_time_series.append(others_time_series)

    return HttpResponse(json.dumps(types_time_series),
                        content_type="application/json")


AVAILABLE_VIEWS = {
    'law-types-time-series-json': law_types_time_series_json,
}


def analysis_selector(request, analysis_name):
    if analysis_name not in AVAILABLE_VIEWS:
        raise Http404

    return AVAILABLE_VIEWS[analysis_name](request)

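
The selection of the 17 largest types plus a catch-all "Outros" entry in `law_types_time_series_json` can be illustrated on its own. This is a minimal sketch with made-up counts; the helper name `top_n_with_others` and the sample data are hypothetical and not part of the repository:

```python
def top_n_with_others(totals, n, others_key='Outros'):
    """Keep the n largest entries of `totals` and fold the rest into one entry."""
    top = sorted(totals, key=totals.get, reverse=True)[:n]
    result = [{'key': name, 'total': totals[name]} for name in top]
    presented = sum(entry['total'] for entry in result)
    # whatever was not presented individually ends up in the catch-all entry
    result.append({'key': others_key,
                   'total': sum(totals.values()) - presented})
    return result


# hypothetical document counts per type, for illustration only
totals = {'Portaria': 50, 'Decreto-Lei': 30, 'Lei': 15, 'Aviso': 5}
series = top_n_with_others(totals, 2)
```

Appending the catch-all entry last mirrors the view, which sorts the named types by total and only then adds "Outros".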
--------------------------------------------------------------------------------
/contracts/management/commands/populate_contracts.py:
--------------------------------------------------------------------------------
from django.core.management.base import BaseCommand

from contracts.crawler import ContractsCrawler, EntitiesCrawler, TendersCrawler, \
    ContractsStaticDataCrawler
from contracts.categories_crawler import build_categories

from contracts.models import Category, ProcedureType


class Command(BaseCommand):
    help = "Populates the database."

    def add_arguments(self, parser):
        parser.add_argument(
            '--contracts',
            action='store_true',
            help='Populates contracts')

        parser.add_argument(
            '--entities',
            action='store_true',
            help='Populates entities')

        parser.add_argument(
            '--tenders',
            action='store_true',
            help='Populates tenders')

        parser.add_argument(
            '--categories',
            action='store_true',
            help='Populates categories. '
                 'Typically only needs to run once with this option.')

        parser.add_argument(
            '--static',
            action='store_true',
            help='Populates db with static data. '
                 'Typically only needs to run once with this option.')

        parser.add_argument(
            '--bootstrap',
            action='store_true',
            help='Synchronizes the database from scratch. WARNING: may take days.')

    def handle(self, **options):
        # static data and categories are only (re)built when bootstrapping
        # or when the respective table is still empty
        if options['static']:
            if options['bootstrap'] or not ProcedureType.objects.exists():
                ContractsStaticDataCrawler().retrieve_and_save_all()

        if options['categories']:
            if options['bootstrap'] or not Category.objects.exists():
                build_categories()

        if options['entities']:
            crawler = EntitiesCrawler()
            if options['bootstrap']:
                crawler.update(0)
            else:
                crawler.update(-2000)

        if options['contracts']:
            crawler = ContractsCrawler()
            if options['bootstrap']:
                crawler.update(0)
            else:
                crawler.update(-2000)

        if options['tenders']:
            crawler = TendersCrawler()
            if options['bootstrap']:
                crawler.update(0)
            else:
                crawler.update(-2000)

--------------------------------------------------------------------------------
/law/models.py:
--------------------------------------------------------------------------------
import logging

from django.core.urlresolvers import reverse
from django.db import models
from django.utils.text import slugify

# Get an instance of a logger
logger = logging.getLogger(__name__)


class Type(models.Model):
    name = models.CharField(max_length=254)
    dr_series = models.CharField(max_length=10)

    def get_absolute_url(self):
        name = "%s" % slugify(self.name)
        return reverse('law_type', args=[self.pk, name])

    def __unicode__(self):
        return self.name

    class Meta:
        unique_together = ('name', 'dr_series')


class Document(models.Model):
    type = models.ForeignKey(Type)
    number = models.CharField(max_length=20, null=True)

    creator_name = models.TextField()

    date = models.DateField()

    # summary of the publication.
    # Some documents don't have a summary.
    summary = models.TextField(null=True)
    # text of the publication, in HTML.
    # Some documents don't have the text available.
    text = models.TextField(null=True)

    # Where it can be found on the internet (DRE)
    dre_doc_id = models.IntegerField(unique=True, db_index=True)
    dre_pdf_id = models.IntegerField(unique=True, db_index=True)

    # Where it can be found in the official book (DR)
    dr_series = models.CharField(max_length=10, db_index=True)
    dr_number = models.CharField(max_length=10)
    dr_supplement = models.CharField(max_length=50, null=True)
    dr_pages = models.CharField(max_length=50)

    # documents it refers to.
    references = models.ManyToManyField('self', symmetrical=False)

    def get_pdf_url(self):
        return "https://dre.pt/application/file/a/%d" % self.dre_pdf_id

    def get_absolute_url(self):
        name = "%s" % slugify(self.type.name)
        if self.number:
            name += '-%s' % self.number
        else:
            name += '-de-' + self.date.strftime('%d/%m/%Y')
        return reverse('law_view', args=[self.dre_doc_id, name])

    def get_minimal_url(self):
        return reverse('law_view', args=[self.dre_doc_id])

    def compose_summary(self):
        return self.summary

    def name(self):
        name = self.type.name

        if self.number:
            name = name + ' ' + self.number
        return name

    def update_references(self):
        # imported here because `composer` itself imports this module
        from . import composer
        documents = composer.get_references(self).values_list('id', flat=True)
        self.references.add(*list(documents))

--------------------------------------------------------------------------------
/deputies/models.py:
--------------------------------------------------------------------------------
from django.core.urlresolvers import reverse
from django.db import models
from django.db.models import Max


class Deputy(models.Model):
    """
    A deputy.
    """
    official_id = models.IntegerField()
    name = models.CharField(max_length=254)
    birthday = models.DateField(null=True)  # some deputies don't have birthday info.

    party = models.ForeignKey("Party", null=True)  # the current party, updated on legislatures

    is_active = models.BooleanField(default=False)  # if the deputy's last mandate is on the current legislature.

    def __unicode__(self):
        return 'deputy ' + self.name

    def get_absolute_url(self):
        return 'http://www.parlamento.pt/DeputadoGP/Paginas/Biografia.aspx?BID=%d' % self.official_id

    def update(self):
        # mandates are ordered by '-legislature__number' (see Mandate.Meta),
        # so the first one is the most recent
        mandate = self.mandate_set.all()[:1].get()
        latest_number = Legislature.objects.all().aggregate(max=Max("number"))['max']
        self.is_active = (mandate.legislature.number == latest_number)
        self.party = mandate.party
        self.save()

    def get_image_url(self):
        return 'http://app.parlamento.pt/webutils/getimage.aspx?id=%d&type=deputado' % self.official_id

    class Meta:
        ordering = ['name']


class Party(models.Model):
    """
    A party.
    """
    abbrev = models.CharField(max_length=20)

    def get_absolute_url(self):
        return reverse('party_view', args=[self.pk])

    def __unicode__(self):
        return self.abbrev


class Legislature(models.Model):
    """
    A period between two elections.
    """
    number = models.PositiveIntegerField()

    date_start = models.DateField()
    date_end = models.DateField(null=True)  # null if it hasn't finished yet.

    def __unicode__(self):
        # must return a string, not the integer itself
        return '%d' % self.number


class Mandate(models.Model):
    """
    A mandate of a deputy within a legislature.
    It is held on behalf of a party and for a constituency.
    """
    deputy = models.ForeignKey(Deputy)
    legislature = models.ForeignKey(Legislature)
    district = models.ForeignKey('contracts.District', null=True)  # null if outside Portugal
    party = models.ForeignKey(Party)

    # the deputy can enter or leave in the middle of the legislature
    date_start = models.DateField()
    date_end = models.DateField(null=True)  # null if it hasn't finished yet.

    def __unicode__(self):
        return '%s (%s - %s)' % (self.deputy.name, self.legislature, self.party.abbrev)

    class Meta:
        ordering = ['-legislature__number']

--------------------------------------------------------------------------------
/law/test/test_crawler.py:
--------------------------------------------------------------------------------
from django.test import TestCase

from pt_law_downloader import get_publication, get_document
from law.crawler import save_publication, save_document
from law.crawler_forms import DocumentForm
from law.models import Type, Document


class TestCrawler(TestCase):

    def test_form(self):
        """
        Notice that `type` slightly changes, since we map it to the correct term.
        """
        data = {'type': 'Declaração de Rectificação',
                'number': '1/2000',
                'creator_name': 'Min',
                'date': '2013-02-01',
                'summary': 'adda',
                'text': 'text',
                'dre_doc_id': 2,
                'dre_pdf_id': 2,
                'dr_series': 'I',
                'dr_number': '3',
                'dr_supplement': None,
                'dr_pages': '1-2'
                }

        form = DocumentForm(data)

        self.assertTrue(form.is_valid())

        self.assertEqual(1, Type.objects.count())

        self.assertEqual(1, Type.objects.filter(name='Declaração de Retificação',
                                                dr_series='I').count())

    def test_valid_insertion(self):
        pub1 = get_publication(544590)
        pub2 = get_publication(67040491)
        doc = {'series': 'I', 'supplement': None, 'number': '1/2000', 'dre_id': 1}
        save_publication(pub1, doc)
        save_publication(pub2, doc)
        self.assertEqual(2, Document.objects.count())
        self.assertEqual(1, Type.objects.filter(name='Portaria', dr_series='I').count())

        pub = Document.objects.get(dre_doc_id=67040491)

        self.assertEqual(1, pub.references.count())

    def test_invalid_insertion(self):
        pub = get_publication(544590)
        # a series name this long exceeds the field's max_length
        doc = {'series': 'I'*30, 'supplement': None, 'number': '1', 'dre_id': 1}
        with self.assertRaises(ValueError):
            save_publication(pub, doc)

    def test_638275(self):
        """
        This is an edge case, thus we test it explicitly.
        """
        pub = get_publication(638275)
        doc = {'series': 'I', 'supplement': None, 'number': '1', 'dre_id': 1}
        save_publication(pub, doc)

        pub = Document.objects.get(dre_doc_id=638275)

        self.assertEqual('4/93', pub.number)

    def test_save_doc(self):
        save_document(get_document(67142145))

    def test_empty_doc(self):
        """
        Test that empty publications are ignored.
        """
        doc = {'publications': [{'dre_id': None}, {'dre_id': None}]}
        save_document(doc)
        self.assertEqual(0, Type.objects.count())
        self.assertEqual(0, Document.objects.count())

--------------------------------------------------------------------------------
/deputies/migrations/0001_initial.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models, migrations


class Migration(migrations.Migration):

    dependencies = [
        ('contracts', '0001_initial'),
    ]

    operations = [
        migrations.CreateModel(
            name='Deputy',
            fields=[
                ('id', models.AutoField(primary_key=True, auto_created=True, verbose_name='ID', serialize=False)),
                ('official_id', models.IntegerField()),
                ('name', models.CharField(max_length=254)),
                ('birthday', models.DateField(null=True)),
                ('is_active', models.BooleanField(default=False)),
            ],
            options={
                'ordering': ['name'],
            },
            bases=(models.Model,),
        ),
        migrations.CreateModel(
            name='Legislature',
            fields=[
                ('id', models.AutoField(primary_key=True, auto_created=True, verbose_name='ID', serialize=False)),
                ('number', models.PositiveIntegerField()),
                ('date_start', models.DateField()),
                ('date_end', models.DateField(null=True)),
            ],
            options={
            },
            bases=(models.Model,),
        ),
        migrations.CreateModel(
            name='Mandate',
            fields=[
                ('id', models.AutoField(primary_key=True, auto_created=True, verbose_name='ID', serialize=False)),
                ('date_start', models.DateField()),
                ('date_end', models.DateField(null=True)),
                ('deputy', models.ForeignKey(to='deputies.Deputy')),
                ('district', models.ForeignKey(null=True, to='contracts.District')),
                ('legislature', models.ForeignKey(to='deputies.Legislature')),
            ],
            options={
                'ordering': ['-legislature__number'],
            },
            bases=(models.Model,),
        ),
        migrations.CreateModel(
            name='Party',
            fields=[
                ('id', models.AutoField(primary_key=True, auto_created=True, verbose_name='ID', serialize=False)),
                ('abbrev', models.CharField(max_length=20)),
            ],
            options={
            },
            bases=(models.Model,),
        ),
        migrations.AddField(
            model_name='mandate',
            name='party',
            field=models.ForeignKey(to='deputies.Party'),
            preserve_default=True,
        ),
        migrations.AddField(
            model_name='deputy',
            name='party',
            field=models.ForeignKey(null=True, to='deputies.Party'),
            preserve_default=True,
        ),
    ]

--------------------------------------------------------------------------------
/law/templates/law/home.html:
--------------------------------------------------------------------------------
{% extends "law/base.html" %}
{% load i18n %}
{% load static %}

{% block title %}{% trans "Public Law"%}{% endblock %}
{% block content %}

{% trans "Welcome to Public Law" %}

{% blocktrans trimmed with github_url="https://github.com/jorgecarleitao/public-contracts" dre_url="http://dre.pt/" %}
In this project we use the official database of the portuguese law
to explore some of its properties.
{% endblocktrans %}

{% trans "What can you find here?" %}

{% blocktrans trimmed %}
This section of the website contains all documents published in the Series I
of "Diário da República" - the official publication for portuguese laws.
{% endblocktrans %}

{% blocktrans trimmed %}
This accounts for all laws since 1910, the beginning of the portuguese Republic,
until today: the database is daily synchronized with the official database.
{% endblocktrans %}

{% url 'law_document' 317114 as the_url %}
{% blocktrans trimmed with url=the_url %}
When the law has integral text, we apply a special formatting to improve text's readability, and
allow to refer to specific articles (e.g. see Portaria 117/2014).
{% endblocktrans %}

{% trans "Analysis to the law"%}

{% url 'law_analysis_selector' 3 'law_types_time_series' as the_url %}
{% blocktrans trimmed with url=the_url %}
Besides law, we also present some statistical features of it.
One example is the evolution of the portuguese laws:
{% endblocktrans %}

{% url 'law_data_selector' 'law-types-time-series-json' as the_url %}
{% include "law/analysis/law_types_time_series/graph.html" with url=the_url%}

{% trans "Other information:" %}

{% trans "This project is in continuous deployment; we use Twitter to announce updates." %}

{% blocktrans trimmed with tretas_url="http://dre.tretas.org" %}
We thank the contributors of dre.tretas and in particular to
Helder Guerreiro, for ideas and discussions that greatly improved this project.
{% endblocktrans %}

{% endblock %}

--------------------------------------------------------------------------------
/law/composer.py:
--------------------------------------------------------------------------------
from django.db.models import Q
from django.core.cache import cache, caches, InvalidCacheBackendError

from pt_law_parser import analyse, common_managers, observers, ObserverManager, \
    from_json, html_toc

from law.models import Document, Type


PLURALS = {'Decreto-Lei': ['Decretos-Leis', 'Decretos-Lei'],
           'Lei': ['Leis'],
           'Portaria': ['Portarias']}

SINGULARS = {'Decretos-Leis': 'Decreto-Lei',
             'Decretos-Lei': 'Decreto-Lei',
             'Leis': 'Lei',
             'Portarias': 'Portaria'}


def get_references(document, analysis=None):
    if analysis is None:
        analysis = text_analysis(document)

    query = Q()
    for name, number in analysis.get_doc_refs():
        # references written in the plural are looked up by the singular type name
        type_name = name
        if name in SINGULARS:
            type_name = SINGULARS[name]
        query |= Q(type__name=type_name, number=number)

    return Document.objects.exclude(dr_series='II').filter(query)\
        .exclude(id=document.id).prefetch_related('type')


def _text_analysis(document):
    type_names = list(Type.objects.exclude(name__contains='(')
                      .exclude(dr_series='II').values_list('name', flat=True))
    type_names += [plural for name in type_names if name in PLURALS
                   for plural in PLURALS[name]]

    managers = common_managers + [
        ObserverManager(dict((name, observers.DocumentRefObserver)
                             for name in type_names))]

    terms = {' ', '.', ',', '\n', 'n.os', '«', '»'}
    for manager in managers:
        terms |= manager.terms

    analysis = analyse(document.text, managers, terms)

    docs = get_references(document, analysis)

    # map each (type name, number) pair, singular and plural, to the document's url
    mapping = {}
    for doc in docs:
        type_name = doc.type.name
        if doc.type.name in PLURALS:
            for plural in PLURALS[doc.type.name]:
                mapping[(plural, doc.number)] = doc.get_absolute_url()
        mapping[(type_name, doc.number)] = doc.get_absolute_url()

    analysis.set_doc_refs(mapping)

    return analysis


def text_analysis(document):
    """
    Cached version of `_text_analysis`. Uses cache `law_analysis` to store
    the result.
    """
    # short-circuit if no caching present
    try:
        law_cache = caches['law_analysis']
    except InvalidCacheBackendError:
        return _text_analysis(document)

    key = 'text_analysis>%d' % document.dre_doc_id
    result = law_cache.get(key)
    if result is None:
        result = _text_analysis(document)
        law_cache.set(key, result.as_json())
    else:
        result = from_json(result)

    return result


def compose_all(document):
    # uses the default cache, not `law_analysis`
    key = 'compose_all>%d>%d' % (1, document.dre_doc_id)
    result = cache.get(key)
    if result is None:
        result = text_analysis(document)
        result = (result.as_html(), html_toc(result).as_html())
        cache.set(key, result)
    return result

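
`get_references` maps plural type names back to their singular form before querying. The following is a minimal, database-free sketch of that normalization step; `normalize_refs` and the sample references are hypothetical, and the real code builds a `Q` query rather than a set:

```python
# same mapping as in composer.py
SINGULARS = {'Decretos-Leis': 'Decreto-Lei',
             'Decretos-Lei': 'Decreto-Lei',
             'Leis': 'Lei',
             'Portarias': 'Portaria'}


def normalize_refs(doc_refs):
    """Return the distinct (type name, number) pairs with singular type names."""
    return {(SINGULARS.get(name, name), number) for name, number in doc_refs}


# hypothetical references, as (name, number) pairs found in a text
refs = [('Decretos-Leis', '1/2000'),
        ('Decreto-Lei', '1/2000'),
        ('Portarias', '117/2014'),
        ('Aviso', '3/99')]
normalized = normalize_refs(refs)
```

Names without an entry in `SINGULARS` pass through unchanged, which matches the `if name in SINGULARS` check in `get_references`.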
--------------------------------------------------------------------------------
/main/templates/base/top_header.html:
--------------------------------------------------------------------------------
{% load i18n %}
{% load static %}

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
[](https://travis-ci.org/jorgecarleitao/public-contracts)
[](https://coveralls.io/github/jorgecarleitao/public-contracts?branch=master)

# Publicos.pt

Publicos.pt is an open source Django website and API to query and analyse data of
the Portuguese state. Thanks for checking it out.

## The problem we aim to solve

This website aims to:

1. Provide a consistent way to query Portuguese public data using the Django ORM
2. Interrelate different public data
3. Extract and present statistical features of the data

It consists of four components:

1. A set of crawlers and validators that retrieve information from official databases and store it in Django models
2. A set of Django models (in Django apps) to store the data in a database
3. A database with public read access, so anyone can query it by git-cloning this code
4. A [website](http://publicos.pt) that uses the above to present some statistical features of the databases

## Scope

We focus on three aspects of the Portuguese state:

1. Public Contracts: **contracts** between **entities** with a **value** and other fields.
2. Members of the Parliament: **Persons** that have **mandates** in **legislatures** of the parliament.
3. Laws: **documents** that are officially published as laws.

## The code

We use the [Django](https://www.djangoproject.com/) ORM for the API and database
and d3.js for visualisations of statistical quantities of the database.
The official website, also hosted here, is written in English and translated to Portuguese (via i18n).

The code is licensed under BSD.

## Documentation

The API and the crawler are [documented](http://public-contracts.readthedocs.org/en/latest/).

## Pre-requisites and installation

The installation depends on what you want to do:

### Access and query the database

1- Install Django, psycopg2 and [django-treebeard](https://github.com/tabo/django-treebeard):

`pip install -r api_requirements.txt`

2- Download the source:

`git clone git@github.com:jorgecarleitao/public-contracts.git`

3- Enter the project's directory:

`cd public-contracts`

4- Run the example:

`python -m contracts.tools.example`

You should see two numbers. You now have full access to the database.
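
To query the database from a script of your own that lives outside the cloned directory, the project first has to be importable. Here is a minimal sketch that mirrors the path handling in `main/tools/set_up.py`; the helper name `add_project_to_path` and the clone location are hypothetical:

```python
import os
import sys


def add_project_to_path(project_dir):
    """Put `project_dir` on sys.path (only once) so its packages can be imported."""
    path = os.path.abspath(project_dir)
    if path not in sys.path:
        sys.path.insert(1, path)
    return path


# hypothetical location of the clone; adjust to where you ran `git clone`
project_path = add_project_to_path(os.path.join(os.getcwd(), 'public-contracts'))
```

After this, setting `DJANGO_SETTINGS_MODULE` and calling `django.setup()`, as `set_up_django_environment` does, is still needed before importing the models.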

### Deploy the website locally

1- Install the requirements:

`pip install -r website_requirements.txt`

2- Start the server:

`python manage.py runserver`

3- Open `127.0.0.1:8000` in your browser.

You should see the website as it is at [http://publicos.pt](http://publicos.pt).
Some plots will not be displayed right away because they take some time to be
computed, but `127.0.0.1:8000/contratos` should show the latest contracts.

### Crawl the official sources

To use the crawlers, you need to install two extra dependencies: [requests](http://docs.python-requests.org/en/latest/)
and [pt-law-downloader](https://github.com/publicos-pt/pt_law_downloader):

`pip install -r production_requirements.txt`

If something goes wrong, please report an [issue](https://github.com/jorgecarleitao/public-contracts/issues)
so we can help you and improve these instructions.

--------------------------------------------------------------------------------
/law/templates/law/document_view/main.html:
--------------------------------------------------------------------------------
1 | {% extends "law/base.html" %}
2 | {% load i18n %}
3 | {% block title %}{{ law.type.name }} {% if law.number %}{{ law.number }}{% endif %}{% endblock %}
4 | {% block content %}
5 |
6 |