Making European Tender Data Explorable for Non-Experts (archived monorepo) https://tedective.org
This repository has been archived on 2023-08-14. You can view files and clone it, but cannot push or open issues or pull requests.
Latest commit 5fd5779f0e by Linus Sehn: Sign drone.yml (2023-08-14 17:01:28 +02:00); continuous-integration/drone/push build is failing
.reuse Add LICENSES 2022-10-20 08:52:55 +02:00
.vscode fix: reuse compilance 2023-07-21 00:40:40 +02:00
LICENSES Add Apache-2.0 license text 2023-07-05 16:10:19 +02:00
api update: commented out the not working test 2023-08-14 15:15:29 +02:00
ui Account for organization role 2023-04-05 11:29:43 +02:00
website Remove links 2023-08-14 16:58:45 +02:00
.drone.yml Sign drone.yml 2023-08-14 17:01:28 +02:00
.gitignore Update .gitignore 2023-03-30 15:00:37 +02:00
.pre-commit-config.yaml fix: reuse compilance 2023-07-21 00:40:40 +02:00
Caddyfile.tdb feature: api can now be deployed on development server 2023-07-25 18:28:29 +02:00
Caddyfile.tdb.license update: first draft of graph endpoints tests 2023-08-07 13:57:12 +02:00
Makefile update: first draft of graph endpoints tests 2023-08-07 13:57:12 +02:00
README.md update: documentation and adjusted compose files 2023-08-07 11:03:59 +02:00
docker-compose.production.yml Disable most components 2023-08-14 16:54:10 +02:00
docker-compose.tdb.yml update: compose file 2023-08-14 09:52:41 +02:00
logo.png Crop logo 2023-02-01 17:30:43 +01:00
logo.png.license Some README changes 2023-02-01 17:29:41 +01:00

README.md


Please note: This project is in an alpha stage and many things are subject to change. But we are getting there. Expect a workable beta-stage prototype towards the end of Q2 2023.


What is TEDective?

tl;dr TEDective helps you find out who buys what from whom in the EU.

A related aim is to bring the XML files published by the European Union into shape by transforming them to the Open Contracting Data Standard (OCDS).

Despite a range of previous efforts to parse and analyse TED (Tenders Electronic Daily) data, there is currently no offering that fulfils all of the following requirements with regard to the provision of TED data:

  1. It is built and published under a free software license.
  2. It offers a current, cleaned and deduplicated version of TED data.
  3. Data is available as OCDS-compliant JSON, both as bulk downloads and via a capable and well-documented API.
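
To make the second requirement more concrete: the same organization can appear under slightly different spellings across notices. The following is a minimal sketch of name-based deduplication using only the Python standard library. The normalization rules and the `normalise_name` and `deduplicate` helpers are assumptions made for illustration; they are not TEDective's actual cleaning logic.

```python
import re
import unicodedata


def normalise_name(name: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def deduplicate(names: list[str]) -> dict[str, list[str]]:
    """Group raw spellings under their normalised key (hypothetical helper)."""
    groups: dict[str, list[str]] = {}
    for name in names:
        groups.setdefault(normalise_name(name), []).append(name)
    return groups


# Made-up sample names, not taken from real TED notices.
raw = ["Stadt München", "STADT  MUNCHEN", "Stadt München.", "Ville de Paris"]
groups = deduplicate(raw)
```

Real-world deduplication would likely also need identifiers such as national registration numbers, since names alone can collide or diverge.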

Sustainably providing long-term access to European tender data in a way that fulfils these three requirements would enable numerous applications of interest to civil society, business and government, and could greatly enhance the transparency and accountability of European business activity. A range of interesting questions could be answered with this data if it were available in a well-documented, easy-to-understand format that is interoperable with tender data published elsewhere.

What's inside

TED XML notices are downloaded and parsed into PostgreSQL (and possibly later a graph database) using the amazing SQLModel. An API built with FastAPI sits in front of this database and provides streamlined access to OCDS entities, such as organizations, releases or contracts. The code that does the parsing and serves the API can be found in the ./api folder.
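
As a rough illustration of the parsing step, the sketch below maps a drastically simplified, made-up XML notice onto an OCDS-shaped release using only the Python standard library. The element names (`NOTICE`, `BUYER_NAME`, `TITLE`, `WINNER_NAME`, `VALUE`), the `ocds-example-` prefix and the `notice_to_release` helper are assumptions for this example. Real TED schemas are far more involved, and the actual parser in ./api uses SQLModel rather than plain dictionaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified notice; real TED XML looks different.
SAMPLE_NOTICE = """
<NOTICE id="2023-000001">
  <BUYER_NAME>City of Example</BUYER_NAME>
  <TITLE>Office furniture</TITLE>
  <WINNER_NAME>Chairs Ltd</WINNER_NAME>
  <VALUE currency="EUR">12000.50</VALUE>
</NOTICE>
"""


def notice_to_release(xml_text: str) -> dict:
    """Map the simplified notice onto a few OCDS release fields."""
    root = ET.fromstring(xml_text.strip())
    notice_id = root.attrib["id"]
    buyer = root.findtext("BUYER_NAME")
    supplier = root.findtext("WINNER_NAME")
    value = root.find("VALUE")

    return {
        "ocid": f"ocds-example-{notice_id}",  # placeholder prefix, not a registered one
        "id": notice_id,
        "tag": ["award"],
        "parties": [
            {"id": "buyer-1", "name": buyer, "roles": ["buyer"]},
            {"id": "supplier-1", "name": supplier, "roles": ["supplier"]},
        ],
        "buyer": {"id": "buyer-1", "name": buyer},
        "tender": {"id": notice_id, "title": root.findtext("TITLE")},
        "awards": [
            {
                "id": f"{notice_id}-award-1",
                "suppliers": [{"id": "supplier-1", "name": supplier}],
                "value": {
                    "amount": float(value.text),
                    "currency": value.attrib["currency"],
                },
            }
        ],
    }


release = notice_to_release(SAMPLE_NOTICE)
```

The resulting dictionary only covers a handful of OCDS fields; a complete release would need to follow the OCDS schema and the European Union profile listed under Resources.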

The second component of this monorepo is the React-based UI, built with NextJS, Chakra UI and react-force-graph. All the relevant code for this component lives in the ./ui folder.
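
The "who buys what from whom" question lends itself to a graph view. react-force-graph takes a `graphData` object with `nodes` and `links` arrays, and the sketch below shows one way such a structure could be derived from OCDS-style releases. It is written in Python purely for illustration; the `releases_to_graph` helper and the sample data are hypothetical and not taken from the ./ui or ./api code.

```python
import json


def releases_to_graph(releases: list[dict]) -> dict:
    """Build a {nodes, links} structure: one link per buyer/supplier pair."""
    nodes: dict[str, dict] = {}
    links: list[dict] = []
    for release in releases:
        buyer = release["buyer"]
        nodes.setdefault(buyer["id"], {"id": buyer["id"], "name": buyer["name"]})
        for award in release.get("awards", []):
            for supplier in award.get("suppliers", []):
                nodes.setdefault(
                    supplier["id"], {"id": supplier["id"], "name": supplier["name"]}
                )
                links.append({"source": buyer["id"], "target": supplier["id"]})
    return {"nodes": list(nodes.values()), "links": links}


# Made-up sample data in the shape of OCDS releases.
sample_releases = [
    {
        "buyer": {"id": "org-1", "name": "City of Example"},
        "awards": [{"suppliers": [{"id": "org-2", "name": "Chairs Ltd"}]}],
    },
    {
        "buyer": {"id": "org-1", "name": "City of Example"},
        "awards": [{"suppliers": [{"id": "org-3", "name": "Desks GmbH"}]}],
    },
]

graph = releases_to_graph(sample_releases)
print(json.dumps(graph, indent=2))
```

Keeping nodes in a dictionary keyed by organization id means a buyer that appears in many releases still produces a single node, while each award adds its own link.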

Finally, there is the project landing page and documentation built with Docusaurus. The relevant code lives in the ./website folder.

Development

To start developing locally, please consult the documentation or simply run:

make help

Deployment

The production deployment is handled via the .drone.yml which simply executes the relevant docker-compose files, which currently are:

Apart from those, the API and database may be deployed locally with help of relevant docker-compose files:

Resources

There are a lot of resources that are helpful in developing and extending the parser. The most important ones are:

  1. The OCDS Schema reference, which describes in detail what an OCDS release is and what it should look like.
  2. The OCDS European Union Profile, which maps fields in TED notices to fields in an OCDS release (GitHub, Internal Tooling).
  3. The OJS' Standard form for public procurement site, which lists all the forms used in any given TED notice.

Previous efforts

  • TheyBuyForYou (a project by "a consortium of 10 leading companies, universities, research centres, government departments and local authorities in the UK, Norway, Italy, Spain and Slovenia" funded by the EU Horizon 2020 programme. The project cost the EU around €3.3 million and was developed over two years until December 2020. It is now largely dysfunctional and out-of-date. Some code seems to be publicly available but is provided without an explicit license)

  • DigiWhist's opentender.eu (seems somewhat abandoned, although the repo is still lightly maintained. Data is updated less than once a month and the frontend code is not open source. One of the DigiWhist researchers founded TenderX, a private for-profit tender/company data offering)

    NB: This dataset seems to be used by the OCDS tool for scraping globally available OCDS data releases.

    NB: There was apparently some collaboration between the OKFN and the lead researcher now behind TenderX (Mihály Fazekas, here is his personal website).

  • OpenTED (seems abandoned, last commit 2015; didn't work with OCDS as it wasn't developed at the time)

  • opented (very old attempt at parsing TED data that didn't turn out to work)

  • OpenTED Browser (an academic paper about )

  • ExtracTED (according to the README, this was used to parse data between 2014 and 2016; last commit 5 years ago)

  • eu-hack (last commit 15 months ago; the author is a data scientist at Amazon and the target format is CSV; I could not run his code and achieve an error-free parsing of more recent TED data)