Build

This phase is about creating a new IT system, or updating an existing IT system, to implement your mapping and publish OCDS data.

Alternatively, if you don't have the capacity to create or update an IT system, you can consider reusing an existing data collection tool. If you're reusing an existing tool, this phase is about customizing that tool to meet your needs and working out how to combine and publish your data. The Data Support Team can help you to consider options for collecting, combining and publishing data.

As you complete this phase, you can:

Register an OCID prefix

In OCDS, the contracting process identifier (ocid) uniquely identifies a contracting process. As a publisher, you will assign a unique ocid to each contracting process.

To ensure that your ocids do not conflict with those of another publisher, you need to register an ocid prefix.

Action: Email data@open-contracting.org to request an OCID prefix. Provide the name of the publishing organization and the email address of a contact person at this organization.

Resource: To learn more about the ocid and its prefixes, refer to the identifiers reference.

Note

All registered OCID prefixes are accessible as a web page or CSV file.
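
As a minimal sketch of how a prefix is used once registered, the snippet below joins a prefix and an internal process identifier with a hyphen. The prefix `ocds-a1b2c3` and the local identifier are made up for illustration, and `make_ocid` is a hypothetical helper name; refer to the identifiers reference for the authoritative rules.

```python
def make_ocid(prefix: str, local_id: str) -> str:
    """Join a registered OCID prefix and a locally unique process identifier."""
    return f"{prefix}-{local_id}"


# "ocds-a1b2c3" is a made-up prefix for illustration, not a registered one.
ocid = make_ocid("ocds-a1b2c3", "2023-000147")
print(ocid)
```

Whatever internal identifier you choose, it needs to stay the same for a contracting process across all of its releases.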

Determine your system architecture

There are many ways to extract data from data sources, combine it, map it to OCDS, and publish it. The system architectures guidance page describes some possible approaches.

Your choice of architecture can determine how frequently your data is updated, whether you can publish a change history and the access methods available to your users. Remember to check that your chosen architecture meets the needs you identified in the design stage.

Resource: Technical case studies: OCDS implementation insights report provides insights into the technical choices made in OCDS implementations in Paraguay, Zambia, Colombia, Moldova and Argentina's Road Agency Vialidad.

Decide how to combine spreadsheet data

If you aren't creating or updating an IT system, but are instead collecting data from different individuals, departments or agencies using spreadsheets, then this step is about working out how to combine your data into a single file for publication. Combining your data makes it easier for users to analyze the whole dataset.

If you plan to publish your data infrequently, you only have a small number of spreadsheets and your spreadsheets have identical headers, then simply copy-pasting the data into a single file for publication may be the easiest method.

Otherwise, you can consider the following methods:

  • If you're comfortable using a command-line interface, you can use csvkit's in2csv command to convert each sheet of a spreadsheet into a CSV file, and then use its csvstack command to combine sets of CSV files with identical headers into single CSV files.

  • If you're comfortable writing Visual Basic for Applications (VBA) or Google Apps Script code, you can write a macro for Microsoft Excel or Google Sheets to combine your data into a single file.

  • If you're comfortable using spreadsheet formulae, you can use the IMPORTRANGE or QUERY functions in Google Sheets to import data from multiple spreadsheets into a single sheet.

  • If you aren't comfortable with the above methods, you can consider using a spreadsheet add-on for combining data from multiple sheets.
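
If you'd rather script the combining step, the following is a rough sketch of stacking CSV files that share one header, using only the Python standard library. The function name `stack_csv_files` and the file names in the usage note are hypothetical. The sketch assumes the spreadsheets have already been exported to UTF-8 CSV files.

```python
import csv


def stack_csv_files(input_paths, output_path):
    """Append the rows of CSV files that share one header into a single CSV file."""
    header = None
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # The first non-empty file defines the expected header.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                writer.writerows(reader)
```

For example, `stack_csv_files(["agency_a.csv", "agency_b.csv"], "combined.csv")` would write one header row followed by the rows of both files. The function stops with an error if a header differs, rather than silently misaligning columns.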

Establish your publication formats and access methods

OCDS data can be published in different formats and accessed using different methods.

It is best practice to provide data in multiple formats, so that as many users as possible can use the data without first having to transform it to their preferred format. In OCDS terms, this means publishing both structured JSON data and tabular CSV or spreadsheet data.

Where resources allow, it is also best practice to provide multiple access methods for your data, so that both humans and machines can access it easily. In OCDS terms, this means providing both bulk downloads and an API. The OCDS pagination extension describes how to paginate OCDS data via an API.

Remember to check that your chosen publication formats and access methods meet the needs you identified at the design stage.

Tool: Flatten Tool can be used to convert OCDS data between JSON and CSV/spreadsheet formats.
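
To make the idea of a paginated API more concrete, here is a simplified sketch of splitting a list of releases into pages and attaching links to the neighbouring pages. The `links` object with `next` and `prev` members is an assumption about how the pagination extension is applied, and the `page` query parameter, the `paginate` function and the URL are hypothetical; check the pagination extension for the exact field names. The other package metadata is omitted for brevity.

```python
def paginate(releases, page, page_size, base_url):
    """Return one page of releases, with links to the next and previous pages."""
    start = (page - 1) * page_size
    chunk = releases[start:start + page_size]

    links = {}
    if start + page_size < len(releases):
        links["next"] = f"{base_url}?page={page + 1}"
    if page > 1:
        links["prev"] = f"{base_url}?page={page - 1}"

    # Only the parts relevant to pagination are shown; a real package
    # also carries metadata such as the publisher and publication date.
    return {"releases": chunk, "links": links}
```

A client can then follow `links["next"]` until it is absent, without needing to know the total number of pages in advance.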

Build your data pipeline

Having determined your system architecture, it's time to implement it. This is one of the longest steps of implementing OCDS.

Whether your current infrastructure is low tech or high tech, we have tools and resources to help you publish OCDS.

Note

If you have any issues using OCDS tools, contact the Data Support Team.

Depending on your data sources and system architecture, you might be able to reuse some of these OCDS tools:

  • If you are creating (or upgrading) an electronic government procurement (e-GP) system or open contracting data portal, refer to our Guide to Defining OCDS Functional Requirements for e-GP Systems.

  • If your source data is in CSV/Excel files, you can rename the columns to match the JSON paths in OCDS (for example, buyer/name) and then transform the CSV/Excel files to OCDS JSON by using Flatten Tool, a command-line tool.

  • If your source data is in Excel files, you can alternatively transform Excel files to OCDS JSON by using the Open Contracting Explorer, which includes a web interface and web API for users to access and explore the OCDS data. (This tool is authored by Development Gateway.)

  • If your source data is in SQL tables, you can use Kavure'i to transform it to OCDS. To use it, you write SQL queries to extract data from SQL tables, setting the columns for the query results to match the JSON paths in OCDS (for example, buyer/name). The query results are saved to CSV files, which are transformed to OCDS JSON using Flatten Tool. (Kavure'i is authored by Paraguay's Dirección Nacional de Contrataciones Públicas (DNCP).)

  • To make OCDS data available via an API, you can use another component of Kavure'i to load OCDS data into Elasticsearch, and then use Pitogüé to make it available via an API. (Both tools are authored by Paraguay's Dirección Nacional de Contrataciones Públicas (DNCP).)

  • If you intend to publish record packages, OCDS Merge is the best software library for creating OCDS records. If you use the Python programming language, you can use it directly. If not, you can use its test cases to test your implementation of the merge routine, and you can read its commented code as inspiration for your implementation.

  • If you have release packages and want to have record packages, if you have data that follows an older version of OCDS, or if you otherwise need to transform your OCDS data, you can use OCDS Kit as a command-line tool or Python library.
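
To illustrate what naming columns after JSON paths achieves, the sketch below turns CSV rows with headers such as buyer/name into nested objects. It is a deliberately simplified stand-in for the idea, not Flatten Tool's implementation: it doesn't handle arrays, data types or packaging. The function names and sample values are hypothetical.

```python
import csv
import io


def unflatten_row(row):
    """Turn {'buyer/name': 'X'} into {'buyer': {'name': 'X'}}, skipping empty cells."""
    result = {}
    for path, value in row.items():
        if not value:
            continue  # leave out empty cells instead of emitting empty strings
        keys = path.split("/")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


def csv_to_objects(text):
    """Parse CSV text whose headers are slash-separated JSON paths."""
    return [unflatten_row(row) for row in csv.DictReader(io.StringIO(text))]


# Made-up sample data for illustration.
sample = (
    "ocid,buyer/name,tender/title\n"
    "ocds-a1b2c3-001,Ministry of Health,Medical supplies\n"
)
print(csv_to_objects(sample))
```

The same column-naming convention applies whether the tabular data comes from spreadsheets or from the results of SQL queries.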

If you aren't creating or updating an IT system, but are instead reusing an existing data collection tool, you can customize it:

  • The data collection spreadsheet includes instructions describing how to add fields and how to add and reformat sheets.

  • The data collection form includes instructions describing how to add fields and how to customize descriptions and guidance.

Contact the Data Support Team for guidance on customizing a tool to meet your needs.

Resource: Using tabular versions of OCDS to generate JSON data details the approach used in Paraguay.

Resource: To learn about how to create a spreadsheet input template for OCDS, check out our blog series on prototyping OCDS data using spreadsheets (Part 1, Part 2, Part 3).

Note

Re-using tools isn't always easy. Tool Re-Use in Open Contracting: A Primer is a step-by-step guide to help you determine what you need, evaluate which tool is the right fit, and evaluate whether the right conditions are in place for successful re-use of a tool.

Build your extensions

If your mapping identified data elements that don't map to OCDS or to an existing extension, you ought to develop your own extensions. Documenting your additional fields using extensions makes important information about the structure, format and meaning of your data available to users.

Action: Read the guidance on developing new extensions, which includes links to useful tools and resources.

Action: Request assistance from the Data Support Team to model your extensions.

Action: Share your extensions with the OCDS community on GitHub.

Resource: Webinar: Creating OCDS Extensions (presentation)

Keep users in mind as you build

As covered in the Design phase, different users will need information in different ways. Some will need bulk downloads, some will need APIs, and some will need CSV files. Most will need a change history, published on a timely basis, with individual releases and records.

Resource: Guidance on bulk downloads, APIs, individual releases and records, and flattened serializations

Resource: Guidance on JSON and CSV serialization, including packaging files with metadata

Check your data

Throughout the build phase you ought to regularly use the OCDS Data Review Tool to check the structure and format of your data. This ensures that your data is compatible with OCDS tools and is comparable with other OCDS data.

OCDS data needs to be published as part of a release package or a record package. You can use OCDS Kit to reformat your data before submitting it to the review tool, but any data you publish needs to be correctly packaged.
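
As a rough illustration of what packaging means, the sketch below wraps a list of releases in a release package with the package-level metadata that OCDS 1.1 is understood to require (uri, version, publishedDate, publisher and releases). The function name, URL, publisher name and sample release are hypothetical; check the release package schema for the authoritative list of fields.

```python
import json
from datetime import datetime, timezone


def make_release_package(releases, uri, publisher_name, version="1.1"):
    """Wrap releases in a minimal release package."""
    return {
        "uri": uri,
        "version": version,
        # Date and time at which this package was published, in UTC.
        "publishedDate": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "publisher": {"name": publisher_name},
        "releases": releases,
    }


# A made-up release for illustration only.
release = {
    "ocid": "ocds-a1b2c3-001",
    "id": "001-tender-2024-01-15",
    "date": "2024-01-15T09:00:00Z",
    "tag": ["tender"],
    "initiationType": "tender",
}
package = make_release_package(
    [release], "https://example.org/packages/001.json", "Example Agency"
)
print(json.dumps(package, indent=2))
```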

The Data Review Tool reports any structural issues with your data. It checks whether your data makes sense and displays a preview of your data, so that you can check whether the information is appearing in the correct place within the schema.

You ought to use real data for testing, wherever possible. Using fictional data can lead to false positives and missed errors in your data pipeline: for example, if your test data includes incoherent values for the award date and the contract signature date, it won't be possible to identify issues with how these fields are mapped in your OCDS data.

If your data source doesn't contain any data yet (for example, because you are developing a new system to collect and publish data), then you ought to work with stakeholders to collect enough real data to populate all the data elements for at least one contracting process.

If you can't collect enough real data for testing, then you ought to create realistic and coherent test data:

  • use real entities, products, and services

  • use plausible dates and values

  • avoid using placeholder values

  • avoid setting multiple data elements to the same value.
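
The last two points can be partly checked with a script. The sketch below walks through a release and reports fields that hold a placeholder-like value, and values that appear in more than one field. The placeholder list and the function names are assumptions for illustration. Some repeats are legitimate (for example, a currency code), so treat the output as a prompt for manual review.

```python
from collections import defaultdict

# An assumed, non-exhaustive list of placeholder-like values.
PLACEHOLDERS = {"test", "todo", "n/a", "xxx", "lorem ipsum", "string"}


def walk(value, path=""):
    """Yield (path, leaf value) pairs for every leaf in nested JSON-like data."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, f"{path}/{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk(child, f"{path}/{index}")
    else:
        yield path, value


def review_test_data(release):
    """Return (paths holding placeholders, [(value, paths)] for repeated values)."""
    seen = defaultdict(list)
    placeholders = []
    for path, value in walk(release):
        if isinstance(value, str) and value.strip().lower() in PLACEHOLDERS:
            placeholders.append(path)
        # Include the type in the key so that True and 1 are not treated as equal.
        seen[(type(value).__name__, value)].append(path)
    repeated = [(value, paths) for (_, value), paths in seen.items() if len(paths) > 1]
    return placeholders, repeated
```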

Action: Upload some data to the OCDS Data Review Tool.

Action: Request feedback on your draft data from the Data Support Team.

Tool: The jOCDS Validator can be used for bulk checking of the structure and format of OCDS data.

Resource: How to check your OCDS data validates

Next phase: Publish