Open data

Does any law require D.C. agencies to open their data?

Not yet. Some cities have such laws (New York City and San Francisco, among others). Federal agencies also took steps to open data under Office of Management and Budget direction during the Obama administration. Open federal data has since become a legal mandate through the Open, Public, Electronic, and Necessary (OPEN) Government Data Act, included as Title II of the Foundations for Evidence-Based Policymaking Act (H.R. 4174) effective in January 2020.

Assuring openness by law in the District has been under discussion among officials and advocates. The D.C. Open Government Coalition worked with others to develop open data provisions included in the Strengthening Government Transparency Amendment Act introduced in the D.C. Council in March 2017 (Bill B22-0188) but never considered.

Meanwhile D.C. government data are open by policy set in a directive of the D.C. mayor.

So what is the open data policy of the D.C. government?

Mayor Muriel Bowser issued Order 2017-115 in April 2017 for the first time declaring that

“the greatest value from the District’s investment in data can only be realized when enterprise datasets are freely shared among District agencies, with federal and regional governments, and with the public to the fullest extent consistent with safety, privacy, and security. “Shared” means that enterprise datasets shall be:

1. Open by default, meaning their existence will be publicly acknowledged, and further, if enterprise datasets are not shared, an explanation for restricting access will be publicly provided;
2. Published online and made available to all at no cost;
3. Discoverable and accessible;
4. Documented;
5. As complete as can be shared;
6. Timely;
7. Unencumbered by license restrictions; and
8. Available in common, non-proprietary, machine-readable formats that promote analysis and reuse.”

What progress has been made locating and opening data?

Covered D.C. government agencies must inventory their data and make much of it public. The citywide list is to be updated annually and published in March to coincide with Sunshine Week. The annual report of the Chief Data Officer, also issued in March, gives further details.

The March 2019 report says that as of December 10, 2018, 75 agencies recorded 1,779 enterprise datasets. The 75 agencies include most mayoral agencies (64 of 69) but fewer independent agencies (11 of 33). (The nonresponding agencies are not identified.) Unlike mayoral agencies, independent agencies such as housing and transit authorities governed by separate boards and commissions are not bound by mayoral orders but were “strongly encouraged to voluntarily comply.”

Details on all 1,779 datasets are available in various formats.

Agencies next were required to classify each dataset by sensitivity – using a five-level system from zero to four. Agencies reported they hold 793 datasets assessed as Level 0 in sensitivity, defined as “open,” able to be proactively released to the public.

Release is apparently a work in progress: about one quarter (216 of 793) are not yet posted. The 577 released are on the D.C. Open Data Portal.

The staff responsible for the open data program maintains a site for discussion of requests and problems at GitHub.

What data are not yet public?

For the 216 of the 793 non-sensitive Level 0 datasets not yet released, the annual report offers no explanation. It does promise that there will be new work “to prioritize posting these remaining Level 0 datasets.”

Those not released are listed in a special table on the open data website, at least up to March 2018. According to the report, the Office of the City Administrator will be tracking agency performance in publishing the Level 0 datasets.

The remaining 986 datasets are defined by policy as not publicly available. They fall into four levels of increasing sensitivity:

Sensitivity level | Definition | Number of datasets (as of December 10, 2018)
1 | Public but not proactively released because of concerns over safety, privacy, security or legal issues | 144
2 | Not highly sensitive but subject to a FOIA exemption and nonpublic, for internal government use only | 188
3 | Confidential, protected by law, especially education and health records | 567
4 | Restricted confidential, release could cause significant damage or injury to persons or impair agency ability to do its work | 87
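
The inventory figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up for this example, and the numbers are the ones cited from the March 2019 report, not pulled from any District data source.

```python
# Figures as quoted in this article from the March 2019 report
# (inventory as of December 10, 2018). Names are illustrative only.
TOTAL_DATASETS = 1779
LEVEL_0_TOTAL = 793
LEVEL_0_RELEASED = 577
RESTRICTED_BY_LEVEL = {1: 144, 2: 188, 3: 567, 4: 87}

restricted_total = sum(RESTRICTED_BY_LEVEL.values())
level_0_unreleased = LEVEL_0_TOTAL - LEVEL_0_RELEASED

# The reported counts should add up to the reported total.
assert LEVEL_0_TOTAL + restricted_total == TOTAL_DATASETS


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(f"Levels 1-4 (not publicly available): {restricted_total}")
print(f"Level 0 not yet released: {level_0_unreleased} "
      f"({share(level_0_unreleased, LEVEL_0_TOTAL)}% of Level 0)")
print(f"Level 0 share of all datasets: {share(LEVEL_0_TOTAL, TOTAL_DATASETS)}%")
print(f"Released share of all datasets: {share(LEVEL_0_RELEASED, TOTAL_DATASETS)}%")
```

On these figures the four restricted levels sum to 986, and the 216 unreleased Level 0 datasets come to a little over a quarter of the 793 classified as open.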

Agencies decide the classifications. A handbook gives a blizzard of details on data elements but very limited elaboration on the categories. The annual report does not discuss accountability and quality control. As with any secrecy scheme, over-classification by cautious officials is a constant threat. The public will be asking whether the Chief Data Officer reviews agency dataset classification decisions to check validity and reliability — that is, whether they accurately reflect the data and correctly apply the level definitions. See below for some early evidence of misclassification drawn from instances of FOIA release of nonpublished datasets. Also of interest is whether the classifications are applied consistently across agencies. For example, again from FOIA experience, the Coalition believes the Metropolitan Police Department more than other agencies withholds records incorrectly based on an overbroad definition of invasion of privacy.

If data have been overlooked or incorrectly classified, what can the public do?

The mayor’s 2017 order denies that it creates any new rights or that the order may be enforced in court.

It provides no administrative procedure (within the government) for users to appeal decisions affecting open data.

It does say the Chief Data Officer is responsible for “receiving and responding to public input regarding the District’s data policy and activities.” But agency data officials responsible for their own inventory and classification have no assigned responsibility to the public. Accountability is only through the normal chain of command from elected officials to appointed executives and career staff.

The D.C. Open Government Coalition is interested to hear from data users who find problems with the inventory or classification and have not been able to resolve them with government officials.

Can I still use D.C. open records law to get D.C. data?   

Yes. Data have always been considered electronic records available by request under the D.C. Freedom of Information Act and subject to the act’s deadlines, exemptions, fees, appeal process and ultimately the right to sue to challenge agency errors. The open data policy does not change that.  

The mayor’s 2017 order recognized useful interplay between FOIA and the new proactive data publication system being established:

“On the one hand, FOIA request-tracking data should inform public bodies about the demand for and priority of publishing certain datasets or derivatives of those datasets as Level 0, Open. Similarly, successful appeals for datasets previously denied under FOIA exemptions can inform public bodies about potential errors in dataset classification. On the other hand, publication of FOIA request-tracking data can help residents hold public bodies accountable for the timely and consistent processing of requests.”

For data not published on the data portal, the order thus recognized that a FOIA request is in essence a test of the mayor’s classification system, which has placed almost 1,000 datasets off-limits in Levels 1-4.

The 2019 annual report notes ten examples from 2018 in which data not available on the data portal at opendata.dc.gov had to be released to requesters under FOIA. These included data on police arrests, dockless bikes and scooters, surveys of charter school facilities, health inspections, health professional licenses, moving violations by city vehicles, vacant property determinations, and more.

Tracking FOIA requests for analysis is hindered by a fact disclosed for the first time in the 2019 annual report: years after the public online request portal was implemented, it is still not fully used. It accepts requests for only 55 D.C. agencies, and a quarter of requesters avoid it. Only 76 percent of the 10,450 FOIA requests in fiscal year 2018 were captured in the system database.
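
To put the percentage in terms of requests, here is a rough estimate using only the two numbers quoted above. The variable names are illustrative, and because the 76 percent figure is presumably rounded, the resulting counts are approximations rather than numbers taken from the report.

```python
# Figures quoted above from the 2019 annual report; names are illustrative.
FOIA_REQUESTS_FY2018 = 10450
CAPTURED_PERCENT = 76  # presumably rounded, so the results are estimates

# Estimated number of requests recorded in the portal database.
captured_estimate = round(FOIA_REQUESTS_FY2018 * CAPTURED_PERCENT / 100)

# Estimated number of requests handled outside the portal.
outside_estimate = FOIA_REQUESTS_FY2018 - captured_estimate

print(f"Captured in the portal database: about {captured_estimate:,}")
print(f"Handled outside the portal: about {outside_estimate:,}")
```

By this estimate roughly 2,500 requests in fiscal year 2018 left no trace in the tracking database, which limits what any analysis of request patterns can show.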

The portal is proprietary software known by the trade name FOIAXpress, developed for federal agencies and purchased from its developer AINS. The D.C. Open Government Coalition called for a review of the system in testimony on behalf of dissatisfied public users at an oversight hearing of the D.C. Council in February 2019.

The FOIA statute requires several reports each year on request processing, appeals and litigation, submitted by the mayor and attorney general to the D.C. Council. They are in the form of PDF pages, archived on the website of the Office of the Secretary.

As the policy called for, the 2019 open data annual report includes a goal to “begin publishing the District’s Annual FOIA Reports as open data in addition to PDF format so that the reports can be more easily analyzed.” The D.C. Open Government Coalition has for years called for improved FOIA reporting.

Are D.C. data available in any other way?

Yes. In addition to the published data on the D.C. government Open Data Portal discussed above, and requests to individual agencies via D.C. FOIA, others make data available.

The local brigade of the national nonprofit Code for America has used structured datasets from many sources in the District in projects in recent years. They have also tested data availability by submitting FOIA requests. The Code for D.C. website includes links to over 500 archived datasets from past projects.

For an example of creative use of open data from Washington Metropolitan Area Transit Authority sources, explained in detailed methodology notes, see the May 2019 report by MetroHero, Metrobus Report Card: Grading the Performance of Buses in the D.C. Priority Corridor.