January 2018 Policy Updates from the NYC Open Data team
In 2017, the Open Data program enjoyed the spotlight at three City Council hearings as lawmakers, advocates, and the de Blasio administration worked together to craft new legislation to sustain the Open Data program in perpetuity. In addition, the Open Data program implemented new data quality and documentation policies to comply with previous amendments to the Open Data Law. Below is a summary of these updates.
Improvements to Data Quality
2017 marked the beginning of a holistic “clean-up” of the Open Data Portal’s dataset inventory. To create a better user experience, we have begun to remove certain datasets, improve the search function, standardize geospatial fields across datasets, and document each dataset’s metadata in data dictionaries.
Dataset Removal and Improvements to Search
Our new dataset removal policy applies to data that does not qualify as a “public dataset” according to the Open Data Law.
A dataset will be removed from public access when the agency owner and the NYC Open Data team agree that it does not legally qualify under the law. The dataset will then be listed in the public “Dataset Removals” dataset, which contains its name, agency, hyperlink, and reason for removal. A copy of the dataset will be retained for three months after it is removed from public view, after which point the data will be permanently archived. During this three-month “grace” period, a user may object to the removal of a dataset by contacting the Open Data team at opendata.cityofnewyork.com/engage. We will consult with data owners and public records officers at the relevant agency before making a final determination on removing the dataset.
In general, datasets considered for removal are infrequently accessed by public users, not regularly updated, and not actively maintained by the agency data owner. Users have complained that these datasets “clutter” the catalog. Their removal will make it easier to search for and find relevant, high-quality datasets.
In addition, beginning this spring, user-created “Community” views will be removed from the search function on the Open Data catalog. Community views are created when a user filters or visualizes an “Official” dataset from a City agency and saves it to their NYC Open Data account. Note that your existing community filters will still be accessible at the same links -- they will just not appear when using the platform’s search function.
Geocoding Street Addresses
To make it easier to compare or combine datasets with geospatial fields, we standardized the geospatial attributes associated with street addresses. Every dataset containing a street address is now required to also include fields for its latitude and longitude, neighborhood details, political districts, and other attributes. Additional details on this standard can be found in section 4.1.1.2 of the Technical Standards Manual.
DoITT leveraged the Department of City Planning’s Geosupport geocoding tool, which pairs columns containing street address data with the data attributes required by the new standard. Users may also use Geosupport to geocode datasets themselves through the Geoclient API. Geoclient is a RESTful web service interface to the Geosupport system developed by DoITT’s GIS/Mapping unit.
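As a rough illustration of what geocoding an address through a RESTful service like Geoclient might look like, the Python sketch below only builds a request URL and sends nothing over the network. The endpoint path, the parameter names (houseNumber, street, borough, app_id, app_key), and the helper name build_address_query are assumptions for illustration, not details taken from this post; check the current Geoclient documentation before relying on any of them.

```python
from urllib.parse import urlencode

# Assumed endpoint for address lookups; confirm against the Geoclient docs.
GEOCLIENT_ADDRESS_URL = "https://api.cityofnewyork.us/geoclient/v1/address.json"


def build_address_query(house_number, street, borough, app_id, app_key):
    """Return a URL for a single-address lookup (hypothetical helper).

    All parameter names below are assumptions about the service's
    query-string interface, not confirmed by the source text.
    """
    params = {
        "houseNumber": house_number,
        "street": street,
        "borough": borough,
        "app_id": app_id,    # placeholder credential
        "app_key": app_key,  # placeholder credential
    }
    # urlencode percent-encodes values and joins them with "&".
    return f"{GEOCLIENT_ADDRESS_URL}?{urlencode(params)}"


url = build_address_query("1", "Centre Street", "Manhattan", "YOUR_APP_ID", "YOUR_APP_KEY")
print(url)
```

A response from such a service would be expected to carry the latitude, longitude, and district fields that the standard calls for, which could then be joined back onto the original address column.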
A record of datasets eligible for the geocoding standard, along with datasets that have already been geocoded, is maintained in the 2017 NYC Open Data Plan - Address Standardization dataset. Most eligible datasets have already been standardized. Currently, DoITT is adding geocoding to the “automation” workflows for datasets that are automatically updated.
Data Dictionaries
After a thorough effort last year to document definitions of data fields for all datasets, most datasets now have data dictionaries. The data dictionary not only provides definitions of data attributes but also gives context on how and why the data is collected. You can track which data dictionaries are still in progress on the Data Dictionary Compliance Public Assets dataset. If you find a data dictionary that could use further clarification, tell us about it. We will follow up with the agency and let them know.
New Legislation
In December, Local Law 244 of 2017 and Local Law 251 of 2017 became law, extending the duration of the Open Data Law and creating new annual reporting requirements. We would like to thank the Committee on Technology and Chair Vacca for their unwavering support of the Open Data initiative. The Open Data policy that was born from the Law is unparalleled among American municipalities, and the amendments the Committee has passed over the last three years will ensure that the program thrives into the future.
Extension of Open Data Mandate
The Open Data Law requires City agencies to publish all public datasets by December 31, 2018. New legislation requires datasets created after this deadline to be published as well, extending the Open Data mandate in perpetuity. Agencies that have already identified datasets in their Open Data plans are still required to publish them by the end of this year.
Technical Standards Manual
The Technical Standards Manual is the foundational document for the Open Data program, containing information on technical specifications and policy. New legislation requires that we conduct a thorough review of the document and update it every two years. The next update will begin late this year and include community feedback.
Agency Open Data Coordinator
The head of each City agency is now required to officially name one employee as its Open Data Coordinator (ODC), the agency’s main liaison for data publishing and for responding to public inquiries about that agency’s datasets. While almost every City agency already has an Open Data Coordinator, the addition of ODCs to the administrative code will help ensure the role has sufficient visibility and resources from agency leadership.
Web Portal Site Analytics
Later this year, we will publish information on user traffic, including the number of pageviews and of users who access the portal. This information is already available at data.cityofnewyork.us/analytics, and the Open Data team will make usage data more easily understandable this year.
Annual Open Data Plan
For the past four years, the annual Open Data plan has been published each July 15th. Going forward, the plan will be published every September 15th to align Open Data compliance reporting requirements with other reporting requirements centered around the City’s fiscal year (July 1 - June 30).
Starting this year, the Open Data plan will include comprehensive information on each dataset on the Open Data portal, including the following for each dataset:
1. Scheduled publication date
2. Actual publication date
3. Most recent update date
4. URL
5. Whether it complies with the data retention standard (which mandates that row-level data be maintained on the dataset)
6. Whether it has a data dictionary
7. Whether it meets the geocoding standard, does not meet the geocoding standard, or is ineligible for the geocoding standard
8. Whether updates to the dataset are automated
9. Whether updates to the dataset “feasibly can be automated,” and if not, a reason why
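For readers who plan to work with the Open Data plan programmatically, the nine items above could be modeled roughly as follows. This is a minimal Python sketch: the class name PlanEntry, the field names, the GeocodingStatus values, and the validate rule are all hypothetical and do not reflect the actual schema of the published plan.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class GeocodingStatus(Enum):
    # Three possible outcomes named in item 7 (labels are assumptions).
    MEETS = "meets standard"
    DOES_NOT_MEET = "does not meet standard"
    INELIGIBLE = "ineligible"


@dataclass
class PlanEntry:
    """One dataset's row in the plan (hypothetical field names)."""
    scheduled_publication: Optional[date]   # item 1
    actual_publication: Optional[date]      # item 2
    last_updated: Optional[date]            # item 3
    url: str                                # item 4
    meets_retention_standard: bool          # item 5
    has_data_dictionary: bool               # item 6
    geocoding_status: GeocodingStatus       # item 7
    updates_automated: bool                 # item 8
    automation_feasible: Optional[bool] = None          # item 9
    automation_infeasible_reason: Optional[str] = None  # item 9, "reason why"

    def validate(self) -> List[str]:
        """Return a list of problems; empty means the entry looks complete."""
        problems = []
        # Item 9 asks for a reason whenever automation is not feasible.
        if self.automation_feasible is False and not self.automation_infeasible_reason:
            problems.append("reason required when automation is not feasible")
        return problems


entry = PlanEntry(
    scheduled_publication=date(2018, 12, 31),
    actual_publication=None,
    last_updated=None,
    url="https://example.org/dataset",  # placeholder, not a real dataset URL
    meets_retention_standard=True,
    has_data_dictionary=False,
    geocoding_status=GeocodingStatus.INELIGIBLE,
    updates_automated=False,
    automation_feasible=False,
)
print(entry.validate())
```

Keeping the geocoding outcome as a three-valued enumeration rather than a boolean mirrors the wording of item 7, which distinguishes datasets that fail the standard from those to which it does not apply.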
In addition, agencies will now list the names of datasets that agency records officers use to respond to public records requests. This builds on previous legislation that required agencies to report metrics on datasets used to respond to FOIL requests in their annual compliance plans and helps ensure that Open Data Coordinators work closely with their legal affairs and public records personnel.
MODA Examination and Verification
Each year, the Mayor’s Office of Data Analytics (MODA) is required by Local Law 8 of 2016 to conduct an Open Data Examination and Verification (E&V) of three City agencies. The purpose of the process is twofold: it allows MODA to critically examine three specific City agencies’ data inventories and also holds up a mirror to the NYC Open Data program at large.
In 2016, MODA examined the Department of Sanitation (DSNY), the Department of Correction (DOC), and the Department of Housing Preservation and Development (HPD). On December 1, 2017, MODA released the results of the 2017 E&V cycle, which examined the Department of Environmental Protection (DEP), the Fire Department (FDNY), and the Department of Buildings (DOB). The results of the examination and verification, as well as MODA’s recommendations on how to improve citywide compliance with the Open Data Law, can be found in the “Reports” section of the Open Data site.
The 2018 Examination and Verification report will cover the Business Integrity Commission (BIC), the Department of Small Business Services (SBS), and the Department of Transportation (DOT), and will be released on December 1, 2018.








