├── DataFederationFramework.md ├── ISSUE_TEMPLATE ├── ISSUE_TEMPLATE.md └── ISSUE_TEMPLATE_CASE_STUDY.md ├── PreliminaryFindings.md ├── README.md ├── assets ├── 10x Data Federation Phase 4 pitch.pdf ├── Digital.gov Presentation — US Data Federation.pdf ├── Project-Overview-for-Partners-Stakeholders.pdf ├── Project-Overview-with-FNS-Case-Study.pdf ├── US-Data-Federation-Phase-II-Final.pdf └── US-Data-Federation-Project-Intro.pdf ├── summary.csv └── updates ├── README.md ├── update-01132020.md ├── update-01282020.md ├── update-02012019.md ├── update-02122020.md ├── update-02152019.md ├── update-02262020.md ├── update-03042019.md ├── update-03152019.md ├── update-03162020.md ├── update-03252020.md ├── update-04012019.md ├── update-04072020.md ├── update-04152019.md ├── update-04232020.md ├── update-04292019.md ├── update-05052020.md ├── update-05132019.md ├── update-05282019.md ├── update-06102019.md ├── update-07082019.md ├── update-10212019.md ├── update-11052019.md ├── update-11182019.md ├── update-11302018.md ├── update-12092019.md ├── update-12142018.md └── update-12232019.md /DataFederationFramework.md: -------------------------------------------------------------------------------- 1 | # The U.S. Data Federation Framework (alpha) 2 | 3 | ## Summary 4 | This project is a partnership between GSA's 18F and Office of Products and Platforms (OPP). We define a data federation as an effort where a certain type of specifiable data is collected across complex disparate organizational boundaries. This could be, for example, when the federal government collects data from states, or states from municipalities, or even when the Office of Management and Budget collects data systematically from many federal agencies.
This type of data sharing effort is common in our distributed style of government, but we believe it has not been given the systemic investigation and infrastructural support it deserves: currently each such effort is treated as an isolated instance, with little sharing of tools or lessons from one effort to the next. 5 | 6 | The goal of the U.S. Data Federation project is to fill that gap, providing a common language and framework for understanding these efforts, and (in the future) a toolkit for accelerating the implementation of these efforts. 7 | 8 | ## What We Did 9 | We interviewed twelve leaders of federated data projects, and seven additional experts from government, academia, and the private sector who have been influential in shaping or understanding these efforts. We developed a maturity model for these efforts along four axes: Impetus, Community, Specification, and Application. We found that the most successful efforts are those in which all four of these axes are developed iteratively and simultaneously. We also developed a playbook of common ways to ensure these efforts are successful. 10 | 11 | Projects We Interviewed: 12 | 13 | - [DATA Act](https://en.wikipedia.org/wiki/Digital_Accountability_and_Transparency_Act_of_2014), a 2014 law mandating expanded and standardized reporting of financial data for federal agencies. 14 | - [data.gov.ie](https://data.gov.ie/data), the open data portal for Ireland. 15 | - [data.gov](https://www.data.gov), the open data portal for the United States. 16 | - [code.gov](https://code.gov/), a searchable directory of open source code made public by the federal government. 17 | - [Voting Information Project](https://www.votinginfoproject.org/), which helps voters find information about their elections with collaborative, open-source tools.
18 | - [Open Civic Data](http://opencivicdata.readthedocs.io/en/latest/index.html), an effort to define common schemas and provide tools for gathering information on government organizations, people, legislation, and events. 19 | - [opendataphilly.org](https://www.opendataphilly.org/), the open data portal for Philadelphia. 20 | - [OpenReferral](https://openreferral.org/), dedicated to developing open standards and platforms for making it easy to share and find information related to community resources. 21 | - [Open311](http://www.open311.org/), a collaborative model and open standard for civic issue tracking. 22 | - [National Information Exchange Model (NIEM)](https://www.niem.gov/), a common vocabulary that enables efficient information exchange across diverse public and private organizations. NIEM can save time and money by providing consistent, reusable data terms and definitions, and repeatable processes. 23 | - [Federal Geographic Data Committee](https://www.fgdc.gov/), an organized structure of Federal geospatial professionals and constituents that provide executive, managerial, and advisory direction and oversight for geospatial decisions and initiatives across the Federal government. 24 | 25 | We also researched the General Transit Feed Specification. 26 | 27 | Other Experts We Spoke With 28 | 29 | - Rachel Bloom, who researched open data standard adoption for her thesis. Currently at [GeoThink](http://geothink.ca/). 30 | - [Andrew Nicklin](https://govex.jhu.edu/author/anicklin/), Director of Data Practices at Johns Hopkins Center for Government Excellence 31 | - [James McKinney](http://www.jamespetermckinney.com/), Senior Data Standard Specialist at Open Contracting Partnership 32 | - [Data Coalition](https://www.datacoalition.org/), which advocates on behalf of the private sector and the public interest for the publication of government information as standardized, machine-readable data.
33 | - [Open Data Institute](https://theodi.org/), which connects, equips and inspires people around the world to innovate with data. 34 | - [Mark Headd](https://www.linkedin.com/in/markheadd), current 18F acquisitions specialist and former Philadelphia CDO. 35 | - [Wo Chang](https://www.nist.gov/people/wo-l-chang), Co-chair of the NIST Big Data Working Group. 36 | - Tyler Kleykamp, Chief Data Officer of Connecticut, who runs the [Connecticut Open Data Portal](https://data.ct.gov/) and maintains this [list](https://github.com/OpenDataCT/state-federal-datasets) 37 | of datasets states must report to the Federal government. 38 | 39 | Interviews ranged from 30 to 45 minutes. We asked project leaders (time permitting): 40 | 41 | 1. What is this effort, in your own words? 42 | 2. What was the impetus or driving force behind this effort? 43 | 3. In building this effort, what were the biggest challenges, and what went smoothly? 44 | 4. What tools and technologies were used for this effort? 45 | 5. Why did you choose this architecture or process? 46 | 6. What were the political and organizational dynamics of collecting this data? 47 | 7. Who were the relevant stakeholders for this effort, and how were they convened? 48 | 8. Is there anyone else we should speak with to better understand this effort? 49 | 50 | When interviewing experts, the interviews took a looser structure but we generally tried to ask them: 51 | 52 | 1. What experience do you have that might be relevant to a federated data effort? 53 | 2. What efforts are you aware of that fit into this category? 54 | 3. Do you have any contacts in those efforts we could reach out to? 55 | 4. What do you think are the primary challenges of federated data efforts? 56 | 57 | We took notes directly in a public GitHub issue, and submitted it upon completion of the interview, sending it to the participants for review.
You can view those raw notes [here](https://github.com/18F/data-federation-report/issues?utf8=%E2%9C%93&q=is%3Aissue+interview). 58 | 59 | We have also compiled a [summary of all projects interviewed](summary.csv). 60 | 61 | Our analysis consisted of analyzing our interview notes and identifying common themes and patterns, presented in two forms: a [Maturity Model](#the-data-federation-maturity-model) and [Playbook](#the-data-federation-playbook). The maturity model identifies four principal dimensions with which to gauge a federated effort, and three phases of maturity for each dimension. The playbook consists of a set of actionable plays which have proven helpful to previous efforts. 62 | 63 | ## The Data Federation Maturity Model 64 | 65 | Successful execution of a federated data effort is largely a question of incentives and resources. Developing, and complying with, a new process or specification for data submission takes considerable time, effort, and expertise, and will only be possible to sustain with a large number of motivated individuals who have both the ability and the capacity to execute on a long term vision. However, it is difficult, and unwise, to immediately allocate vast resources to a new federated data effort — the effort may be easier than anticipated, or harder, or impossible, for a variety of reasons. We have observed that the most successful of these efforts simultaneously and iteratively develop the maturity of the effort along four axes: Impetus, Community, Specification, and Application. If done properly, these four dimensions work in concert to create a virtuous cycle of more participation, enthusiasm, and resources allocated to the project over time. 
66 | 67 | | Dimensions | Beginnings | Growth | Mature | 68 | |:----------------:|:--------------:|:----------------:|:-------------:| 69 | | **Impetus** | Grassroots | Policy | Law | 70 | | **Community** | First Adopters | Trending | Maintenance | 71 | | **Specification**| Dictionary | Machine Readable | Standardized | 72 | | **Application** | Alpha | Beta | Production | 73 | 74 | ### Impetus 75 | *The business or legal need that sets the project in motion* 76 | 77 | The first spark behind any federated data effort is some sort of impetus. This might be a shared understanding of a problem, where a grassroots community comes together and decides to act (for example, with the law enforcement community creating NIEM). Or it could be a policy, where a central authority (e.g., OMB, or a state government) decides to require compliance of a certain sort through an official policy (for example, the OMB M-13-13 policy with Data.gov). Or, it could be as formal as a law, such as in the case of the DATA Act. The more formal the mechanism, the greater force and permanence it has, and thus the harder it is to adapt during implementation. 78 | 79 | For both policies and laws, it is critical to leave all technical decision making to the implementation team. Never specify a technical standard in policy or law. For example, it can be very helpful to dictate that a standard must be machine readable, but specifying that the standard must be written in XML can have complex and unforeseen consequences. In extreme cases you can specify data elements to provide, but it is best to avoid doing so. In writing a policy or law, focus on processes and outcomes, not on implementation details. Never attempt to fully dictate a data standard in policy or law.
For example, you can say that the standard must be machine readable, extensible, and developed alongside a web application with user feedback, but it would not be helpful to say that the standard should be in XML, that states can modify as needed, and that it must be showcased in a user-friendly web application. It's also important to recognize that data standards need maintenance and adaptation over time: it's wise to specify an annual or biannual review of the standard itself to make sure the data is still providing value. 80 | 81 | ### Community 82 | *The people who provide and consume the data* 83 | 84 | For a federated data effort, you have two communities you need to keep happy: the data owners, or those who need to do the work to adopt the standard, and the community of users, or consumers of the aggregated data. It is the job of the project team (a small team dedicated to driving adoption) to keep both communities excited and encouraged. 85 | 86 | When executing on a federated data effort, it's important to not expect or plan for immediate compliance of the entire community of data owners. Instead, select 2-3 early adopters: good-faith partners who are excited about helping. One partner is not enough to generalize from, and more than 3 partners would begin to overburden the project team. Once those early adopters have been identified, help them implement a draft of the standard in a high touch fashion. This means working with them side by side to implement it, or even having the project team implement it themselves in order to demonstrate feasibility and be able to show them how it works. The project team's relationship with the early adopters is critical and bidirectional: the implementation details and the standard itself need to be adapted to be user friendly for data owners, and those owners must also learn about why the standard is helpful to a broader audience.
87 | 88 | Typically data that is being collected is already in use by the communities most relevant to the data owners, and it requires a shift in mindset to invest in making it more broadly available. For example, for the Voting Information Project, counties typically already publish their polling locations and ballot information on their website. Without a standard reporting format, however, it was nearly impossible for citizens to find. Thus, the project team was responsible for conveying to the owners that problem of scale, and helping them understand the standard. And for the DATA Act, for example, the project team quickly realized that agencies were most comfortable working with CSV (comma separated value) format. Rather than try to teach them all to adhere to the XML-style format the machine readable specification was written in, they developed a parallel standard for the CSV format. 89 | 90 | Once the early adopters have had success, it's time to roll out the larger standard. Since the first-mover risk has been absorbed, typically the effort will start trending in the larger community, who now has use cases to point to, people to talk with, and code to look at to help them comply. Once adoption is complete or plateaus, the standard enters the maintenance phase, where ownership problems begin to be salient. Often the trending phase is accompanied by influxes of excitement, talent, press, and funding. After a few years all of those may lessen, and it becomes important to establish long term mechanisms for maintaining the standard itself and the processes around compliance. If the standard has become a normal part of operations, the maintenance phase can be part of the normal budget. If it continues to be "tacked on" to normal operations, the effort is at risk of fizzling out. 91 | 92 | It is also important to be in touch with the community of data consumers from the very beginning. 
This could be journalists, scientists, citizens, decision makers in the organization, etc. The reason for this is two-fold: first, as the ultimate users of the data, it is important that their feedback be integrated early and often. Second, their positive feedback and involvement will help incentivize the data owners, who are embarking on a long and uncertain journey. 93 | 94 | ### Specification 95 | *The definition of what data needs to be provided and what format it should be in* 96 | 97 | The specification itself is responsible for communicating to data owners what data needs to be provided and how it should be formatted. At the lowest level of formality, project teams can start with a description of fields (a data dictionary). These should be detailed, should not use acronyms, should include exact field / column names, and include examples of compliant data. If you have a simple data standard, for example a single CSV or sets of CSVs, this level of detail may be sufficient. For example, the General Transit Feed Specification, which is among the most successful federated data efforts, does not publish a machine readable standard, but rather has thorough documentation detailing the requirements and fields in language the domain experts in transit agencies would understand. 98 | 99 | A more formal specification should be machine readable: in this case, not only is the data dictionary well documented as detailed above, but the specification itself can be used to automatically perform simple validations against the data. This level of specificity can help reduce compliance burden by increasing clarity. For example, a data dictionary might have a field called "start_date" and describe that it's the starting date of an election, but a machine readable version would be forced to clarify that the format of the date should look like 2018-01-15, which reduces potential for wasteful back-and-forth or downstream data integrity problems.
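
As a minimal sketch of what "machine readable" buys you, the Python below encodes the `start_date` example as a small field description and checks a record against it. The `FIELD_SPEC` layout and the `validate_record` helper are hypothetical and exist only for illustration; a real effort would more likely express the same rule in an established format such as JSON Schema or Table Schema.

```python
from datetime import datetime

# Hypothetical machine-readable description of a single field.
# The key names ("name", "format", "required") are illustrative assumptions,
# not taken from any published standard.
FIELD_SPEC = {
    "name": "start_date",
    "format": "%Y-%m-%d",  # dates must look like 2018-01-15
    "required": True,
}


def validate_record(record, spec=FIELD_SPEC):
    """Return a list of human-readable problems found in one record."""
    errors = []
    name = spec["name"]
    value = record.get(name)

    # A missing or empty value is only a problem for required fields.
    if value is None or value == "":
        if spec["required"]:
            errors.append(f"{name}: missing required value")
        return errors

    try:
        parsed = datetime.strptime(value, spec["format"])
    except (TypeError, ValueError):
        errors.append(f"{name}: {value!r} does not match {spec['format']}")
        return errors

    # strptime tolerates unpadded months and days (e.g. 2018-1-15),
    # so round-trip the value to enforce the exact format.
    if parsed.strftime(spec["format"]) != value:
        errors.append(f"{name}: {value!r} is not zero-padded as {spec['format']}")
    return errors
```

Calling `validate_record({"start_date": "01/15/2018"})` returns one error message, while a record with `"2018-01-15"` returns an empty list. The point is that the rule lives in data (`FIELD_SPEC`) rather than in prose, so data owners and the project team can run the same check.
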
For specifications that must map to many-to-one or many-to-many relationships, a machine readable format (e.g., JSON Schema or XML Schema) is strongly preferred, even as a first iteration. That machine readable schema can be versioned as well. 100 | 101 | Over many years of stability and adoption, it may make sense to promote the machine readable schema into a full standard, recognized by a formal standards body, for example. This level of maturity is rare, but can be helpful to promote stability. Working with standards bodies is generally a complex endeavor, and thus not recommended until the standard has a proven user base and community. 102 | 103 | ### Application 104 | *A software application optimized for demonstrating and providing the value of the data to end users* 105 | 106 | Only in very rare cases will raw data published online be compelling enough to drive adoption by the user community. Generally that community will have richer requirements beyond and supporting the data itself. For example, they may want to be able to search it or visualize it easily, or access it programmatically through an Application Programming Interface (API). Since that user community is ultimately driving value for the entire effort, it's important to be developing an application in concert with the standard to demonstrate the value of the data. For example, with code.gov, the project team developed an alpha version of code.gov in just a few months, showcasing the work of some of the early adopters. They quickly earned over 100,000 viewers, which helped ignite excitement for the effort across government, and provided countless valuable perspectives on how the data would be used and what, exactly, the value was that they should be targeting. 107 | 108 | Often for federated data efforts, the value proposition itself is not fully known, but rather hypothesized: developing a "killer app" that showcases that value directly is a critical part of the effort.
Similarly, the application helps inform the specification itself: perhaps some fields are found to be unnecessary, or incorrectly formatted, or missing. This application provides important fuel for incentivizing adoption. For example, for the General Transit Feed Specification, having local citizens be able to search through their city's public transit routes on Google Maps provided clear and tremendous value to potential adopters. As is industry best-practice, it's important to develop this application iteratively, starting with a small subset of features, releasing publicly to early adopters in an alpha phase, performing more extensive testing in a beta phase, and adding increased stability in the production phase. 109 | 110 | 111 | 112 | ## The Data Federation Playbook 113 | 114 | The following nine plays are drawn from our interviews with successful federated data efforts. They are intended to provide highlights of what project teams recall as being essential to the success of their project. It's unlikely that any effort will be able to execute on all plays, but if you are undertaking a new federated data effort, it will greatly improve your chance of success if you have a few of these plays in motion. 115 | 116 | **Policy Should Be Focused on Processes and Outcomes, Not Implementation**: The details of implementing an open data standard must be able to be adapted by project teams on the fly to meet the needs of the owner and user communities. Don't specify technical details in policy or law; instead, focus on goals and processes used to achieve those goals (e.g., developed iteratively with user feedback, or must develop a machine-readable data standard). 117 | 118 | **Identify Use Cases With Demonstrated Demand**: When prioritizing whether or not to invest in a federated data effort, seek out ways to prove demand. Are citizens or media frequently requesting a certain type of data?
Is there already a community built around fixing or cleaning up a certain type of data? Talk with your call centers and local communities. For example, with the Voting Information Project, Google reached out to the public sector because they had data indicating that people were searching for basic polling place and ballot data, but not finding it. 119 | 120 | **Develop a Killer App**: If your hypothesis is that your data is useful, do the work to build out the first use case and demonstrate that value. This will help win over the hearts and minds of participants. If you can't think of a single use case for your data, it might make sense to do more outreach with relevant communities to refine the value proposition before undertaking a larger effort. For example, the General Transit Feed Specification has become widely and enthusiastically adopted since it allows an agency's transit data to be displayed in Google Maps. 121 | 122 | **Allocate Proper Resources**: Throwing money at data owners without any structure will likely not get great results, but adding financial support for training, obtaining new software, and high-touch onboarding is very helpful. For example, data.gov.ie has an "open data unit" of 3-4 people whose sole purpose is to provide high-touch assistance to departments who want to provide data but need help navigating technical or procedural issues. And when the state of Connecticut wanted to normalize the financial data of its municipalities, it provided grants for obtaining new software and assisted with training, which built up goodwill and was of significant practical importance. 123 | 124 | **Implementation Should be Driven by a Single Empowered Team**: It's important to be able to deliver iterations on the application and specification quickly in order to build trust and momentum. Do not attempt to split responsibilities (e.g., the specification maintained by one department, and the application by another). 
125 | 126 | **When possible, deliver value directly to data owners**: The teams responsible for complying with the policy bear the vast majority of the burden, yet often receive little benefit. The long term success of these efforts is often contingent on delivering value directly to these data owners. For example, in the state of Connecticut, municipalities readily submit their data to the open data portal, since it is the easiest way to publish and share data from one department to another, which is an operational necessity. And for the Philadelphia Open Data Portal, newly published data gets publicity in the form of a blog post, and gets visualizations layered on top of it. Government employees are very motivated by public service, and will be delighted to comply if they see the public engaging fruitfully with the data. 127 | 128 | **Start With Simple Technologies**: Just because a project might have tremendous value, or be complex organizationally, doesn't mean it needs to be complex at the technical level. For the DATA Act, for example, the project team found CSVs were the easiest format for owners to comply with, and also very easy to ingest and validate. The system behind the DATA Act, which was built on time for 1/10th of the CBO cost estimate, is architecturally just a simple web application. 129 | 130 | **Nurture Early Adopters**: Identify 2-3 early adopters, and work side by side with them to comply with a draft of the specification. Their success is critical to adoption: they will become much more convincing evangelists of your effort than you can be. They're also critical from a practical standpoint to allow for proper feedback and generalization, provide a forum to discuss challenges, and demonstrate technical feasibility. 131 | 132 | **Support Compliance Tooling**: In order to lessen the burden on data owners, it's important to do everything you can to help them compile and validate their data.
For example, data.gov provides inventory.data.gov, a metadata inventory tool for agencies that easily exports the metadata to the required format. It also provides an online tool for validating that a data.json file adheres to the specified format. Publicly accessible, human readable documentation is also a critical part of the success of these efforts. 133 | 134 | 135 | ## Next Steps 136 | Due to its decentralized nature, the U.S. Government often undertakes complex federated data efforts. It's time to collaborate more on what works, what doesn't, and ways to lower capital investment for these efforts in the future by sharing tooling and other resources. We hope to shortly begin a new phase of this project focused on identifying and building common tooling and answering critical questions around the long term ownership and maintenance of federated data efforts. 137 | 138 | ## Helpful Resources 139 | - [JSON Schema](http://json-schema.org/) 140 | - [Understanding JSON Schema (ebook)](https://spacetelescope.github.io/understanding-json-schema/) 141 | - [XML data types](https://www.w3.org/TR/xmlschema-2/#built-in-primitive-datatypes) 142 | - [Table Schema](https://frictionlessdata.io/specs/table-schema/) 143 | - [CKAN](https://ckan.org/) 144 | - [datastandards.directory](https://datastandards.directory/) 145 | - [State-Federal Datasets](https://github.com/OpenDataCT/state-federal-datasets) 146 | -------------------------------------------------------------------------------- /ISSUE_TEMPLATE/ISSUE_TEMPLATE.md: -------------------------------------------------------------------------------- 1 | ## User story 2 | As a \, I want \ so that \.
3 | 4 | ## Acceptance criteria 5 | - [ ] 6 | - [ ] 7 | - [ ] 8 | -------------------------------------------------------------------------------- /ISSUE_TEMPLATE/ISSUE_TEMPLATE_CASE_STUDY.md: -------------------------------------------------------------------------------- 1 | ## Case study 2 | 3 | * Agency / office: 4 | * Contact: 5 | * Type of data: 6 | 7 | ### Overview of the opportunity 8 | 9 | -------------------------------------------------------------------------------- /PreliminaryFindings.md: -------------------------------------------------------------------------------- 1 | # Data Federation - Preliminary Findings 2 | 3 | ## Overview 4 | Federated data efforts (those in which a certain type of data is aggregated from many entities across organizational boundaries) have a long history and an increasingly important role in critical missions across government. Due to government's decentralized nature, government leaders seeking to collect data frequently do so from organizations they have limited authority over: for example, OMB collecting data from agencies, federal government from states, states from municipalities, etc. Special problems arise in these situations, since resources, capabilities, and incentives may vary wildly, yet there has been very little systemic support for these efforts. After speaking with leaders of nine such projects, and three additional experts, it is clear that such support is needed and would be welcomed by all levels of government. 5 | 6 | We recommend continuing on to a discovery phase, focusing on making capital investments in reusable tooling and processes that will benefit future federated data efforts. We also seek to clarify the federal government's role in these efforts moving forward. 7 | 8 | ## Why this matters 9 | Federated data efforts play an increasingly important role in laws, policies, and operations across many levels of government.
When lawmakers on both sides of the aisle passed the DATA Act in 2014, it wasn't just innovative in its mission (to provide unprecedented transparency in government spending), but also in its methods, which required the creation of a machine-readable data standard and buy-in from every federal agency. The effort required coordination across federal agencies which varied wildly in technical capabilities, and was completed on time and well under budget in 2017. The federal open data policy, along with data.gov, provides a mechanism to streamline the cataloging and discovery of both open and closed data sets across the government. And Open311, a standard API for local government, has enabled an ecosystem of apps to develop around reporting problems such as potholes, streamlining the experience for citizens and cities alike. 10 | 11 | Federated data efforts are increasingly seen as an engine for transparency, economic growth, and accountability, yet collecting that data remains a challenge. Despite the fact that efforts of this sort are increasing in frequency, each new effort is still improvising solutions in terms of processes, tooling, and compliance infrastructure. It's time to take this problem seriously and invest in reusable tools and approaches that will streamline federated data efforts in the years to come. 12 | 13 | ## How we arrived at this conclusion 14 | We have performed 12 interviews so far with experts in these types of efforts, and will likely do a few more in the next couple of weeks.
We've spoken with: 15 | 16 | ### Projects 17 | - **Data.gov**: A catalog of metadata for most Federal datasets 18 | - **Code.gov**: A catalog of metadata for Federal open source code 19 | - **Open Referral**: An effort to provide data exchange formats for social services 20 | - **Open Data Institute**: A non-profit promoting open data that just launched a data standardization research project 21 | - **National Information Exchange Model (NIEM)**: A repository of schemas and governing structure to support cross-domain data sharing. 22 | - **Open311**: An API standard for cities to support common citizen interactions such as pothole reporting. 23 | - **DATA Act**: A statute ordering the creation of a data standard and new publishing process to provide unprecedented auditability of Federal spending data. 24 | - **Data Coalition**: An organization which advocates for open data standards such as the DATA Act. 25 | - **Federal Geographic Data Committee (FGDC)**: A committee which maintains a standard for geographic data. 26 | 27 | ### Experts 28 | - **Mark Headd**: Current 18F Acquisitions Specialist, former CDO of Philadelphia 29 | - **Tyler Kleykamp**: CDO of Connecticut 30 | - **Andrew Nicklin**: Maintains datastandards.directory, involved with open311, openReferral, many city-level efforts. 31 | 32 | For each project interview, we asked: 33 | 1. What is ____, in your own words? 34 | 2. What was the impetus or driving force for this effort: policy, user needs, etc.? 35 | 3. In building ____, what were the biggest challenges, and what went smoothly? 36 | 4. What tools and technologies do you use for this effort? 37 | 5. Why did you choose this architecture or process? Were others tried? 38 | 6. What are the political and organizational dynamics of getting ____ set up? 39 | 7. Who were the relevant stakeholders for this project, and how were they identified and convened? 40 | 8. Is there anyone else I should speak with to better understand ____?
41 | 42 | Expert interviews varied slightly by person, but we generally discussed what experience they have with similar efforts, what in their view went well, and what didn't go well. 43 | 44 | ### What potential users are saying 45 | 46 | In speaking with these individuals and reviewing open source repositories, a few things became clear. Although there is a lot of variety in these efforts, there is a lot of commonality as well, and everyone we spoke with was very supportive of trying to get more support for efforts like these moving forward. For example, every effort needs to develop some sort of schema for the data, provide a mechanism to validate against that schema, have it publicly documented, and provide a mechanism for data ingest. The more that can be done to support the infrastructure of these projects, at both the technical and procedural levels, the more it will enable federated data efforts in the future. Essentially every effort so far is starting from scratch in terms of doing user research, developing a schema, validating data, aggregating the data, getting approvals, building UIs, and developing processes to support the aggregation. The domain is ripe for investment into reusable tools and frameworks of thinking. 47 | 48 | One thing we learned is that several groups are undertaking research in this domain: namely, Andrew Nicklin is compiling data standards in datastandards.directory, and the Open Data Institute is conducting a multi-year research effort into building and maintaining open data standards. Although there is a difference between data standards and federated data efforts (since efforts may be federated without adopting an open standard), it's encouraging that the domain is getting more attention. 49 | 50 | Another common theme is the confusion about ownership of these efforts.
Several leaders interviewed see this work as public infrastructure for the 21st century, but it's not always clear who should own and maintain these efforts. Although specific policies and laws can make roles clear in specific instances, overall the policies around data standards are confusing and ambiguous, leaving gaps that need to be filled in an ad-hoc manner. 51 | 52 | ## What we found so far 53 | The impetus for these types of efforts ranges from top-down policies (e.g., data.gov, code.gov, DATA Act) to 54 | private sector incentives (e.g., Google Transit Feed Specification) to passionate non-profits (e.g., OpenReferral, Open311). Regardless of the initial reasons for starting an initiative, it's important to realize that adoption is ultimately driven by cost / benefit analysis at the organizational level. In the case of mandates, legal or policy requirements are easily ignored or watered down if the cost is too high or the benefit too low. It's also important to note that even if an organization stands to benefit significantly through compliance or cooperation, it may not have the capital (in time, money or expertise) to do so. This is a more salient problem when the organization is smaller and unlikely to already maintain technical staff, for example in local government or social services. 55 | 56 | In terms of incentives, it's critical that data aggregation efforts either (1) enable a new "killer app" that supports organizations in their mission, or (2) fit the organization's existing workflow. In many interviews, the Google Transit Feed Specification, which allows governments to publish public transit data so that it can be consumed by Google Maps among others, was held up as the perfect example of the "killer app." Publishing in the spec seamlessly integrates the data into the apps that citizens are already using.
Municipalities in Connecticut also embrace open data standards, not necessarily for their value to the public, but because by publishing to a data portal, their own data can be more easily consumed within their own working processes. 57 | 58 | Once incentives are aligned, it's critical to get a few early adopters and work with them to get something working end-to-end. The benefits of shipping early and iterating were salient in several projects, particularly code.gov, data.gov, and the DATA Act. One thing that became clear (by both doing it right and doing it wrong) is that the distinction between policy and technology is extremely difficult to draw. In situations where the technology and policy were owned by different organizations, for example with the DATA Act, tensions arose and there was high communication overhead. In situations where the policies and technologies coevolved, such as with data.gov and code.gov, having both the technology and the policy able to nimbly react to user needs and implementation constraints proved extremely valuable. 59 | 60 | After early adopters prove success with the effort, there is generally a gold rush as organizations clamor to adopt and comply. In this stage, it's generally been fairly easy to keep momentum, interest, and funding going. 61 | 62 | Then after a few years, the honeymoon ends, and organizations stare at years and years of effort to maintain compliance and evolve the schema in order to keep it relevant in a changing world. Organizations really struggle with maintaining standards, schemas, and tooling over the long term. For many of these efforts, it is still too soon to tell how the maintenance burden will be handled, but in other cases the loss of a single passionate person or third party could spell doom for the effort.
63 | 64 | 65 | ## Proposal for Next Steps 66 | 67 | We believe the discovery work done so far, which has revealed a long history of, and growing demand for, federated data efforts, is sufficient to establish a clear appetite for improved tooling for these efforts across many levels of government. There is also appetite for further research and exploration; however, that important work is being actively pursued by non-profits and academia. TTS is in a unique position to invest in the technical infrastructure behind these efforts, and we believe that should be the focus of the next phase of feasibility testing. Work TTS has done so far, in particular the DATA Act work on data validation and ingest and the data.gov work on JSON validation and harvesting, could form the basis of an ecosystem of tooling that could greatly accelerate future federated data efforts. Although previous work is open source, it is not engineered for public reuse. 68 | 69 | Building reusable tools is difficult because of a catch-22: it is difficult to build reusable tools without a forgiving first adopter, but in any new effort it will rarely be in budget or scope to extract reusable elements from existing solutions. We believe investment from 10x could help break that cycle: the next step should be to identify a new, upcoming federated data effort and use it as a test case to generalize the work done by previous efforts. The work done in the past needs to be broken into smaller modules, published to package managers, and documented. It will likely also need to be supplemented with new tools or wrappers to provide a compelling product to potential adopters. It should also be noted that a previous 10x Phase 1 project, the Generic Data Validation Platform, supported these same conclusions and could be subsumed by this effort. 70 | 71 | One such upcoming federated data effort is the data standards policy itself.
In a blissfully ironic twist, the federal standards policy (A-119) specifies that agencies should provide an inventory of the standards they use and justifications for government-unique standards, but it does not define a data standard or method of ingest for that inventory. We should work with OMB to define a standard, and work with the relevant stakeholders within GSA to be the first adopters of that standard, while building a set of tools to accept, document, and validate the new technology. If there are problems in getting buy-in from OMB on this particular issue, we can work with them to identify another one or, in the worst case scenario, proceed independently to refactor previous tooling based around scenarios from our case studies. 72 | 73 | Another goal of the next phase would be to help clarify roles, responsibilities, and authorities for data standards efforts at the federal level. 74 | 75 | ## Questions to be answered in the next phase 76 | 77 | There are a few key questions the next phase will seek to answer: 78 | 79 | 1. Can the work done to date in this area be made reusable (at the technical level)? 80 | 81 | Oftentimes, things may seem very similar at the procedural or process level, but have subtle differences which make it difficult to share actual code. It is still an open question whether it is possible to make code reusable for these efforts moving forward. 82 | 83 | 2. Can a new process / tool built with reusable components meet demands for usability and compliance? 84 | 85 | It's one thing to force a new domain to use a certain set of tools; it's another to use those tools and still be able to meet high standards for usability and compliance. 86 | 87 | 3. What role does the Federal government play in these efforts?
88 | 89 | Although specific laws and policies can make the roles clear in specific cases, does a policy exist that delegates authority to any agency to pioneer or support these efforts in a more general way? 90 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Data Federation Project 2 | 3 | The U.S. Data Federation project promotes government-wide capacity-building to support distributed data management challenges, data interoperability, and broader data standards activities. The project is an initiative of the GSA Technology Transformation Services (TTS) [10x](https://10x.gsa.gov) program, which funds technology-focused ideas from federal employees with an aim to improve the experience all people have with our government. 4 | 5 | ## Overview 6 | 7 | U.S. government policies, initiatives, and public-facing products and services depend on aggregating and harmonizing data from disparate government sources. **The goal of the U.S. Data Federation project is to document repeatable processes, develop reusable tooling, and curate resources to support federated data projects.** 8 | 9 | We define a federated data project as an effort in which a common type of data is collected or exchanged across complex, disparate organizational boundaries. For example, federal agencies often need to collect data from state and local governments, other federal agencies, and other data providers. These federated data may be used to support policy or budget decisions and operational efficiencies, or may be published in aggregate form for other data users. 10 | 11 | Federated data efforts are increasingly seen as an engine for transparency, economic growth, and accountability, yet collecting this kind of data remains a challenge.
While this type of data management effort is increasingly common in our distributed style of government, each new effort is still improvising solutions in terms of processes, tooling, and compliance infrastructure. **Many of these federated data efforts face common requirements and common challenges, but lack common resources.** 12 | 13 | The U.S. Data Federation project was conceived in 2016 to address this gap. The project set out to identify common challenges and pain points in federated data efforts and address these needs by curating best practices and resources and developing reusable tooling. The best practices and resources were intended to include guides and repeatable processes around data governance, organizational coordination, and standards development in federated environments. The reusable tools were intended to include capabilities around data validation, automated aggregation, and the development and documentation of data specifications. 14 | 15 | Over the course of its first three phases of 10x funding, the project began to deliver on this ambition by building and launching ReVal, a Reusable Validation Library, which has been used by the USDA Food & Nutrition Service and other agencies to streamline data collection and validation processes. 16 | 17 | During Phase 4, the team took advantage of a unique opportunity to unite government-wide work in support of open data and federated data efforts. The team has supported Data.gov, OMB, and OGIS stakeholders in developing a vision and delivering increased functionality for resources.data.gov, a legislatively mandated online repository of policies, tools, case studies and other resources to support data governance, management, and use throughout the federal government.
18 | 19 | After conducting research with the stakeholders and audience for resources.data.gov, the team saw an opportunity for a long-term practical manifestation of the Data Federation as the content strategy team underpinning resources.data.gov. The future funding and organization of this work is currently under negotiation. 20 | 21 | ## Project milestones 22 | 23 | **Phase 1** (Fall 2017) 24 | 25 | *Team: Phil Ashlock, Anthony Garvan* 26 | - Interviewed a variety of distributed data management projects and synthesized findings in a [Data Federation Framework]().  27 | - Created a placeholder for future web content at federation.data.gov 28 | - Pitched for Phase 2 funding based on finding that reusable tooling and processes would benefit future federated data efforts 29 | 30 | **Phase 2** (Spring 2018) 31 | 32 | *Team: Phil Ashlock, Catherine Devlin, Anthony Garvan, Chris Goranson, Joe Krzystan* 33 | - Prototyped a reusable data validation tool that allows users to submit data via a web interface or API to be validated against a set of customizable rules in real time 34 | - Partnered with the USDA Food & Nutrition Service (FNS) to adapt this tool for the FNS-742, a form that collects verification data for the National School Lunch Program  35 | - Pitched for Phase 3 funding to further develop the tool, implement it with FNS, and conduct outreach to identify other partners and other opportunities for reusable tools  (Phase 2 Final Presentation) 36 | 37 | **Phase 3** (December 2018-June 2019) 38 | 39 | *Team: Phil Ashlock, Mike Gintz, Mark Headd, Ethan Heppner, Julia Lindpaintner, Amy Mok* 40 | - Developed Phase 2 prototype into Reusable Validation Library ([ReVal](https://github.com/18F/ReVAL)) with a focus on API-based usage 41 | - Worked with FNS to develop ReVal's first custom manifestation for FNS-742 as the [FNS Data Validation Service](https://github.com/18F/usda-fns-ingest) 42 | - Validated demand for ReVal and identified future partners  43 | - Continued 
to identify common needs and useful reusable resources for data efforts through outreach and presentations to the Data Exchange Community of Practice, Interagency Working Group on Open Data, VA Open Data Working Group, and others 44 | - Began building a community around a shared need for knowledge-sharing across data efforts in government 45 | - Protected against redundancy by aligning the efforts of the U.S. Data Federation with other efforts across government, such as the work of the Federal Data Strategy and the mandates of the Evidence Act and Open Government Data Act 46 | - Pitched for Phase 4 funding to leverage the completion of ReVal and the momentum of the U.S. Data Federation work to support a long-term vision and strategic plan for a user-centered, maximally effective resources.data.gov 47 | 48 | **Phase 4** (October 2019-April 2020) 49 | 50 | *Team: Phil Ashlock, Mike Gintz, Julia Lindpaintner, Amy Mok, Princess Ojiaku, James Tranovich* 51 | - Collaborated with resources.data.gov stakeholders (GSA, OMB, OGIS) to identify the likely audience and begin to define a long-term vision of success for the resource repository 52 | - Conducted interviews with over 30 people across 14 agencies, including 5 Chief Data Officers, data scientists, organizers of internal open data working groups, Federal Data Strategy detailees, and many others involved in their agency’s data governance or data management efforts 53 | - Outlined a long-term vision for the U.S.
Data Federation as the content strategy team underpinning resources.data.gov and plans to prototype this approach during Phase 4 54 | - Reviewed all content and implemented new information architecture, navigation, and functionality in response to user research in order to make resources in the repository maximally discoverable 55 | - Prototyped the process of abstracting agency-specific resources to make them more broadly useful to other agencies 56 | 57 | ## References and deliverables 58 | 59 | **Phase 1** 60 | 61 | - [Interview notes](https://github.com/18F/data-federation-report/issues?utf8=%E2%9C%93&q=is%3Aissue+interview) 62 | - [Preliminary Findings](https://github.com/18F/data-federation-report/blob/master/PreliminaryFindings.md) 63 | - [US Data Federation Framework](https://github.com/18F/data-federation-report/blob/master/DataFederationFramework.md) (includes [Data Federation Maturity Model](https://github.com/18F/data-federation-report/blob/master/DataFederationFramework.md#the-data-federation-maturity-model) and [Data Federation Playbook](https://github.com/18F/data-federation-report/blob/master/DataFederationFramework.md#the-data-federation-playbook)) 64 | 65 | **Phase 2** 66 | 67 | - [Final presentation (PDF)](assets/US-Data-Federation-Phase-II-Final.pdf) 68 | - Prototype [Django Data Ingest Tool](https://github.com/18F/ReVAL)  69 | 70 | **Phase 3** 71 | 72 | - [Project overview (PDF)](assets/Project-Overview-for-Partners-Stakeholders.pdf) 73 | - [Project overview with FNS case study (PDF)](assets/Project-Overview-with-FNS-Case-Study.pdf) 74 | - [Project overview presentation (PDF)](assets/US-Data-Federation-Project-Intro.pdf) 75 | - [Webinar via Digital.gov on April 17, 2019](https://youtu.be/r4XUu2MLrDo) // [Slides (PDF)](assets/Digital.gov%20Presentation%20%E2%80%94%20US%20Data%20Federation.pdf) 76 | - [18F blog post](https://18f.gsa.gov/2019/03/05/the-us-data-federation/) on U.S. 
Data Federation project 77 | - [Federal Data Strategy Proof Point](https://strategy.data.gov/proof-points/2019/05/17/supercharging-data-through-validation-as-a-service/) on the USDA FNS Data Validation Service 78 | - [Phase 4 pitch presentation (PDF)](https://github.com/18F/data-federation-project/blob/master/assets/10x%20Data%20Federation%20Phase%204%20pitch.pdf) 79 | 80 | **Phase 4** 81 | *Forthcoming* 82 | 83 | **Biweekly updates** 84 | Starting in Phase 3, the team began publishing updates on its activities and progress roughly bi-weekly. All past updates can be found [here](https://github.com/18F/data-federation-project/tree/master/updates). 85 | 86 | ### Related repositories 87 | 88 | There are several repositories that contain code that is a part of this project. 89 | 90 | * Github repo for the US Data Federation website: https://github.com/GSA/us-data-federation 91 | * Github repo for ReVal, the US Data Federation's first reusable tool for data validation and aggregation: https://github.com/18F/ReVAL 92 | * Github repo for the FNS Data Validation Service, which uses ReVal to check submitted FNS-742 data against a set of centralized validation rules via API: https://github.com/18F/usda-fns-ingest 93 | * Github repo for the Workzone Data Exchange (WZDx) Validator, which uses ReVal to perform JSON Schema validation against the WZDx v1.1 specification: https://github.com/18F/usdot-jpo-ode-workzone-data-exchange 94 | 95 | *Other repos referenced:* 96 | 97 | * https://github.com/18F/couch-rules-engine 98 | * https://github.com/18F/goodtables-gov 99 | * https://github.com/18F/cx-cap-goal 100 | -------------------------------------------------------------------------------- /assets/10x Data Federation Phase 4 pitch.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/10x Data Federation Phase 4 pitch.pdf 
-------------------------------------------------------------------------------- /assets/Digital.gov Presentation — US Data Federation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/Digital.gov Presentation — US Data Federation.pdf -------------------------------------------------------------------------------- /assets/Project-Overview-for-Partners-Stakeholders.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/Project-Overview-for-Partners-Stakeholders.pdf -------------------------------------------------------------------------------- /assets/Project-Overview-with-FNS-Case-Study.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/Project-Overview-with-FNS-Case-Study.pdf -------------------------------------------------------------------------------- /assets/US-Data-Federation-Phase-II-Final.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/US-Data-Federation-Phase-II-Final.pdf -------------------------------------------------------------------------------- /assets/US-Data-Federation-Project-Intro.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/18F/data-federation-project/fe555e47c872dcf5adba9324b3e2eba37b4aed75/assets/US-Data-Federation-Project-Intro.pdf -------------------------------------------------------------------------------- /summary.csv: 
-------------------------------------------------------------------------------- 1 | project_name, impetus, schema_tech, documentation_tech, aggregation_model, data_openness, ownership 2 | data.gov, federal policy, JSON Schema, static, federated, open, federal 3 | code.gov, federal policy, JSON Schema, static, federated, open, federal 4 | DATA Act, federal law, XBRL, static, website submission, open after processing, federal 5 | NIEM, federal, XML Schema, searchable, N/A, N/A, federal / unfunded 6 | CT State / Municipality, policy/law, policy/law, none?, email, some open, state 7 | Open311, non-profit + cities, Open API, static wiki, api, closed, non-profit 8 | OpenReferral, non-profit + social services, data package, readthedocs, peer-to-peer, closed, non-profit 9 | Google Transit Feed Spec, city + for-profit, data dictionary, static page, federated, open prior to aggregation, for-profit 10 | Geospatial, policy, "ISO 19115 (XML), JSON for data.gov", static page on ISO, federated, open, federal + standards body 11 | data.gov.ie, gov policy / law, dcat, static site and team, web / api upload / federated, open, gov 12 | Voting Information Project, user needs, CSVs / XML, readthedocs, web submission, open, Non-profit 13 | Open Civic Identifiers, analyst needs, CSVs, readthedocs, pull requests, open, contributors 14 | -------------------------------------------------------------------------------- /updates/README.md: -------------------------------------------------------------------------------- 1 | ## Phase 3 Project Updates 2 | 3 | Bi-weekly updates for work during Phase 3 of the data federation project. 4 | -------------------------------------------------------------------------------- /updates/update-01132020.md: -------------------------------------------------------------------------------- 1 | # January 13, 2020 2 | 3 | The 10x Data Federation team is getting ready to grow! 
In our first week back from the holidays, we prepared to welcome and onboard new team members starting next week. 4 | 5 | ## Completed 6 | 7 | * Prepared to onboard new team members: 8 | * Created a Project README 9 | * Wrote personal READMEs for current team members 10 | * Workstream 1: Resources.data.gov repository 11 | * Scheduled follow-up conversation with OMB stakeholders to address opportunities to collaborate with CDO Council and Federal Data Strategy 12 | * Connected with GSA’s Office of Shared Solutions and Performance Improvement to learn about GSA’s role in organizing the CDO Council 13 | * Workstream 2: ReVAL 14 | * Had intro conversation with contact at HUD to evaluate an additional possible use case for ReVAL 15 | * Reached out to FNS to tell us when to start scheduling calls with states 16 | 17 | ## Up next 18 | 19 | * Meet with OMB and other stakeholders to get feedback and align on next steps 20 | * Interview SSA CDO and former Data Exchange COP organizer Laura Train 21 | * Onboard Princess Ojiaku & (hopefully) new engineer 22 | * Define project tasks for remainder of Phase 4 and establish new team practices 23 | * Follow up with contacts at other agencies who mentioned possible resources for the repository to begin evaluating those 24 | 25 | ## Challenges / Blockers  26 | 27 | * Julia was in training 12h last week; Mike was attending additional strategy meetings 28 | * We are still waiting for an engineer to be staffed; ideally, they would start 1/21 29 | -------------------------------------------------------------------------------- /updates/update-01282020.md: -------------------------------------------------------------------------------- 1 | # January 28, 2020 2 | 3 | Having expanded our team, we are launching into our first sprint. We have defined four work streams and come up with overarching goals in each. 
We look forward to initiating work on a content audit of resources.data.gov, reinitiating work on ReVAL, and continuing conversations with potential partners on content for resources.data.gov. 4 | 5 | *Note: Now that we’re working in more defined sprints, we plan to publish shipping news after each sprint planning session. Since our sprints are running Wed-Tues, updates will likely come on Wed or Thurs in the future.* 6 | 7 | 8 | ## Completed 9 | 10 | * Onboarded new team members Princess Ojiaku (content strategy) and James Tranovich (engineering) and welcomed back Amy Mok and established an initial Team Charter 11 | * Got green light from all stakeholders to move forward with prototyping our proposed vision for the future of the U.S. Data Federation as the content strategy team underpinning resources.data.gov 12 | * Outreach and research 13 | * Interviewed SSA CDO and former Data Exchange COP organizer Laura Train 14 | * Participated in Data Exchange Community of Practice call 15 | * Followed up with potential content partners at FEMA and NARA Chief Records Office 16 | * Presented U.S. Data Federation in 10x Project Lightning Talks 17 | * Sprint Planning for Sprint 1! 18 | 19 | 20 | ## Up next 21 | 22 | * Resources.data.gov 23 | * Complete content audit for r.d.g (what's missing, what needs to be added) 24 | * Reduce scope of content sources to pursue during Phase 4 to 3-5 25 | * Introduce James to r.d.g backend / production processes 26 | * ReVAL 27 | * Push bug fixes to ReVAL and integrate fixes with FNS system 28 | * Future of U.S. Data Federation 29 | * Prepare a draft vision statement to share with 10x and data.gov 30 | * Project management 31 | * Draft milestones for remainder of Phase 4 to share with 10x and Phil Ashlock 32 | 33 | 34 | ## Challenges / Blockers  35 | 36 | * None at present! 
37 | -------------------------------------------------------------------------------- /updates/update-02012019.md: -------------------------------------------------------------------------------- 1 | # February 1, 2019 2 | 3 | We're glad to be back! Reunited and it feels so good... 4 | 5 | ## Completed 6 | 7 | * Team re-group to review state of the project & consider next steps 8 | * Reconnected with partners at USDA FNS, in Montana, and in Kansas 9 | * Resumed outreach for additional use cases 10 | * Started a prototype to explore a new validation approach for data collection 11 | * Resumed edit checks testing 12 | 13 | ## Up Next 14 | 15 | * Sprint planning session for 2/4/19 16 | * Publish blog post on the 18F blog to promote the U.S. Data Federation and solicit additional use cases 17 | * Calls with 18 | * Colyar Technology Systems, Montana partners, and FNS (re: Montana integration of edit-checks) on 2/7/19 19 | * CMS on 2/27/19 to discuss possible use case 20 | * Gil Alterovitz on 2/6/19 to discuss possible VA use case 21 | * Brian Lee at CDC 22 | 23 | ## Questions / Blockers 24 | 25 | * Need approval from 10x team to publish blog post (target: early next week) 26 | -------------------------------------------------------------------------------- /updates/update-02122020.md: -------------------------------------------------------------------------------- 1 | # February 12, 2020 2 | 3 | Having expanded our team, we are launching into our first sprint. We have defined four work streams and come up with overarching goals in each. We look forward to initiating work on a content audit of resources.data.gov, reinitiating work on ReVAL, and continuing conversations with potential partners on content for resources.data.gov. 4 | 5 | *Note: Now that we’re working in more defined sprints, we plan to publish shipping news after each sprint planning session.
Since our sprints are running Wed-Tues, updates will likely come on Wed or Thurs in the future.* 6 | 7 | 8 | ## Completed 9 | * **Resources.data.gov** 10 | * Completed content audit for r.d.g 11 | * Spoke with potential content sources and started to create criteria to reduce the number of partners we engage with 12 | * Assessed the EPA Metadata Editor tool and concluded that the Phase 4 team should not invest in abstracting it, but might be able to showcase it another way on r.d.g 13 | * Aaron Borden introduced team to r.d.g repository, structure, and processes 14 | * Conducted an initial survey of analogous websites to identify design patterns 15 | * **ReVAL** 16 | * Pushed bug fixes to ReVAL and pruned issue log 17 | * Released ReVAL v0.6.2 18 | * Prioritized remaining tasks for ReVAL to focus on clean up and rearchitecting, not developing the UI or other new features 19 | * **Future of U.S. Data Federation** 20 | * Prepared a draft vision statement and shared with 10x and Phil Ashlock for feedback 21 | 22 | 23 | 24 | 25 | ## Up next 26 | 27 | * **Resources.data.gov** 28 | * Define our (MVP) audience 29 | * Draft site map & content model 30 | * Start to develop content model for open source tools 31 | * **ReVAL** 32 | * Begin the process of refactoring ReVAL to pull the functionality from the UI into the API to establish ReVAL's "core" 33 | * Interview states who have implemented the FNS DVS to learn about challenges and impact of the service 34 | * **Future of U.S. 
Data Federation** 35 | * Identify key decision-makers for continued funding and build out proposal section of the vision statement with this audience in mind 36 | * Further articulate risk mitigation strategies in our risk tracker 37 | 38 | 39 | ## Challenges / Blockers  40 | 41 | * We have a lot of leads for folks to partner with on developing content, but we need to make sure that we are not spread too thin and can meaningfully test our hypothesis for a future content creation strategy 42 | -------------------------------------------------------------------------------- /updates/update-02152019.md: -------------------------------------------------------------------------------- 1 | # February 15, 2019 2 | 3 | We’ve completed our first sprint since getting back from the shutdown and have been busy pushing forward with our work with our partners at FNS and state agencies as well as working towards alignment on an overarching vision of success. 4 | 5 | ## Completed 6 | 7 | * Began collaborating with [Colyar](http://colyar.com) (one of the big vendors that states work with) 8 | * Conducted four [interviews with state departments](https://docs.google.com/spreadsheets/d/1pggM2KMBDyN-p_-fN9tzgsvHcA7UdqgSqlmnJADP6mY/edit#gid=0) of education to understand their FNS-742 process in order to further inform the development of our tool 9 | * Conducted goal-setting session facilitated by Elizabeth Ayer (Mural; summary) 10 | * Shared [learnings from user interviews](https://docs.google.com/document/d/1DLf52Fmg_JDUmQso7bKNY_W-RK9i81MbDSgJqy9oOu0/edit) in Kansas 11 | * Calls with potential use case contacts at VA, CMS, performance.gov 12 | * Worked with FNS to solve mismatches between their edit checks and our tool 13 | * Performed tests using data provided by FNS (from school year 2013 to 2017) to ensure tool’s results are correct 14 | * Planned [next sprint](https://github.com/18F/data-federation-project/projects/3) 15 | 16 | ## Up Next 17 | 18 | * Go through existing leads on 
use cases and identify a second use case for our Phase 3 work 19 | * Follow up with FNS, CDC, DOT, DOJ, Ed 20 | * Based on resources left in Phase 3 and on the previous goal-setting session, identify specific areas of delivery for Phase 3 21 | * Identify challenges / requirements for long-term support & maintenance of data validation service for FNS 22 | * Continue to improve the tool to meet the needs of the FNS use case 23 | 24 | ## Questions / Blockers 25 | 26 | * Waiting on publication of [blog post](https://docs.google.com/document/d/10V9RV8QxjfjGcropwY_39JNUV-vV6MXqccGYSsgeOn0/edit#heading=h.ygglxjnto8hk) on the 18F blog to promote the U.S. Data Federation and solicit additional use cases 27 | * Need a better understanding of what Phase 4 of 10x looks like & the process to get there. 28 | -------------------------------------------------------------------------------- /updates/update-02262020.md: -------------------------------------------------------------------------------- 1 | # February 26, 2020 2 | 3 | Sprint 2 was short but sweet. We made meaningful progress defining an audience to target for our MVP, developing a potential taxonomy and site maps, identifying a content structure for open source software, refactoring ReVAL, and sketching out content development approaches. The highlight of the sprint was hearing from teams in Kansas, Arizona, and Louisiana that had implemented the FNS Data Validation Service.
4 | 5 | *“I just wanna pass along some kudos: as far as I’m concerned, when the federal govt passes regulations, this should be the model… I just wanna say, hats off—this is the model, this is how it should be.” — State partner* 6 | 7 | 8 | ## Completed 9 | * **Resources.data.gov** 10 | * Defined our (MVP) audience as Chief Data Officers and their support staff 11 | * Consolidated previous “Jobs To Be Done” and potential content lists into the overarching resources.data.gov content inventory 12 | * Developed draft site maps and potential resource groupings 13 | * Identified a potential content structure for tools through an audit of open source project sites and tested it with ReVAL 14 | * Drafted reusability checklist for open source tools 15 | * High-level UX audit of current site to note opportunities for improvements 16 | * **ReVAL** 17 | * Began the process of refactoring ReVAL to pull the functionality from the UI into the API to establish ReVAL's "core" 18 | * Interviewed states who have implemented the FNS DVS and documented benefits described (time savings, data quality improvements, etc) and shared in #why-we-do-this 19 | * Presented to USDA’s Continuous Process Improvement Community of Practice on the FNS DVS, ReVAL, and the U.S. Data Federation project 20 | * **Future of U.S. Data Federation** 21 | * Continued to develop the proposal section of the vision statement for the U.S. Data Federation 22 | 23 | ## Up next 24 | 25 | * **Resources.data.gov** 26 | * Complete an MVP taxonomy for r.d.g 27 | * Complete MVP mockups of key page types for r.d.g 28 | * Confirm which resources we will develop during Phase 4 29 | * Assess feasibility of desired features/capabilities for r.d.g 30 | * **ReVAL** 31 | * Complete the refactoring work on ReVAL 32 | * Prototype a page for ReVAL on r.d.g 33 | * **Future of U.S. 
Data Federation** 34 | * Write a blog post describing the outcomes of the FNS DVS pilot 35 | * Identify key decision-makers for continued funding and build out proposal section of the vision statement with this audience in mind 36 | * Confirm whether part of the team will travel to Washington D.C. to take part in the next Data Exchange Community of Practice meeting and/or do working sessions with agency partners to develop resources 37 | * Schedule check-in with r.d.g stakeholders 38 | 39 | ## Challenges / Blockers  40 | 41 | * Waiting for feedback on Vision Statement 42 | * Communication from potential partners on resource development has not been forthcoming 43 | 44 | -------------------------------------------------------------------------------- /updates/update-03042019.md: -------------------------------------------------------------------------------- 1 | # March 4, 2019 2 | 3 | Things are buzzing! We have organized our work into 5 workstreams and are moving forward on all fronts, sharing the U.S. Data Federation through presentations, talking to potential partners, and laying the groundwork to set our partner FNS up for success in the long-term. 4 | 5 | ## Completed 6 | 7 | * Defined [5 workstreams](https://docs.google.com/document/d/1dYwGniGYRSqtxQPLU1TXGpzplgSXU54aRU5n5w9RZ-U/edit#heading=h.tlxx0bytq2ml) for our work for Phase 3 8 | * Presented the U.S. 
Data Federation at the OPP All Hands 9 | * Had promising conversations with CMS and DOT about potential collaboration 10 | * Began to investigate long-term hosting options for FNS’s Data Validation Service and started a conversation around ATO process and other requirements 11 | * Outlined different long-term maintenance/hosting options 12 | * Got cloud.gov pricing 13 | * Call with FNS 14 | * Started to explore the process of becoming a shared service 15 | * Got initial thoughts/advice from Login 16 | * Started a conversation with Federalist 17 | * Set up FNS’s Data Validation Service with continuous integration 18 | * Included a framework to make testing easier 19 | * Planned for our [next sprint](https://github.com/18F/data-federation-project/projects/3), which will focus on identifying 2-3 finalist contenders as our second use case 20 | 21 | ## Up Next 22 | 23 | * Publish blog post on the 18F blog to promote the U.S. Data Federation and solicit additional use cases 24 | * Present to the Interagency Open Data Working Group on Tuesday, March 5 25 | * Follow up with leads on additional use cases 26 | * Talk to Federal Data Strategy 27 | * Map out steps required to move FNS validation service to production, including any required ATO. 28 | 29 | 30 | ## Questions / Blockers 31 | 32 | * Still working to fill [two ½ positions](https://github.com/18F/staffing-and-resources/issues/517#issuecomment-466468054) for the team 33 | 34 | 35 | -------------------------------------------------------------------------------- /updates/update-03152019.md: -------------------------------------------------------------------------------- 1 | # March 15, 2019 2 | 3 | The U.S. Data Federation is making waves! Our presentation to the Interagency Working Group on Open Data and the publication of our blog post resulted in a wonderful influx of interest and potential opportunities for collaboration. 
Meanwhile, we continue to build out and improve the tool we’re building for the Food & Nutrition Service (FNS) at USDA. 4 | 5 | ## Completed 6 | 7 | * Welcomed Carter to the team! 🎉 8 | * Published [blog post](https://18f.gsa.gov/2019/03/05/the-us-data-federation/) on the 18F blog to promote the U.S. Data Federation and solicit additional use cases 9 | * Presented to the Interagency Open Data Working Group on Tuesday, March 5 10 | * Followed up with leads on additional use cases and established contact with several more leads following outreach after the presentation, including FEMA, Census, performance.gov, and the CX survey team 11 | * Identified the authorizing official at USDA and initiated ATO conversation, discussed possible long-term hosting and maintenance approaches as well as cloud.gov pricing with FNS 12 | * Created table schema to allow data type validation for FNS Data Validation Service 13 | * Allowed arbitrary column order in input files for the Django-Data-Ingest tool 14 | * Began work on variable substitution for FNS Data Validation Service 15 | * Drafted an [initial feature set and product vision statement](https://docs.google.com/document/d/1ViEh4VhfmZBzMz5QTt-BOjq0V-HpDuAvngdo_Wb8OtQ/edit#) for the “generic” Django-data-ingest tool 16 | * Got some press on [FedScoop](https://fedscoop.com/18f-us-data-federation-project/) & booked a presentation on digital.gov for April 17 | 18 | ## Up Next 19 | 20 | * Calls with 21 | * Federal Data Strategy 22 | * VA Data Commons 23 | * Work Zone Data Exchange team at USDOT 24 | * Census Commodities Tracking Survey 25 | * 10x team 26 | * Tech Modernization Fund 27 | * Updates from Colyar (Montana system) on integrating the tool in their product 28 | * Sprint planning on Monday, March 18 29 | 30 | ## Questions / Blockers 31 | 32 | * Waiting on additional strategy/product resource 33 | * Timeline for Phase 4 funding unclear 34 | --------------------------------------------------------------------------------
/updates/update-03162020.md: -------------------------------------------------------------------------------- 1 | # March 16, 2020 2 | 3 | The team is steadily making progress towards an MVP for r.d.g, having completed an initial taxonomy, constructed wireframes, and begun to investigate the technical capabilities and limitations of the current site. While travel plans are on hold due to coronavirus concerns, we are moving forward with content development with a GSA colleague working on one of the Federal Data Strategy Action Items. 4 | 5 | ## Completed 6 | * **Resources.data.gov** 7 | * Completed an MVP taxonomy for r.d.g 8 | * Completed MVP wireframes of key page types for r.d.g 9 | * Narrowed down the contenders for resources we will develop during Phase 4: 10 | * Case studies and job descriptions generated by Federal Data Strategy Action 13 11 | * Model agreement for data exchanges in collaboration with 10x, SSA, DX-COP, and others 12 | * Data standards agreement Data governance work at FEMA 13 | * Data governance council meeting cheat sheet 14 | * Assessed feasibility of desired features/capabilities for r.d.g 15 | inventory 16 | * **ReVAL** 17 | * Finishing up the refactoring work on ReVAL 18 | * Prototyped a page for ReVAL on r.d.g 19 | * **Future of U.S. 
Data Federation** 20 | * Wrote a draft blog post describing the outcomes of the FNS DVS pilot 21 | * Identified some of the key decision-makers for continued funding 22 | * Confirmed that the team will not travel for the DXCOP meeting; still evaluating ability to travel to the CDO Council meeting the following week 23 | * Scheduled check-in with r.d.g stakeholders for March 17 24 | 25 | ## Up next 26 | 27 | * **Resources.data.gov** 28 | * Test the taxonomy with users 29 | * Content development 30 | * Find alternatives to developing content with Mikhael at FEMA 31 | * Follow-up conversation with Bethany Blakey 32 | * Develop clear structure for files and navigation of r.d.g 33 | * Decide whether to build on top of current r.d.g or start fresh 34 | * **ReVAL** 35 | * Merge final PRs for ReVAL refactoring 36 | * Make updates to documentation as needed 37 | * Rename ReVAL: take out “Aggregation” and make renaming changes on GitHub 38 | * **Future of U.S. Data Federation** 39 | * Meet with r.d.g stakeholders 40 | * Develop the proposal section of the vision statement for the U.S. Data Federation 41 | 42 | ## Challenges / Blockers  43 | 44 | * Travel restrictions require us to remotely co-work with partners 45 | -------------------------------------------------------------------------------- /updates/update-03252020.md: -------------------------------------------------------------------------------- 1 | # March 25, 2020 2 | 3 | ReVAL is done! Huge thanks and kudos to James Tranovich for getting this over the finish line. The reusable validation tool is now refactored and the documentation has been built out to provide more examples and support to future users. Next step: Get that beauty published on resources.data.gov!
4 | 5 | ## Completed 6 | * **Resources.data.gov** 7 | * Sent a taxonomy activity to r.d.g stakeholders and DXCOP 8 | * Found alternatives to FEMA content and continued conversations with Bethany Blakey, Eric Ewing, and Laura Train 9 | * Decided to build the “new” r.d.g on top of the existing site rather than starting from scratch 10 | * Created high-level mockups to explore design approaches to the homepage, resource lists, and resource pages for r.d.g 11 | inventory 12 | * **ReVAL** 13 | * Merged final PRs for ReVAL refactoring 14 | * Completed documentation 15 | * Renamed ReVAL to remove the word “Aggregation” 16 | * Got approval from states to be named in the blog post about FNS’ success 17 | * **Future of U.S. Data Federation** 18 | * Met with r.d.g stakeholders and shared updates on our work since December 19 | * Developed the proposal section of the vision statement 20 | * Shared vision statement with 10x team, r.d.g stakeholders, and others within GSA 21 | 22 | ## Up next 23 | 24 | * **Resources.data.gov** 25 | * Finalize structure for files and navigation of r.d.g 26 | * Explore using Airtable/Google Sheets to author content and import to GitHub 27 | * Make a decision about whether we will use Netlify CMS on r.d.g 28 | * Synthesize user feedback on taxonomy & IA 29 | * Create MVP content model 30 | * Figure out how much of existing r.d.g to strip back for MVP build 31 | * Continue to develop mockups 32 | * Evaluate likelihood of new content: 33 | * CDO Charter (w/ Eric Ewing) 34 | * DGB Strategy (w/ Laura Train) 35 | * Data Skills Profiles (w/ NLM) 36 | * **Future of U.S. Data Federation** 37 | * Formalize vision statement into a proposal with funding section 38 | * Get clarity on approval processes around resources 39 | 40 | 41 | ## Challenges / Blockers  42 | The state of the world is affecting everyone.
We are trying to move forward on the project work—which is so far unaffected—with patience, kindness, and understanding for our partners’ shifting priorities and our own. 43 | -------------------------------------------------------------------------------- /updates/update-04012019.md: -------------------------------------------------------------------------------- 1 | # April 1, 2019 2 | 3 | Our team is growing and our vision is becoming ever clearer. This sprint, we’ll take a step back to define an MVP for our second use case, set parameters around our outreach efforts, and try to answer that all-important question: What is the thing?* 4 | 5 | ## Completed 6 | 7 | * Welcomed Mike to the team! 🎉 8 | * Website update recommendations 9 | * Calls with 10 | * Federal Data Strategy, to align on how our work can/should relate 11 | * Technology Modernization Fund, to establish whether that is a viable source of funding for parts of our project 12 | * 10x team, to understand more about Phase 4 funding / pitch planning 13 | * Past and potential collaborators: Frictionless Data, FGDC, Justin Stekervetz (NIEM, DATA Act), 18F Biz Dev, Open Referral 14 | * Eligibility Rules Service team, to compare notes on business model/long-term planning 15 | * Follow-up conversations & technical investigation for 2 candidates for our “2nd use case” 16 | * Census: Commodity Flow Survey — use our tool to validate, aggregate, and potentially automatically code input from users 17 | * DOT: Work Zone Data Exchange — use our tool to validate adherence to a new specification for work zone data 18 | * Continued work on variable substitution for FNS Data Validation Service 19 | * Currently supporting basic arithmetic operations 20 | * Started work on creating a project template to make it easier for new users to start a new data ingest project 21 | * Dockerized the project template 22 | 23 | 24 | ## Up Next 25 | 26 | * Engage with Federal Data Strategy to follow up with their identified proof points 
27 | * Define the MVP for demonstrating reusability of our tool 28 | * Naming session for the “Django Data Ingest” 29 | * Continue conversation around ATO and long-term planning with FNS 30 | * Implement updates to website 31 | * Continue work on variable substitution for FNS Data Validation Service to allow chosen precision display 32 | * Continue work on the project template 33 | * Explore implementing a base validator that makes it easier for users to create new customized validators 34 | * Continue with technical investigation for 2nd use case 35 | 36 | 37 | ## Questions / Blockers 38 | 39 | * Little communication from states as to their expected timeline or challenges implementing the FNS Data Validation Service; their progress will significantly impact the level of support FNS will require beyond Phase 3 40 | 41 | *The thing we want to define is what the reusable, generally accessible version of the Django data ingest tool is—what it does, who uses it, and where it lives. 42 | -------------------------------------------------------------------------------- /updates/update-04072020.md: -------------------------------------------------------------------------------- 1 | # April 7, 2020 2 | 3 | We are heading into our second-to-last sprint as a fully staffed team and are looking forward to seeing a lot of work come to fruition over the next month. The team’s focus is finalizing the content, structure, and design for the new release of resources.data.gov (r.d.g) as well as continuing to pursue long-term support for the work.
4 | 5 | ## Completed 6 | * **Resources.data.gov** 7 | * Tested categories and tagging structure with all the current resources in Mural in order to evaluate our taxonomy and navigation and inform structure and design 8 | * Recommended resources for deletion from r.d.g 9 | * Created a fork of resources.data.gov in the 18F GitHub for prototyping purposes 10 | * Successfully tested three approaches to authoring content for the repository; made preliminary decision to go with the 3rd approach for our initial import of resources and then use Netlify going forward 11 | * By using a spreadsheet that is imported into the repo directly to create individual pages per resource 12 | * By using Netlify as a CMS to create new resources in r.d.g 13 | * By using a script to import records from the spreadsheet into Netlify 14 | * Finalized MVP content model for resource summaries and individual resources 15 | * Received and synthesized 8 responses to our taxonomy card-sort activity 16 | * Determined the following as the content to be added in the new release of r.d.g: 17 | * CDO Charter Template (WIP in collaboration with the CoEs) 18 | * Federal Data Strategy Proof Points 19 | * SSA Data Governance Board Example 20 | * **Future of U.S. Data Federation** 21 | * Received extremely helpful feedback and pricing information from Jay Huie, which has begun to transform the vision statement into an official proposal 22 | * Got clarity on communications approval processes around resources by speaking with Dahianna Salazar-Foreman and Krista Britt — some of the takeaways: 23 | * Linking to an already published, publicly-accessible government-created resource is fine 24 | * We should be careful not to suggest that any “how-to” or recommendation could be construed as official guidance—the repository is surfacing options rather than prescribing a course of action 25 | * If publishing a resource created by another agency, ensure that that agency’s comms team has signed off and given approval.
26 | 27 | 28 | ## Up next 29 | 30 | * **Resources.data.gov** 31 | * Finalize structure for files and navigation of r.d.g and get feedback/validation (from Data.gov / r.d.g stakeholders) 32 | * Define page/content structures for r.d.g (to begin coding) 33 | * Draft website copy 34 | * Finalize MVP styles / design 35 | * Understand & initiate approval process for website copy and design changes 36 | * Finalize CDO Charter Template with Eric Ewing 37 | * Refine list of resources that could be pursued by future teams 38 | * Update r.d.g stakeholders 39 | * **Future of U.S. Data Federation** 40 | * Finalize proposal 41 | * Formulate specific plan for immediate post-10x work 42 | * Identify the steps towards proposal approval and resource allocation 43 | 44 | ## Challenges / Blockers  45 | The coronavirus crisis continues to affect our ability to get feedback and conduct user research as stakeholders and partners are forced to reprioritize. We feel limited in our ability to secure a future for this work without more input and situational awareness around planning, budgeting, and the realities of resourcing. 46 | -------------------------------------------------------------------------------- /updates/update-04152019.md: -------------------------------------------------------------------------------- 1 | # April 15, 2019 2 | 3 | With four sprints left, we are working to ensure the future viability of each part of the U.S. Data Federation project. That means doing research and building alignment around a strong case for Phase 4 funding, engaging with additional partners strategically, transitioning FNS to a new agreement, and continuing to upgrade our tool. 
4 | ## Completed 5 | 6 | * Defined requirements for satisfying our “second use case” 7 | * Defined our desired terms of engagement with two additional partners 8 | * DOT: Work Zone Data Exchange — We will use this case to prove out the reusability of our tool by upgrading to allow for schema validation 9 | * Census: Commodity Flow Survey — We will provide support to an internal developer who will build a prototype as a way to learn what support and documentation is needed 10 | * Created interview template to guide conversations with Federal Data Strategy proof point subjects and other use cases 11 | * Updated [website](https://federation.data.gov/) to align messaging across platforms 12 | * Aligned with Eligibility Rules Service team on Phase 4 pitch planning 13 | * Clarified phase 4 pitch expectations with 10x team 14 | * Finished work on variable substitution for FNS Data Validation Service to allow chosen precision display 15 | * Started work on modularizing Validator in Django data ingest to allow for easier subclassing 16 | * Generated ideas for a new name for the reusable [Django data ingest](https://github.com/18F/django-data-ingest) module 17 | 18 | 19 | ## Up Next 20 | 21 | * Schedule interviews with previous contacts and proof point partners 22 | * Circulate a few possible name directions for consideration 23 | * Align on and share a cost estimate with FNS for work with 18F 24 | * Continue work on modularizing Validator in Django data ingest to allow for easier subclassing 25 | * Focus on documenting in preparation of the developer’s guide for FNS 26 | * Start work on creating a JSON Schema Validator 27 | * Begin supporting Census partner in the creation of a prototype 28 | 29 | 30 | ## Questions / Blockers 31 | 32 | * Sporadic and unreliable communication from states working on the implementation of the USDA/FNS Data Validation Service 33 | -------------------------------------------------------------------------------- /updates/update-04232020.md: 
-------------------------------------------------------------------------------- 1 | # April 23, 2020 2 | 3 | It’s the final countdown! We are into our last sprint and are all working to ensure that content, design, and development are in lockstep with one another so that we can meet our target of delivering a ready-to-relaunch resources.data.gov at the end of this project! We continue to nudge the conversation about eventual home and funding for this work forward as best we can. 4 | 5 | ## Completed 6 | * **Resources.data.gov** 7 | * Met with the data.gov team to discuss the process for relaunching r.d.g 8 | * Finalized structure for files and navigation of r.d.g and met with Hyon and Phil for review 9 | * Defined page/content structures for r.d.g 10 | * Drafted website copy (resource descriptions, category descriptions, and other copy) 11 | * Began implementing new designs 12 | * Reached out to stakeholders to schedule a final review of proposed changes to r.d.g pre-relaunch 13 | * Finalized the OCDO, Data Governance Steering Committee, and Data Governance Advisory Group Charter Templates with Eric Ewing 14 | * Began refining the list of resources that could be pursued by future teams 15 | 16 | * **Future of U.S. 
Data Federation** 17 | * Continued the conversation with Ken Ambrose, CDO Council coordinator to ensure future collaboration between the Data Federation and the Council 18 | * Met with Dominic Sale, Jay Huie, Phil Ashlock, and the 10x team to discuss funding for this project in the future 19 | 20 | 21 | 22 | ## Up next 23 | 24 | * **Resources.data.gov** 25 | * Design and development work 26 | * Finalize design for site; implement changes 27 | * Final pull request to prepare for project release 28 | * Conduct user feedback / re-engagement sessions 29 | * QA/Testing: Mobile, cross-browser, A11y testing 30 | * Content 31 | * Finalize descriptions for all content on r.d.g 32 | * Move all the resources onto r.d.g using Netlify (Proof Points, etc) 33 | * Finish the change log for r.d.g 34 | * Offboarding / Documentation 35 | * Document outstanding development issues 36 | * Finish backlog of content for r.d.g 37 | * Document completed / recommended user research/usability testing 38 | * Build out offboarding materials (contextualize work, describe next steps, consolidate links) 39 | * Plan/prep outreach materials (blog, webinar, etc) 40 | 41 | * **Future of U.S. 
Data Federation** 42 | * Present r.d.g changes to stakeholders and get sign-off 43 | * Follow up on conversation with Dominic Sale 44 | * Plan/prep final presentation 45 | 46 | 47 | ## Challenges / Blockers  48 | Our only challenges/blockers are: 49 | * our lack of experience navigating the funding options that would ensure the continuation of this work 50 | * the dwindling time we have to devote to this project 51 | 52 | 53 | -------------------------------------------------------------------------------- /updates/update-04292019.md: -------------------------------------------------------------------------------- 1 | # April 29, 2019 2 | 3 | We’re doubling down on research this sprint to inform our Phase 4 pitch, while continuing to move our partners at FNS towards ownership of their instance of our reusable tool. 4 | 5 | ## Completed 6 | 7 | * Held a [Digital.gov webinar](https://digital.gov/event/2019/04/17/an-introduction-us-data-federation/) to spread the word and re-engage previous contacts 8 | * Clarified expectations with FNS partner 9 | * Discussed what a transition to an 18F engagement would look like 10 | * Aligned on priorities for our work during the remainder of Phase 3 11 | * Scheduled six interviews for the coming sprint with Federal Data Strategy Proof Point partners, previous contacts, and leads generated by webinar 12 | * Voted on [potential name ideas](https://app.mural.co/t/gsa6/m/gsa6/1553199111316/0aa2b5b5b715f9f370eea74dd3f4b33f2ee9dec0) for the Django data ingest tool; top contenders include MAVEN (Multi-Use Aggregation & Validation Engine) and DAVE (Data Aggregation & Validation Engine) 13 | * Continued work on modularizing Validator in Django data ingest to allow for easier subclassing 14 | * Began work on creating a JSON Schema Validator 15 | * Created a [prototype for WZDx schema validation](https://docs.google.com/document/d/1S22yNfIu95qsseBjP5xs9xQpHztXcIBdDdfwV5rrr1A/edit#) 16 | 17 | 18 | ## Up Next 19 | 20 | * Conduct interviews,
collect & synthesize notes 21 | * Schedule 10x Phase 4 pitch presentation & create first draft 22 | * Continue to learn about various sustainment models for products within TTS 23 | * Identify any past points of contact who could use the Django Data Ingest Tool 24 | * Circulate a few possible name directions for consideration 25 | * Prepare to transition FNS Data Validation Service to their team at the end of Phase 3 26 | * Learn more about cloud hosting options 27 | * Provide a summary of expected support needs for their PWS 28 | * Sketch out a roadmap for the service from current prototype state to production-ready state 29 | * Continue to work on JSON Schema Validator 30 | * Update validator subclassing work based on work being done on JSON Schema Validator 31 | 32 | 33 | ## Questions / Blockers 34 | 35 | None at present! 36 | -------------------------------------------------------------------------------- /updates/update-05052020.md: -------------------------------------------------------------------------------- 1 | # May 5, 2020 2 | 3 | Friday marked the end of our :1team1dream: of four working on the project. Endless thanks to Princess Ojiaku and James Tranovich for their dedication to the project and for being fantastic team members! Mike Gintz and Julia Lindpaintner will spend the next couple weeks tying up a few loose ends, closing out/transitioning the project, and overseeing the relaunch of resources.data.gov. 
4 | 5 | ## Completed 6 | * **Resources.data.gov** 7 | * Design & development 8 | * Finalized design for site; implemented changes 9 | * Opened final pull request against the GSA/resources.data.gov repository to prepare for project release 10 | * Conducted user feedback / re-engagement sessions 11 | * Conducted limited mobile, cross-browser, and automated a11y testing 12 | * Added a new “Data standards” category in anticipation of fulfilling Action 20 in the Federal Data Strategy 13 | * Content 14 | * Finalized descriptions for all content on r.d.g 15 | * Moved all the resources onto r.d.g using Netlify (Proof Points, etc) 16 | * Finished the change log for r.d.g and shared with stakeholders 17 | * Offboarding / Documentation 18 | * Documented outstanding technical/development issues 19 | * Documented content-related work 20 | * Began building out offboarding materials (contextualized work, described next steps, consolidated links) 21 | * **Future of U.S. Data Federation** 22 | * Presented r.d.g changes to stakeholders via live staging site 23 | * Submitted proposal for “mid-year sweeps” funding through OGP 24 | * Investigated other potential funding models, including fee-for-service and “pass-the-hat” 25 | * Scheduled follow-up on conversation with Dominic Sale 26 | 27 | ## Up next 28 | 29 | * **Resources.data.gov** 30 | * Final review of site: Content, a11y testing 31 | * Finalize content backlog and share with Ken Ambrose 32 | * Share charter templates with Ken Ambrose for dissemination among CDOs 33 | * Relaunch the site! 34 | 35 | * **Future of U.S. Data Federation / Project management** 36 | * Prep outreach materials (blog, webinar, etc) 37 | * Conduct follow-up meeting with Dominic Sale 38 | * Plan/prep final presentation 39 | * Share project with 18F at Team Coffee 40 | 41 | 42 | ## Challenges / Blockers  43 | * Waiting to hear back from our OMB stakeholders with regard to the timing of the site relaunch!
44 | 45 | 46 | -------------------------------------------------------------------------------- /updates/update-05132019.md: -------------------------------------------------------------------------------- 1 | # May 13, 2019 2 | 3 | Our top priority is ensuring that the work done in Phases 1-3 of the 10x Data Federation project shows maximum ROI. To that end, we’re working hard to ensure that every workstream has concrete targets and deliverables — whether or not we secure Phase 4 funding. 4 | 5 | ## Completed 6 | 7 | * Tentatively scheduled 10x Phase 4 pitch presentation for May 30, 2019 8 | * Held team alignment sessions around vision for Phase 4 9 | * Continued user research, landscape analysis, and outreach — had calls with: 10 | * FEMA Data Governance Board 11 | * Mapping Medicare Disparities Tool at CMS 12 | * Work Zone Data Exchange specification at DoT 13 | * Puerto Rico Address Working Group at Census 14 | * The Opportunity Project at Census 15 | * VA Open Data Page 16 | * Representatives of National Information Exchange Model (NIEM) from DHS, HHS, DoD 17 | * Team at SEMOSS (Semantic Open Source Software) 18 | * Federalist 19 | * Reprioritized our Phase 3 goals; specified deliverables for each workstream 20 | * Circulated a few possible names for the Django Data Ingest Tool for reactions 21 | * Provided a summary of maintenance and support needs to FNS to inform their PWS 22 | 23 | ## Up Next 24 | 25 | * Create first draft of Phase 4 pitch presentation 26 | * Continue to develop the reusable tool 27 | * Continue validator subclassing 28 | * Update validator subclassing work based on work being done on JSON Schema Validator 29 | * Add to documentation on how to use the validator 30 | * Prepare to transition FNS Data Validation Service to their team at the end of Phase 3 31 | * Evaluate our tool against the Before You Ship security checklist 32 | * Review documentation on migration to new cloud.gov account 33 | * Outline for developers’ guide 34 | *
Synthesize interview notes and square learnings against previous framework for federated data efforts 35 | * Continue conversations with CMS, NAVAIR, VA Data owners 36 | * Start mailing list for those who have expressed interest in being updated on the work of the U.S. Data Federation 37 | * Present to VA Open Data Working Group 5/23 and Data Exchange CoP 5/24 38 | 39 | ## Questions / Blockers 40 | 41 | None at present! 42 | -------------------------------------------------------------------------------- /updates/update-05282019.md: -------------------------------------------------------------------------------- 1 | # May 28, 2019 2 | 3 | We’re headed into our last full sprint focused on the pitch for Phase 4, wrapping deliverables for Phase 3, and setting FNS and our other partners up for success as they move forward. 4 | ## Completed 5 | 6 | * Collaborated on first draft of Phase 4 pitch presentation 7 | * Articulated value proposition 8 | * Defined Phase 4 vision 9 | * Continued to develop the reusable data validation tool 10 | * Got feedback from our partner at the Census Bureau on implementation 11 | * Created documentation on API responses, error codes, cloud.gov deployment 12 | * Prepared to transition FNS Data Validation Service to the USDA/FNS team at the end of Phase 3 13 | * Evaluated our tool against the [Before You Ship](https://before-you-ship.18f.gov/) security checklist and cloud.gov production application guidance 14 | * Met with security leads for guidance on handoff to FNS 15 | * Prepared outline for developers’ guide and shared with FNS 16 | * Started on creating content for developers’ guide 17 | * Presented to VA Open Data Working Group 5/23, Data Exchange CoP 5/24, and Puerto Rico Address Data Working Group 5/28 18 | 19 | 20 | ## Up Next 21 | 22 | * Finalize Phase 4 pitch 23 | * Feedback during cross-pollination session May 30 24 | * Feedback from 10x team 25 | * Complete MOU for potential Phase 4 26 | * Phase 4 pitch on Thursday, June 6
at 12:30 EST 27 | * Guests: Janis Johnston & Ed Harper (FNS), Christian Moscardi (Census) 28 | * Wrap up Phase 3 29 | * Close out documentation on the validator 30 | * Add issues for outstanding work on reusable tool 31 | * Close out prototype for WZDx 32 | * Final check in with Christian Moscardi at Census 33 | * Consolidate Phase 3 research process and findings in a single document 34 | * Start mailing list for those who have expressed interest in being updated on the work of the U.S. Data Federation 35 | * Develop final deliverable for 10x 36 | * Hand off to FNS 37 | * Fill out Developer’s Guide in two sections (for states; for developers) 38 | * Meet with FNS to discuss responsibilities and get an update on states’ progress 39 | 40 | 41 | ## Questions / Blockers 42 | 43 | * Template for 10x deliverable forthcoming 44 | * Some team members are OOO 45 | 46 | -------------------------------------------------------------------------------- /updates/update-06102019.md: -------------------------------------------------------------------------------- 1 | # June 10, 2019 2 | 3 | Our Phase 4 pitch has been rescheduled and is set for this Thursday, June 13 at 12:30pm EST / 11:30am CST / 9:30am PST. We are looking forward to making our case for the continuation of this important and interesting work. 4 | 5 | ## Completed 6 | 7 | * Renamed the Django data ingest—meet ReVAL, the Reusable Validation & Aggregation Library! 
8 | * Finalized [Phase 4 pitch](https://docs.google.com/presentation/d/1v_nTMcyEhQvrI1pv79YbNAx6bCf6NiXuCptRq2OjzbI/edit?ts=5cfe7855#slide=id.g5ada30f8b8_1_1107) 9 | * Solicited feedback in cross-pollination session May 30 10 | * Solicited feedback from design critique group 11 | * Completed [first draft of MOU](https://docs.google.com/document/d/1OdUExyb8VMDjRK9tDcn4eGTiraPQSXwn62GnTwXmI3k/edit) for potential Phase 4 12 | * Continued to develop the reusable data validation tool 13 | * Started closing out documentation 14 | * Started adding issues for outstanding work on reusable tool 15 | * Started to close out the prototype for WZDx 16 | * Phase 3 wrap-up 17 | * Final check in with Christian Moscardi at Census and Nate Deshmukh-Towery with the Work Zone Data Exchange at DOT 18 | * Began consolidating Phase 3 activities and findings in a single document 19 | * Started a mailing list for those who have expressed interest in being updated on the work of the U.S. Data Federation and sent out first message 20 | * Prepared to transition FNS Data Validation Service to their USDA/FNS team at the end of Phase 3 21 | * Created content for [developers’ guide on GitHub](https://github.com/18F/usda-fns-ingest/wiki) in two sections 22 | * Met with FNS to discuss responsibilities and get an update on states’ progress 23 | * Submitted developer’s guide for comments from dev lab, received feedback, made updates 24 | 25 | ## Up Next 26 | 27 | * Deliver Phase 4 pitch Thursday, June 13 28 | * Complete MOU for potential Phase 4 29 | * Complete Phase 3 summary as deliverable for 10x team 30 | * Amy to participate in webinar introducing states to the USDA/FNS Data Validation Service 31 | * Complete documentation on the prototype for WZDx 32 | * Pin down all dependencies for the reusable tool 33 | * Complete adding issues for outstanding work on reusable tool 34 | * Complete closing out the documentation for the reusable tool 35 | 36 | ## Questions / Blockers 37 | 38 | * Some team 
members have started working on other projects 39 | * @juliaklindpaintner will be OOO 6/17–7/1 40 | 41 | 42 | -------------------------------------------------------------------------------- /updates/update-07082019.md: -------------------------------------------------------------------------------- 1 | # July 8, 2019 2 | 3 | A few weeks have passed since our last ship, but we have some exciting news: **the 10x U.S. Data Federation project has been approved for Phase 4 funding!** We are now focused on some refined scoping to prepare for a Phase 4 that completes known enhancements to ReVAL and brings the work of the Data Federation to bear on the creation of [resources.data.gov](https://resources.data.gov). 4 | 5 | ## Completed 6 | 7 | * [Pitched for Phase 4](https://docs.google.com/presentation/d/1v_nTMcyEhQvrI1pv79YbNAx6bCf6NiXuCptRq2OjzbI/edit?ts=5cfe7855#slide=id.g5ada30f8b8_1_1107) on June 13 and were approved for partial funding! 8 | * Updated GitHub so the Django Data Ingest Tool is officially ReVAL everywhere 9 | * Participated in webinar introducing states to the USDA/FNS Data Validation Service 10 | * Demoed ReVAL for WZDx 11 | * Completed documentation on the prototype for WZDx 12 | * Pinned down all dependencies for the reusable tool 13 | * Finished adding issues for outstanding work on reusable tool 14 | * Finished closing out the documentation for the reusable tool 15 | * Prepared to transition FNS Data Validation Service to their USDA/FNS team at the end of Phase 3 16 | * Created content for [developers’ guide](https://github.com/18F/usda-fns-ingest/wiki) on GitHub in two sections 17 | * Met with FNS to discuss responsibilities and get an update on states’ progress 18 | * Submitted developer’s guide for comments from dev lab 19 | 20 | 21 | ## Up Next 22 | 23 | * Refine [Phase 4 partnership plan](https://docs.google.com/document/d/1BHmpO1yQMdO13IF4Ld-apkFwk4QBHRi6Niwl_t0tlpk/edit?userstoinvite=aaron.borden%40gsa.gov&ts=5d0a8883&actionButton=1#) 24 | * Outline more 
detailed staffing recommendations 25 | * Solicit 10x Phase 1 pitches from partners via mailing list 26 | * Complete Phase 3 summary as deliverable for 10x team 27 | * Update GitHub repo to clean up issues and document final state 28 | * Update federation.data.gov 29 | * Collaboration on the launch of the initial resources.data.gov site 30 | 31 | 32 | ## Questions / Blockers 33 | 34 | None at present. 35 | -------------------------------------------------------------------------------- /updates/update-10212019.md: -------------------------------------------------------------------------------- 1 | # October 21, 2019 2 | 3 | The U.S. Data Federation is back for Phase 4! Our work in Phase 4 falls into two primary workstreams: (1) supporting resources.data.gov and (2) continued work on ReVAL. Currently staffed by Julia Lindpaintner (UX Research & Design) and Mike Gintz (Strategy), the first few weeks of this phase have been focused on reestablishing contact with collaborators from Phase 3.  4 | 5 | ## Completed 6 | 7 | * Workstream 1: Resources.data.gov  8 | * Reviewed documentation on resources.data.gov website and in the related legislation and Federal Data Strategy Action Plan; revisited Phase 4 pitch materials 9 | * Held internal ramp-up conversation with data.gov team to learn about the status of work on resources.data.gov 10 | * Identified broader set of stakeholders for next direction-setting meeting 11 | * Workstream 2: ReVAL 12 | * Reestablished contact with Phase 3 partners 13 | * Christian Moscardi at Census Bureau  14 | * Nate Deshmukh Towery of the WZDx project 15 | * Explored data journey archetypes 16 | * Talked to colleagues involved in data-centric projects (with NIH, TANF, eApp) and documented the observed data journeys in order to validate an abstracted model based on previous conversations 17 | * Got up to speed on progress on the USDA/FNS Data Validation Service 18 | * Amy Mok and Tadhg O'Higgins provided invaluable support to our partners at FNS 
as they moved to production 19 | 20 | ## Up next 21 | 22 | * Workstream 1: Resources.data.gov  23 | * Schedule a meeting with the broader set of stakeholders in order to: 24 | * Surface participants' expectations and beliefs around r.d.g. 25 | * Identify invisible boundaries/parameters/constraints around r.d.g. 26 | * Articulate a vision of success for r.d.g. 27 | * Workstream 2: ReVAL 28 | * Identify next steps, open questions, and key dependencies for ReVAL work 29 | * Meet with FNS partners to discuss transition planning 30 | 31 | ## Challenges / Blockers  32 | 33 | * The ramp up for Phase 4 has been a bit slow as we try to orient ourselves to existing conversations and processes and find the right points of contact. We see a meeting with the broader stakeholder group engaged in the resources.data.gov work as a key first step to being able to contribute effectively and are working to get this scheduled asap. 34 | -------------------------------------------------------------------------------- /updates/update-11052019.md: -------------------------------------------------------------------------------- 1 | # November 5, 2019 2 | 3 | The past two weeks' work has been focused on laying the foundation for our collaboration with the stakeholders involved in the creation and maintenance of resources.data.gov. 
4 | 5 | ## Completed 6 | 7 | * Workstream 1: Resources.data.gov repository 8 | * Coordinated, planned, and facilitated a stakeholder workshop with partners from GSA, OMB, and OGIS (Office of Government Information Services) to identify key audiences for the repository, articulate what success might look like, and better understand the team's sphere of influence in making decisions about the ongoing development and evolution of the repository 9 | * Workstream 2: ReVAL 10 | * Had intro call with Flexion to discuss ReVAL, the capabilities they could offer, and how they prefer to work with teams 11 | * Met with FNS partners to discuss transition planning and provided additional information for FNS' SOW for contractor support 12 | 13 | ## Up next 14 | 15 | * Workstream 1: Resources.data.gov repository 16 | * Create research plan for user/stakeholder interviews 17 | * Collect contact information for user/stakeholder interviews 18 | * Schedule and conduct interviews with users and stakeholders 19 | * Workstream 2: ReVAL 20 | * Outreach to Phase 3 contacts who might be able to use ReVAL 21 | * Talk to users of existing ReVAL instantiations 22 | * Draft next steps for ReVAL effort including staffing needs 23 | 24 | 25 | ## Challenges / Blockers  26 | 27 | None at present 28 | -------------------------------------------------------------------------------- /updates/update-11182019.md: -------------------------------------------------------------------------------- 1 | # November 18, 2019 2 | 3 | The past two weeks have been full of stakeholder and user interviews to learn about the data management challenges a hypothetical resource repository could meet and the audiences it could serve. 
4 | 5 | ## Completed 6 | 7 | * Workstream 1: Resources.data.gov repository 8 | * Conducted interviews with 14 individuals across 7 agencies, including direct resources.data.gov stakeholders, CDOs, and other players in the federal data management and governance space 9 | * Began synthesis of research findings 10 | * Workstream 2: ReVAL 11 | * Scheduled initial conversation with a user of the WZDx ReVAL instantiation 12 | * Contributed summaries of the Data Federation and ReVAL work to the 10x Project Showcase and TTS Success Stories documents 13 | 14 | ## Up next 15 | 16 | * Workstream 1: Resources.data.gov repository 17 | * Conduct additional and follow-up conversations with stakeholders and users 18 | * Prepare a synthesis of research findings to share with r.d.g core stakeholders 19 | * Begin to define the future of the U.S. Data Federation involvement with r.d.g 20 | * Workstream 2: ReVAL 21 | * Make a determination about Amy Mok’s availability to rejoin the U.S. Data Federation project to continue work on ReVAL 22 | 23 | 24 | ## Challenges / Blockers  25 | 26 | The combination of a focus skewed towards research around the repository and slow responses from our ReVAL user contacts led to little progress on Workstream 2 over the course of the past two weeks. The two work streams continue to compete for attention in Phase 4, but as long as we have a hope of Amy returning to the project, we are less inclined to push on the ReVAL line of work. 27 | -------------------------------------------------------------------------------- /updates/update-11302018.md: -------------------------------------------------------------------------------- 1 | ## November 30, 2018 2 | 3 | The team has used the past two weeks as a Sprint 0, to get oriented to work already done in Phases 1 and 2 and to better understand the goals for Phase 3. 4 | 5 | These weekly ship reports will detail the work of the team for stakeholders and other interested parties. 
6 | 7 | ### Completed 8 | * Reviewed information and assets from Phase 1 and 2 9 | * Granted the team access to all relevant GH repos and renamed project repo “Data Federation Project” 10 | * Reviewed initial data ingest prototype code and open issues in GH repos 11 | * Held kick off meeting with project sponsor (Phil Ashlock) 12 | * Connected with team members from Phase 1 and 2 13 | - Tony Garvan, who conducted the original research for Phase 1 14 | - Joe Krzystan, who demoed the django-data-ingest prototype 15 | - Chris Goranson, UX Designer on Phase 2 16 | * Held introductory meeting with USDA/FNS to discuss initial pilot 17 | * Held introductory meeting with State of Kansas, to begin exploring how their system can integrate with the data validation pilot 18 | 19 | ### Up Next 20 | * Follow up meeting with FNS to get better oriented to school lunch program details 21 | * Follow up meetings with Kansas to talk about their technology solution, and to interview sponsors (who use the state solution to submit data) 22 | * Set up introductory meeting with Montana, another state that could act as a potential test case for the data ingest prototype 23 | * Develop recruitment materials for other agencies that could act as potential test cases 24 | * Planning for Sprint 1, Monday 12/4 25 | 26 | ### Questions / blockers 27 | * None 28 | -------------------------------------------------------------------------------- /updates/update-12092019.md: -------------------------------------------------------------------------------- 1 | # December 9, 2019 2 | 3 | Over the past few weeks we have continued to conduct research and learn more about the relationship between the Evidence Act, the Federal Data Strategy, and other ongoing efforts around data sharing. This week we are finishing up our last scheduled user and stakeholder interviews; we will then synthesize our findings and articulate some tangible recommendations for next steps in Phase 4. 
4 | ## Completed 5 | 6 | * Workstream 1: Resources.data.gov repository 7 | * By the end of this week, we will have spoken with more than 30 individuals, including direct resources.data.gov stakeholders, CDOs, those involved in the Federal Data Strategy and other players in the federal data management and governance space 8 | * Continued synthesis and began to formulate a vision for the evolution of the resource repository 9 | * Began to evaluate possible long-term sustainment strategies 10 | * Workstream 2: ReVAL 11 | * Spoke with a user of the WZDx ReVAL instantiation and advocated for funding to be specifically allocated to update the tool to the forthcoming v.2.0 of the specification 12 | * Determined that Amy Mok would not be able to continue on Data Federation until the new year at the earliest 13 | 14 | ## Up next 15 | 16 | * Workstream 1: Resources.data.gov repository 17 | * Report back on research conducted to resources.data.gov stakeholders 18 | * Present a proposal for next steps with 10x funding 19 | * Learn more about potential funding mechanisms for this work within OMB and GSA 20 | * Workstream 2: ReVAL 21 | * Follow up with FNS team to learn more about how the Data Validation Service (DVS) has been going in production 22 | * Schedule conversations with the states who implemented the DVS this fall to capture their experiences 23 | * Determine what (if any) engineering resources will be needed in 2020 24 | 25 | ## Challenges / Blockers  26 | 27 | The busy pre-holiday and pre-Federal Data Strategy Year One Action Plan launch schedules may make it hard to convene the resources.data.gov stakeholders in one meeting, potentially delaying our ability to get buy-in on next steps for our team. 28 | -------------------------------------------------------------------------------- /updates/update-12142018.md: -------------------------------------------------------------------------------- 1 | ## December 14, 2018 2 | 3 | Today marks the end of Sprint 1! 
The team has spent the last two weeks working towards both the overarching goal of developing resources to support federated data efforts in general and the implementation of the data validation tool with USDA’s Food & Nutrition Service. 4 | 5 | ### Completed 6 | - Completed review of data-ingest-tool and wrote Demo Notes 7 | - Held follow up meeting with FNS to get better oriented to school lunch program details 8 | - Documented the current end-to-end process 9 | - Conducted follow-up interview with Kansas to learn about their technology solution and demo our data validation tool 10 | - Participated in conversations to socialize the project with relevant stakeholders 11 | - Interagency Open Data Working Group 12 | - Hudson Hollister of the Data Coalition 13 | - Created preliminary promotional materials for outreach 14 | - Refined issues in GitHub in preparation for Sprint 2 15 | 16 | ### Up next 17 | - Connect with and interview sponsors and regional consultants in Kansas (who use the state solution to submit data) 18 | - Hold introductory meeting with Montana, another state that could act as a potential test case for the data ingest prototype (scheduled for Thursday, 12/20/18) 19 | - Circulate promotional materials with relevant stakeholder communities to identify potential additional use cases 20 | 21 | ### Questions / Blockers 22 | - We are debating the benefits of possibly moving the relevant GitHub repos for this project from the 18F organization to the GSA organization. It would be good to know if there are other 10x projects thinking about this, and where they are in this process. 23 | 24 | -------------------------------------------------------------------------------- /updates/update-12232019.md: -------------------------------------------------------------------------------- 1 | # December 23, 2019 2 | 3 | Throughout this project, we’ve had a clear mission statement for the U.S. 
Data Federation: an effort to support reusable tools and repeatable processes for federated data projects. Last week, we presented a vision for the long-term practical manifestation of the U.S. Data Federation: the content strategy team underpinning resources.data.gov. We look forward to using the remainder of Phase 4 funding to prototype the team, content sourcing, and content structure for the resource repository. 4 | 5 | ## Completed 6 | 7 | * Submitted [staffing request](https://github.com/18F/staffing/issues/689) for full-time content strategy and engineering capabilities to be added to the team in early 2020 8 | * Workstream 1: Resources.data.gov repository 9 | * Conducted a few more interviews and investigated potential funding opportunities for the U.S. Data Federation beyond Phase 4 10 | * Presented to resources.data.gov and 10x stakeholders: reported back on research conducted, described the long-term vision for the U.S. Data Federation, and outlined next steps for the remainder of Phase 4 11 | * Workstream 2: ReVAL 12 | * Followed up with pilot partners 13 | * FNS team needs our help making sure that states are not mistakenly accessing the 18F/10x sandbox for the Data Validation Service 14 | * WZDx is planning the publication of their v2 specification and seeking internal resources to update the validation tool 15 | * Census partners are continuing to work on the tool for the Commodity Flow Survey and have submitted a proposal for an additional project that would build on top of ReVAL 16 | 17 | 18 | ## Up next 19 | 20 | * Plan to onboard new team members 21 | * Define project tasks for remainder of Phase 4 22 | * Work with 10x team to define timeline and scope of work for Flexion contractors 23 | * Schedule follow-up conversation with OMB stakeholders to address opportunities to collaborate with CDO Council and Federal Data Strategy 24 | * Schedule conversations with the states who implemented the USDA FNS Data Validation Service (DVS) this fall to capture 
their experiences 25 | 26 | 27 | ## Challenges / Blockers  28 | 29 | The 10x Data Fed team is mostly on holiday break until 1/6/2020. Happy New Year! 30 | --------------------------------------------------------------------------------